I used to work in HPC (as evidenced in my posts on AoS vs SoA and AoSoAs), but my interest has always been in nature-inspired algorithms, like genetic algorithms.
But there's a nature-inspired algorithm that is getting lots of attention these days: neural networks. People think of them as black boxes, but in this post I'll try to peel back the top and explain what's going on inside.
Structure of a Neural Network

Neural networks are computational graphs originally intended to mimic the behavior of the mammalian brain. They consist of computational nodes, akin to biological neurons, and edges that represent the synaptic connections between those neurons. The original goal was to better understand the workings of the brain by attempting to build an artificial version.
Neurons

Artificial neurons, or nodes, are rather simple computational units. They perform one of the most common operations in mathematics: a weighted sum. Each input is multiplied by a weight, and the resulting terms are added together.
[caption id="attachment_458" align="alignnone" width="300"]
A basic artificial neuron[/caption]
There is one more thing that most neurons do in addition to summing terms: they apply an activation function, which determines whether (and how strongly) the neuron activates. Activation functions keep networks from becoming oversaturated, essentially belching out enormous, nonsensical numbers for every input they are given. There are many activation functions to choose from: the sigmoid, the hyperbolic tangent, and the currently popular rectified linear unit, or ReLU.
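To make that concrete, here's a minimal sketch of a single artificial neuron in Python. The use of NumPy, the specific input and weight values, and the choice of ReLU as the activation are purely illustrative assumptions, not a recipe from any particular network.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: pass positive values through, clamp negatives to zero
    return np.maximum(0.0, x)

def neuron(inputs, weights):
    # Step 1: multiply each input by its weight and sum the terms
    weighted_sum = np.dot(weights, inputs)
    # Step 2: apply the activation function to decide how strongly to "fire"
    return relu(weighted_sum)

# Example: three inputs feeding one neuron (values are arbitrary)
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])
print(neuron(x, w))  # prints 0.0, because the weighted sum (-0.92) is negative
```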
Early Neural Networks

Early neural networks in the 1950s and 1960s were simple and shallow, meaning they typically had only an input layer and an output layer. The reason? Computational complexity. The number of connections in a neural network grows rapidly with its size, and early computers simply could not evaluate networks quickly enough for them to be useful in interesting, near real-time applications.
[caption id="attachment_457" align="alignnone" width="277"]
A shallow neural network, consisting of only inputs and an output.[/caption]
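In code, a shallow network like the one pictured is just a handful of those neurons sharing the same inputs: a single matrix-vector product followed by an activation. Again, the weights and the choice of a sigmoid output here are illustrative assumptions only.

```python
import numpy as np

def sigmoid(x):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def shallow_network(inputs, weight_matrix):
    # One row of weights per output neuron; inputs connect directly
    # to outputs, with no hidden layers in between.
    return sigmoid(weight_matrix @ inputs)

x = np.array([0.5, -1.2, 3.0])          # 3 inputs
W = np.array([[0.8, 0.1, -0.4],
              [0.2, -0.5, 0.3]])        # 2 output neurons
print(shallow_network(x, W))            # two outputs, each between 0 and 1
```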
Fast Computers and GPUs

Things began to get a little more interesting in the late 1980s and 1990s. Neural networks were becoming easier to train, thanks to improvements in computer processing power and in training algorithms. This led to the first widespread application of neural networks to a problem people could wrap their heads around: character recognition.
It may not sound glamorous, but figuring out what humans write down is difficult (sometimes even for other humans!). Neural network-based models for identifying handwritten digits and characters (first introduced by Yann LeCun in 1989) had become accurate enough to beat out other forms of optical character recognition, and soon became the most widely adopted approach for this type of task.
Then, about two decades later, things got really interesting. In 2012, the AlexNet topology (topology is the term used to describe the graph of neurons that makes up a neural network) dramatically outperformed every previous approach at classifying images into one of 1,000 categories. It was also one of the first high-profile uses of a new compute accelerator, the graphics processing unit (GPU), to train neural networks. Things haven't been the same since.
Deep Neural Networks

AlexNet is an example of a deep neural network. Deep neural networks have additional, hidden layers between the input and output layers. As researchers have tried to expand and improve the capabilities of neural nets, they have made the networks deeper. And deeper. And deeper still. While early neural networks were only two layers deep, modern deep networks can consist of hundreds of layers.
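To see what "deeper" means in practice, here's a rough sketch that simply stacks layers, each one feeding the next. The layer sizes and random weights are invented for illustration only.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def deep_network(inputs, weight_matrices):
    # Each layer's output becomes the next layer's input.
    activations = inputs
    for W in weight_matrices:
        activations = relu(W @ activations)
    return activations

rng = np.random.default_rng(0)
# A small "deep" network: 3 inputs -> 8 hidden -> 8 hidden -> 2 outputs
# (layer sizes are arbitrary)
layers = [rng.standard_normal((8, 3)),
          rng.standard_normal((8, 8)),
          rng.standard_normal((2, 8))]
print(deep_network(np.array([0.5, -1.2, 3.0]), layers))
```

Real networks also add bias terms and usually end with a task-specific output layer (for example, a softmax for classification), but the core idea of deeper networks is exactly this kind of layer stacking.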
What Neural Networks Do Now

Neural networks are now used for lots of different applications. In addition to recognizing handwritten characters and categorizing images, neural networks are used for predicting tides, playing video games, controlling autonomous cars, and identifying disease.
In fact, my own recent work has used disease identification as the use case, making it possible to train neural networks to identify pneumonia, emphysema, and a host of other illnesses and conditions from chest x-rays, using many computers in parallel to train in minutes rather than days. It's the merger of HPC and AI, and that trend will continue for many years to come. If you're interested, my recent presentation at Intel's Artificial Intelligence Developer's Conference (AIDC) is available online.
I hope this post gives you a sense of what neural networks are, where they've been, and where they're headed. I'll be writing more soon on the specifics of neural networks, from the types of layers that are used and what exactly they do, to the various ways of training neural networks to perform the tasks asked of them.
As always, please feel free to leave a comment, or to follow me on Twitter or LinkedIn. Also, don't forget to subscribe so that you always know when I publish new posts.