
What are Neural Networks?

I used to work in HPC (as evidenced in my posts on AoS vs SoA and AoSoAs), but my interest has always been nature-inspired algorithms, like genetic algorithms.

But there's a nature-inspired algorithm that is getting lots of attention these days: neural networks. People think of them as black boxes, but in this post I'll try to peel back the top and explain what's going on inside.


Structure of a Neural Network


Neural networks are computational graphs which were originally intended to mimic the behavior of the mammalian brain. They consist of computational nodes - akin to biological neurons - and edges which represent synaptic connections between neurons. The original goal was to better understand the workings of the brain by attempting to create an artificial version.

Neurons


Artificial neurons, or nodes, are rather simple computational units. They perform one of the most common operations in mathematics: summation of terms. First, they multiply each input by a corresponding weight, then they add the resulting terms together.

[caption id="attachment_458" align="alignnone" width="300"] A basic artificial neuron[/caption]

There is one more thing that most neurons do in addition to summing terms: they apply a function that determines whether (and how strongly) they activate. Activation functions keep networks from becoming oversaturated - essentially belching out enormous, nonsensical numbers for every input they are given. There are lots of activation functions: sigmoid, hyperbolic tangent, and the currently popular rectified linear unit, or ReLU.
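As a rough sketch, the three activation functions mentioned above can each be written in a line or two of NumPy and applied to the weighted sum from the neuron example:

```python
import numpy as np

def sigmoid(x):
    # Squashes any input into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squashes any input into the range (-1, 1).
    return np.tanh(x)

def relu(x):
    # Passes positive values through unchanged; clamps negatives to 0.
    return np.maximum(0.0, x)

weighted_sum = -0.92  # from the neuron sketch above
print(sigmoid(weighted_sum), tanh(weighted_sum), relu(weighted_sum))
```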

Early Neural Networks


Early neural networks in the 1950s and 1960s were simple and shallow, meaning they typically had only an input layer and an output layer. The reason? Computational complexity. The number of connections in a neural network grows rapidly with its size, and early computers could not compute quickly enough to evaluate networks in near real time, making them less useful for interesting applications.

[caption id="attachment_457" align="alignnone" width="277"] A shallow neural network, consisting of only inputs and an output.[/caption]

Fast Computers and GPUs


Things began to get a little more interesting in the late 1980s and 1990s. Neural networks were becoming easier to train, thanks to improvements in computer processing power and algorithms. This led to the first widespread application of neural networks to a problem people could wrap their heads around: character recognition.

It may not sound glamorous, but figuring out what humans write down is difficult (sometimes even for other humans!). Neural network-based models for identifying handwritten digits and characters (first introduced by Yann LeCun in 1989) became accurate enough to beat out other forms of optical character recognition, and soon were the most widely adopted approach for this type of task.

Then, about two decades later, things got really interesting. In 2012, the AlexNet topology (topology is the term used to describe the graph of neurons that makes up a neural network) achieved a dramatic leap in accuracy classifying images into one of 1,000 categories. It was also one of the first prominent uses of a new compute accelerator, the graphics processing unit (GPU), to train neural nets. Things haven't been the same since.

Deep Neural Networks


AlexNet is an example of a deep neural network. Deep neural networks have additional "hidden" layers between the input and output layers. As researchers attempted to expand and improve the capabilities of neural nets, they made the networks deeper. And deeper. And deeper, still. While early neural networks were only two layers deep, modern deep networks can consist of hundreds of layers.
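To illustrate the idea of depth, here is a sketch of a deep network as a chain of layers in NumPy. The layer sizes are arbitrary, chosen only to show the stacking; each hidden layer is just the same weighted-sum-plus-activation step repeated:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

# Arbitrary layer sizes: 4 inputs -> two hidden layers of 8 -> 2 outputs.
layer_sizes = [4, 8, 8, 2]
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes, layer_sizes[1:])]

x = rng.normal(size=4)        # a made-up input vector
for w in weights[:-1]:
    x = relu(x @ w)           # each hidden layer: weighted sums, then activation
output = x @ weights[-1]      # final layer produces the network's outputs
print(output)
```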

What Neural Networks Do Now


Neural networks are now used for lots of different applications. In addition to recognizing handwritten characters and categorizing images, neural networks are used for predicting tides, playing video games, controlling autonomous cars, and identifying disease.

In fact, my own recent work has used disease identification as the use case, making it possible to train neural networks to identify pneumonia, emphysema, and a host of other illnesses and conditions from chest x-rays - using many computers in parallel to train in minutes rather than days. It's the merger of HPC and AI, and that trend will continue for many years to come. If you're interested, my recent presentation at Intel's Artificial Intelligence Developer's Conference (AIDC) is available online.

 

I hope this post helps give you a sense of what neural networks are, where they've been, and where they're headed. I'll be writing more soon on the specifics of neural networks, from the types of layers that are used and what exactly they do, to the various means of training neural networks to do the tasks asked of them.

As always, please feel free to leave a comment, or to follow me on Twitter or LinkedIn. Also, don't forget to subscribe so that you always know when I publish new posts.
