
Genetic Algorithms: Selection

In Genetic Algorithms Basics and my post on Encoding Individuals, I talked about the basic concepts and operations in all genetic algorithms, and considerations to keep in mind when encoding individuals for GAs.

This time I want to discuss selection. Selection is the most important operation in the GA, and it serves two key functions. First, selection is how we determine who mates (and who doesn't), and as a result which traits survive to the next generation (and which do not). Second, selection acts as the major source of convergence in the GA. The way you select individuals for mating can have an enormous effect on whether you converge to a suboptimal solution or manage to find a near-global optimum.


Selection Methods


Once you have determined the fitness of each individual in the population, you have to select which individuals will mate and which will not. The method used for this determines the selection pressure exerted on the population. High selection pressure will push the population toward convergence faster. Lower pressure will permit the population to explore the search space more thoroughly.

Let's look at a few different selection methods, how they work, and how much selection pressure each exerts on the population.
Survival of the Fittest

The highest selection pressure is exerted by the survival of the fittest selection approach. The basic idea of this approach is that you determine the fitness of each individual, sort them in descending order, and the top t% get to mate. This approach makes sense at first glance: when you are trying to optimize to a particular function, let only the best individuals with respect to that function survive and mate. Unfortunately, function optimization is often a deceptive business. Going all out, head down, toward what you believe will be the optimal solution will often leave you in a suboptimal hole. Survival of the fittest will get your population to converge faster, but you likely will not be happy with the solution the population converges to. This particular situation is known as premature convergence.
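A minimal sketch of this truncation-style selection in Python might look like the following (the 20% survival rate, the `fitness_fn` callable, and the function name are just placeholders, not anything prescribed):

```python
import random

def truncation_selection(population, fitness_fn, survival_rate=0.2):
    """Keep only the top fraction of the population, ranked by fitness."""
    ranked = sorted(population, key=fitness_fn, reverse=True)
    cutoff = max(1, int(len(ranked) * survival_rate))
    survivors = ranked[:cutoff]
    # Only the survivors get to mate; draw the mating pool from them at random.
    return [random.choice(survivors) for _ in range(len(population))]
```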
Roulette Wheel

The most common selection method used in GAs is fitness proportional, or roulette wheel, selection. This method still has relatively high selection pressure, but not nearly as strong as survival of the fittest. Roulette wheel selection involves assigning a normalized proportional importance to each individual such that the sum of all individuals' importance is equal to 1 (or whatever normalized value you choose). More fit individuals are given a larger proportional importance, while less fit individuals are given a smaller importance. The selection algorithm then picks a number between 0 and 1 (or whatever normalized value you chose) at random, and the individual associated with that position in the spectrum is selected for mating. You can think of it as a roulette wheel where the values on the wheel are mapped to particular individuals. Repeat the process as many times as necessary to select the desired number of individuals.
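Here is a minimal sketch of roulette wheel selection in Python, assuming fitness values are non-negative (the `fitness_fn` callable and the parameter names are placeholders):

```python
import random

def roulette_wheel_selection(population, fitness_fn, num_parents):
    """Select parents with probability proportional to their fitness."""
    fitnesses = [fitness_fn(ind) for ind in population]
    total = sum(fitnesses)
    # Normalize so the slices of the "wheel" sum to 1 (assumes non-negative fitness).
    slices = [f / total for f in fitnesses]

    parents = []
    for _ in range(num_parents):
        spin = random.random()  # pick a point on the wheel between 0 and 1
        cumulative = 0.0
        for individual, width in zip(population, slices):
            cumulative += width
            if spin <= cumulative:
                parents.append(individual)
                break
        else:
            # Guard against floating-point round-off at the very end of the wheel.
            parents.append(population[-1])
    return parents
```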
Tournament

The tournament selection algorithm works exactly as it sounds: pick two (or more) individuals from the population at random, and select the best from the tournament. Tournament selection doesn't exert as much selection pressure on the population as roulette wheel, since it is not guaranteed to prefer the highest-ranked individuals. The tournament method is just as likely to pit two low-fitness individuals against each other, and the winner of that tournament still gets to mate. This allows individuals who do not score well in fitness determination to survive, which can be important in search spaces that may be deceptive.
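A minimal tournament selection sketch in Python, again with placeholder names and an assumed default tournament size of two:

```python
import random

def tournament_selection(population, fitness_fn, num_parents, tournament_size=2):
    """Run small random tournaments; the fittest contestant in each one mates."""
    parents = []
    for _ in range(num_parents):
        contestants = random.sample(population, tournament_size)
        parents.append(max(contestants, key=fitness_fn))
    return parents
```

Increasing the tournament size increases the selection pressure, since larger tournaments are more likely to include (and therefore favor) a highly fit individual.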
Random Sampling

The random sampling selection method is the lowest pressure approach, and for good reason. Random sampling doesn't use the fitness value when determining survival; it is simply luck of the draw. This approach would likely never be used by a GA developer, but it allows us to cover the entire range of potential selection options, from highest selection pressure to lowest selection pressure.
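For completeness, a minimal sketch of random sampling, which ignores fitness entirely:

```python
import random

def random_selection(population, num_parents):
    """Ignore fitness entirely; every individual is equally likely to mate."""
    return random.choices(population, k=num_parents)
```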

Selection Pressure and Convergence


I've been talking a lot about selection pressure and premature convergence, and I think it's important to fully discuss these particular aspects of GAs. Carefully controlling the pressure on the population to converge to a solution is the most artful piece of developing GAs. There are no hard and fast rules here: it is entirely up to the developer - the creator of this little world of evolving "life forms" - to decide how much diversity the ecosystem will have.

For those who are not so biologically inclined, I like to think of genetic algorithms like stars (as an aside, my apologies to any astrophysicists offended if my metaphor is not physically accurate. Please feel free to take a number behind any evolutionary biologists who may be equally upset with my biological metaphor). Selection (and to some extent, crossover) applies inward pressure on the population, in much the same way that gravity constantly pulls the mass of a star towards the center. Mutation applies an outward, divergent force on the population, much like the energy of the fusion reactions in the star's core pushes the star to expand. These two forces in a star must be perfectly balanced. For GAs, a neutron star or black hole is the desired result, eventually. We want convergence in the system, but it shouldn't happen too quickly.

Premature convergence occurs when the population loses diversity too quickly. It is the population's diversity which allows the algorithm to search for optimal solutions. When the diversity is gone, the portion of the search space the population can explore is small and the optimization potential is low. When the diversity is too high, the population could take too long to converge, or never converge at all, and fail to yield a useful solution.

If this all sounds like hand-waving and gut feel, that's because a lot of it is. There have been complaints about neural networks being black boxes, and to a great extent this is true. Nature-inspired algorithms are not only black boxes; they are in some ways black magic. A sort of computational dark art where the intuition and skill of the algorithm's creator determines how well the algorithm will work. In fact, it isn't technically correct to call them algorithms at all. They are meta-heuristics: a set of overly broad instructions which, in the hands of a skilled magician, can yield extraordinary results.

I hope you have gained some insight into how to craft a selection algorithm for your particular genetic algorithm, or have gained a better understanding of the concept in general. As always, please feel free to leave me comments, or reach out to me on Twitter and LinkedIn.
