So far in my series on Genetic Algorithms I've covered: Basic Concepts, Encoding, Selection, and Crossover. Those operations will get you most of the way to a functioning optimizer. However, all of the operations we've discussed exert convergence pressure on the population. That is, selection and crossover tend to draw the population together. The last operator I want to talk about - mutation - is different. It wants to split the population apart. And as counterintuitive as it seems, divergence can be even more important than convergence.
Divergence in a genetic algorithm is crucial if you want to end up with quality solutions. While the population itself provides some level of inherent space exploration (each individual represents a point in solution space), selection and crossover bring all of the individuals close together, essentially limiting the solution space that is being explored. Mutation allows the algorithm to break out of this rut, exploring more of the solution space and avoiding local optima.
Safe vs. Unsafe Mutation
There are many different ways to mutate individuals in a population, and they fall into two basic categories: safe and unsafe. Safe mutations change an individual while ensuring that the result still falls within the constraints established for the population. Unsafe mutations produce individuals that are not viable - they violate those constraints.
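To make the distinction concrete, here's a minimal sketch of one common way to keep mutation safe for a real-valued genome: clamp each mutated gene back into its allowed range. The genome and bounds representations (and the Gaussian perturbation) are my own illustrative assumptions, not a fixed convention:

```python
import random

def safe_mutate(genome, bounds, sigma=0.1):
    """Perturb each gene, then clamp it back into its allowed
    [low, high] range so the result always satisfies the constraints."""
    mutated = []
    for gene, (low, high) in zip(genome, bounds):
        gene += random.gauss(0.0, sigma)
        mutated.append(min(max(gene, low), high))
    return mutated

# Example: three genes, each constrained to [0, 1].
child = safe_mutate([0.2, 0.5, 0.9], bounds=[(0.0, 1.0)] * 3)
```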
It's important to ensure safe mutations so that you are not randomly eliminating individuals from your population during the mutation phase. There are many ways to accomplish this. I'll cover point mutations (single- and multi-point) first, then more exotic forms of mutation that ensure safety for more complex individuals (like neural networks).
Single-point vs. Multi-point Mutation
The two basic forms of mutation are single-point and multi-point mutation. Single-point mutation changes a single gene in the individual. If you are optimizing the weights of a neural network, for example, this means changing a single weight at random. In a typical image classification network containing 4-12 million trainable parameters, that's a drop in the bucket, and meaningful change to your individual can be slow to occur.
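As a sketch, single-point mutation on a flat genome of floats might look like this (the Gaussian perturbation and sigma value are illustrative choices, not part of any standard):

```python
import random

def single_point_mutate(genome, sigma=0.1):
    """Pick one gene at random and nudge it with Gaussian noise."""
    mutated = list(genome)
    i = random.randrange(len(mutated))
    mutated[i] += random.gauss(0.0, sigma)
    return mutated
```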
Multi-point mutation, on the other hand, modifies many genes at once. How many is up to you, but, like other operations in a genetic algorithm, it's typically determined randomly. Even so, for complex individuals (again, think neural networks) it can still take a long time to meaningfully change an individual.
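One common way to implement this is to decide independently, per gene, whether to mutate it, so the number of changed genes is itself random. Again a sketch under the same illustrative assumptions:

```python
import random

def multi_point_mutate(genome, rate=0.05, sigma=0.1):
    """Independently perturb each gene with probability `rate`."""
    return [g + random.gauss(0.0, sigma) if random.random() < rate else g
            for g in genome]
```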
Exotic Mutation Operations
Many genetic algorithms that operate on complex individuals use more exotic forms of mutation. These are all problem specific and depend on what you are trying to optimize. A recent example of an exotic mutation scheme is the safe mutation Uber uses in its neuroevolution code. In Uber's scheme, random weight perturbations are scaled using gradient information computed via backpropagation: the gradients of the network's outputs with respect to each weight indicate how sensitive the network's behavior is to that weight, so the most sensitive weights are perturbed the least. This keeps the mutation safe while still having a meaningful impact on the expressed solution that the individual represents. You can check out Uber's work on this and other neuroevolution strategies here.
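Here's a loose PyTorch sketch of the gradient-scaling idea, heavily simplified from Uber's actual method (the function name, the use of a summed output as the sensitivity signal, and all parameter values are my own assumptions; see their paper and code for the real thing):

```python
import torch

def gradient_scaled_mutate(model, inputs, sigma=0.1, eps=1e-8):
    """Perturb each weight inversely to how strongly it influences the
    network's outputs, so behavior-critical weights change the least."""
    model.zero_grad()
    # Summing the outputs gives a scalar whose gradient measures, per
    # weight, roughly how much the outputs move per unit weight change.
    model(inputs).sum().backward()
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            sensitivity = p.grad.abs() + eps
            p.add_(sigma * torch.randn_like(p) / sensitivity)
```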
There are lots of other ways to perform mutation, and this is by no means an exhaustive list. Many genetic algorithms will require custom operations because they use custom individual encodings. It's all a matter of trying things out, seeing what the effects are, and trying again if necessary.
If you enjoy these posts, feel free to follow me on Twitter and LinkedIn, and subscribe using the widget in the sidebar to get notifications straight to your email.