This paper analyzes the relative advantages of crossover and mutation on a class of deterministic and stochastic additively separable problems with substructures of non-uniform salience. The study assumes that the recombination and mutation operators have knowledge of the building blocks (BBs) and effectively exchange or search among competing BBs. Facetwise models of convergence time and population sizing are used to determine the scalability of each algorithm. The analysis shows that for deterministic exponentially-scaled additively separable problems, BB-wise mutation is more efficient than crossover, yielding a speedup of Θ(l log l), where l is the problem size. For the noisy exponentially-scaled problems, the outcome depends on whether scaling or noise is dominant. When scaling dominates, mutation is more efficient than crossover, yielding a speedup of Θ(l log l). On the other hand, when noise dominates, crossover is more efficient than mutation, yielding a speedup of Θ(l).
1. Let’s Get Ready to Rumble Redux:
Crossover vs. Mutation Head to Head
on Exponentially-Scaled Problems
Kumara Sastry1,2 and David E. Goldberg1
1 Illinois Genetic Algorithms Laboratory
2 Materials Computation Center
University of Illinois at Urbana-Champaign, Urbana, IL 61801
http://www.illigal.uiuc.edu
ksastry@uiuc.edu, deg@uiuc.edu
Supported by AFOSR FA9550-06-1-0096 and NSF DMR 03-25939.
2. Motivation
Great debate between crossover and mutation
When mutation works, it’s lightning quick
When crossover works, it tackles more complex problems
Compare crossover and mutation where both operators
have access to same neighborhood information
Local search literature
Emphasis on good neighborhood operators [Barnes et al., 2003;
Watson, 2003; Hansen et al., 2001]
Need for automatic induction of neighborhoods
Leads to adaptive time continuation operator [Lima et al., 2005,
2006, 2007]
3. Outline
Related work
Assumption of known or discovered linkage
Objective
Algorithm Description
Scalability analysis: Crossover vs. Mutation
Known or discovered linkage
Exponentially scaled additively-separable problem with and
without Gaussian noise
Summary and Conclusions
4. Background
Empirical studies comparing crossover and mutation
Scalability of GAs and mutation-based hillclimber
[Mühlenbein, 1991 & 1992; Mitchell, Holland, and Forrest, 1994; Baum, Boneh, and
Garrett, 2001; Droste, 2002; Garnier, 1999; Jansen and Wegener, 2002, 2005]
Single GA run with large population vs. multiple GA runs
with small population at fixed computational cost
[Goldberg, 1999; Srivastava & Goldberg, 2001; Srivastava, 2002; Cantú-Paz &
Goldberg, 2003; Luke, 2001; Fuchs, 1999]
Used fixed operators that don’t adapt linkage
Did not consider problems of bounded difficulty
Linkage and neighborhood information is critical
5. Known or Discovered Linkage
Assumption of known or induced linkage
Can use linkage-learning techniques
Linkage information is critical for selectorecombinative GA
success
[Plot from Pelikan, Ph.D. thesis, 2002: with correct linkage the GA scales polynomially; without it, exponentially]
Provide the same information for mutation
Mutation searches in the building-block subspace
6. Algorithm Description
Selectorecombinative genetic algorithm
Population of size n
Binary tournament selection
Uniform building-block-wise crossover
Exchange BBs between parents with probability 0.5
(Illustration: BBs #1 and #3 exchanged)
Selectomutative genetic algorithm
Start with a random individual
Enumerative BB-wise mutation
Consider BB partitions
– Arbitrary left-to-right order
Choose the best schemata
– Among the 2^k possible ones
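The two operators described above can be sketched as follows (an illustrative sketch, not the authors' code; the consecutive-block layout, the fitness function f, and the parameters K and M are assumptions for demonstration):

```python
import random

random.seed(0)
K = 3  # bits per building block (assumed)
M = 4  # number of building blocks (assumed)

def bbwise_crossover(p1, p2):
    """Uniform BB-wise crossover: swap whole BBs with probability 0.5."""
    c1, c2 = list(p1), list(p2)
    for j in range(M):
        if random.random() < 0.5:
            s = slice(j * K, (j + 1) * K)
            c1[s], c2[s] = c2[s], c1[s]
    return c1, c2

def bbwise_mutation(x, f):
    """Enumerative BB-wise mutation: visit partitions left to right and
    keep the best of the 2^K schemata in each partition."""
    x = list(x)
    for j in range(M):
        best_bits, best_fit = None, None
        for v in range(2 ** K):
            x[j * K:(j + 1) * K] = [(v >> i) & 1 for i in range(K)]
            fx = f(x)
            if best_fit is None or fx > best_fit:
                best_bits, best_fit = x[j * K:(j + 1) * K], fx
        x[j * K:(j + 1) * K] = best_bits
    return x

def f(x):
    # Illustrative exponentially scaled fitness: BB j contributes 2^j
    # when all of its bits are set.
    return sum(2 ** j for j in range(M) if all(x[j * K:(j + 1) * K]))

start = [random.randint(0, 1) for _ in range(M * K)]
best = bbwise_mutation(start, f)
print(f(best))  # optimum: 2^M - 1 = 15
```

Note that the mutation sketch reaches the optimum in a single left-to-right sweep because the problem is additively separable, which is what makes the enumeration count of roughly m·2^k evaluations possible.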
7. Crossover Versus Mutation: Uniform Scaling
Deterministic fitness: mutation is more efficient
Noisy fitness: recombination is more efficient
[Sastry & Goldberg, 2004]
8. Objective
Crossover and mutation both have access to same
neighborhood information
Known or discovered linkage
Recombination exchanges building blocks
Mutation searches for the best BB in each partition
Compare scalability of crossover and mutation
Additively separable problems with exponentially-scaled BBs
With and without additive Gaussian noise
Where do they excel?
Derive, verify, and use facetwise models
Convergence time and population sizing
9. Scaling and Noise Cover Most Problems
Adversarial problem design [Goldberg, 2002]
[Diagram: problem-difficulty dimensions (deception, scaling, noise, fluctuation); the noisy BinInt problem combines scaling and noise]
10. Convergence Time for Crossover:
Deterministic Fitness Functions
Selection-Intensity based model [Rudnick, 1992; Thierens et al, 1998]
Derived for the BinInt problem
Applicable to additively-separable problems
[Plot: convergence time grows linearly with problem size (m·k) for a given selection intensity]
11. Population Sizing for Crossover:
Deterministic Fitness Functions
Domino convergence [Rudnick, 1992]
[Plot: proportion of converged BBs over time; BBs converge in order of salience, most salient first, least salient last]
Drift bound dictates population sizing
Drift time [Goldberg and Segrest, 1987] grows in proportion to population size
Size the population so that the drift time exceeds the convergence time of the least-salient BBs
Population size follows from this drift bound
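The domino effect behind this sizing argument can be illustrated with a small selection-only experiment (an illustrative toy under assumed parameters, not the authors' model): on BinInt, tournament selection fixes the most salient bit almost immediately, while the least salient bit is still drifting near its initial frequency.

```python
import random

random.seed(1)
M = 12     # number of exponentially scaled bits (assumed)
POP = 200  # population size (assumed)
GENS = 5   # generations of binary tournament selection only

def binint(x):
    # Bit i carries weight 2^(M-1-i): bit 0 is most salient.
    return sum(b << (M - 1 - i) for i, b in enumerate(x))

pop = [[random.randint(0, 1) for _ in range(M)] for _ in range(POP)]
for _ in range(GENS):
    # Binary tournament: the individual with the better BinInt value wins.
    pop = [list(max(random.choice(pop), random.choice(pop), key=binint))
           for _ in range(POP)]

p_most = sum(x[0] for x in pop) / POP    # proportion of 1s, most salient bit
p_least = sum(x[-1] for x in pop) / POP  # proportion of 1s, least salient bit
print(p_most, p_least)
```

The most salient bit converges toward fixation within a handful of generations, while the least salient bit hovers near 0.5, subject only to drift; sizing the population against that drift is exactly the bound the slide describes.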
12. Scalability Analysis of Crossover & Mutation:
Deterministic Fitness Functions
Selectorecombinative GA
Population size:
Convergence time:
Number of function evaluations:
Selectomutative GA
Initial solution is evaluated once
2^k − 1 evaluations in each of the m partitions
13. Crossover vs. Mutation:
Deterministic Fitness Functions
Speed-up: ratio of crossover scalability to that of mutation, Θ(l log l) in favor of mutation
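Since the equations on slides 12 and 13 did not survive this transcript, here is a hedged reconstruction of the totals behind the stated speed-up, constants omitted and k treated as fixed (consistent with the Θ(l log l) result in the abstract, with problem size l = m·k):

```latex
% Selectorecombinative GA (drift-bounded population, domino convergence):
n_{\mathrm{GA}} = \Theta\!\left(2^{k}\, m \log m\right), \qquad
t_{c} = \Theta\!\left(\ell\right), \qquad
n_{fe,\mathrm{GA}} = n_{\mathrm{GA}} \cdot t_{c}
                   = \Theta\!\left(2^{k}\, m\, \ell \log m\right)

% Selectomutative GA (one initial evaluation, then BB-wise enumeration):
n_{fe,\mathrm{mut}} = 1 + m\left(2^{k} - 1\right) = \Theta\!\left(2^{k}\, m\right)

% Speed-up of BB-wise mutation over crossover:
\eta = \frac{n_{fe,\mathrm{GA}}}{n_{fe,\mathrm{mut}}}
     = \Theta\!\left(\ell \log \ell\right)
```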
14. Convergence Time for Crossover:
Noisy Fitness Functions
Additive Gaussian noise with variance σ²_N,
set proportional to the maximum fitness variance
Scaling dominated:
Noise dominated:
15. Population Sizing for Crossover:
Noisy Fitness Functions
Scaling dominated:
Noise dominated:
16. Scalability Analysis of Mutation:
Noisy Fitness Functions
Fitness should be sampled to average out noise
What should the sample size, n_s, be?
BB-wise decision making [Goldberg, Deb, & Clark, 1992]
n_s is proportional to the square of the ordinate of a one-sided
Gaussian deviate with specified error probability α
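Reading the bullet above as n_s ∝ z_α²·σ²_N/d² (an assumption: the constant c and the signal d below are illustrative stand-ins, not the paper's values), the sample size can be computed as:

```python
from math import ceil
from statistics import NormalDist

def sample_size(alpha, sigma_n, d, c=1.0):
    """Samples needed to keep BB-wise decision error near alpha.

    z_alpha is the one-sided Gaussian deviate for error probability alpha;
    sigma_n is the noise standard deviation, d the fitness signal to detect
    (c and d are illustrative assumptions, not the paper's constants).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return ceil(c * (z_alpha * sigma_n / d) ** 2)

print(sample_size(alpha=0.05, sigma_n=2.0, d=1.0))  # -> 11
```

The quadratic dependence on σ_N is the point: quadrupling the noise variance quadruples the sampling cost of mutation, which is what tilts the noise-dominated regime toward crossover.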
17. Scalability Analysis of Crossover & Mutation:
Noisy Fitness Functions
Selectorecombinative GA
Selectomutative GA
Fitness of each individual is sampled n_s times
2^k − 1 evaluations in each of the m partitions
18. Crossover vs. Mutation: Noisy BinInt
Speed-up: ratio of mutation scalability to that of crossover
Scaling dominated: mutation is faster by Θ(l log l)
Noise dominated: crossover is faster by Θ(l)
19. Summary
Deterministic fitness: mutation is more efficient
Noisy fitness: recombination is more efficient in the noise-dominated regime
20. Conclusions
Good neighborhood information is essential
Quadratic scalability of crossover and mutation
Exponential scalability of simple crossover [Thierens & Goldberg, 1994]
e^k m^k scalability of simple mutation [Mühlenbein, 1991]
Leads to a theory of time continuation
Key facet of efficiency enhancement
Leads to principled design and development of adaptive
time continuation operators
Promise of yielding supermultiplicative speedups