Spurious Dependencies and EDA Scalability
1. Spurious Dependencies and EDA Scalability
Elizabeth Radetic and Martin Pelikan
Missouri Estimation of Distribution Algorithms Laboratory (MEDAL)
University of Missouri, St. Louis, MO
http://medal.cs.umsl.edu/
pelikan@cs.umsl.edu
Download MEDAL Report No. 2010002
http://medal.cs.umsl.edu/files/2010002.pdf
Elizabeth Radetic and Martin Pelikan Spurious Dependencies and EDA Scalability
2. Motivation
Estimation of distribution algorithms (EDAs)
Replace standard crossover and mutation by
building a probabilistic model of selected solutions, and
sampling the probabilistic model to generate new solutions.
Can solve many problems intractable with standard EAs.
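The build-and-sample loop described above can be sketched with the simplest possible model, one that treats every bit independently. This is only an illustration, not the algorithm studied in these slides: truncation selection, the parameter values, and the function names are all assumptions.

```python
import random


def onemax(bits):
    """Fitness: number of ones in the bit string."""
    return sum(bits)


def simple_eda(n=20, pop_size=200, generations=30, seed=0):
    """Univariate EDA sketch: select, build a model, sample the model."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half (truncation selection, an assumption).
        population.sort(key=onemax, reverse=True)
        selected = population[: pop_size // 2]
        # Model building: estimate the frequency of a 1 at each position.
        probs = [sum(ind[i] for ind in selected) / len(selected) for i in range(n)]
        # Sampling: the new population is drawn entirely from the model,
        # replacing crossover and mutation.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
    return max(population, key=onemax)


best = simple_eda()
```

A dependency-based EDA differs only in the model-building and sampling steps, where the per-bit frequencies are replaced by a structure such as a marginal product model or a Bayesian network.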
Model accuracy
It is important that the EDA model is accurate.
Types of inaccuracies for dependency-based models
Missing dependencies.
Spurious, unnecessary dependencies.
Most prior work focused on missing dependencies.
This study
Focus on effects of spurious dependencies.
Theoretical study for population sizing.
Empirical study for the number of generations.
3. Outline
1. Model accuracy.
2. Spurious dependencies
Model for spurious dependencies.
Effects on population sizing.
Effects on the number of generations.
3. Experiments.
4. Conclusions and future work.
4. Dependency-Based Probabilistic Models in EDAs
Dependency-based probabilistic models
Encode dependencies and independencies between variables.
Dependency structure decomposes the problem.
Subproblems should be of bounded order.
Examples
Marginal product models.
Bayesian networks.
5. Marginal Product Model
Variables are divided into linkage groups.
Defines problem decomposition into separable subproblems.
Distribution of each group encoded by probability table.
We assume binary representation of candidate solutions.
Beyond pairwise dependencies: ECGA
Consider groups of string positions.
Extended Compact GA (ECGA) (Harik, 1999).
[Figure: a string and its marginal product model. Source: Martin Pelikan, Probabilistic Model-Building GAs.]
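A marginal product model with a fixed set of linkage groups can be estimated and sampled as sketched below. The function names and the data layout (groups as tuples of positions, tables as dictionaries) are assumptions made for illustration. ECGA additionally searches for the grouping itself, which this sketch leaves out.

```python
import random
from collections import Counter


def learn_mpm(selected, groups):
    """Estimate one probability table per linkage group from selected strings."""
    tables = []
    for group in groups:
        # Count how often each joint bit configuration of the group occurs.
        counts = Counter(tuple(s[i] for i in group) for s in selected)
        total = sum(counts.values())
        tables.append({cfg: c / total for cfg, c in counts.items()})
    return tables


def sample_mpm(groups, tables, n, rng):
    """Sample one string: draw the bits of each group jointly from its table."""
    bits = [0] * n
    for group, table in zip(groups, tables):
        configs = list(table)
        weights = [table[c] for c in configs]
        chosen = rng.choices(configs, weights=weights, k=1)[0]
        for pos, value in zip(group, chosen):
            bits[pos] = value
    return bits


selected = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0]]
groups = [(0, 1), (2, 3)]
tables = learn_mpm(selected, groups)
sample = sample_mpm(groups, tables, 4, random.Random(1))
```

Because each group is sampled as a unit, a configuration that never occurs among the selected strings (here, for example, bits 0 and 1 differing) can never be generated.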
6. Model Accuracy
Types of inaccuracies
Missing dependencies.
Spurious, unnecessary dependencies.
Example: Trap-5
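\[ f_{\mathrm{trap5}}(X_1, \ldots, X_n) = \sum_{i=1}^{n/5} \mathrm{trap5}(X_{5i-4} + X_{5i-3} + X_{5i-2} + X_{5i-1} + X_{5i}) \]
\[ \mathrm{trap5}(u) = \begin{cases} 5 & \text{if } u = 5 \\ 4 - u & \text{otherwise} \end{cases} \]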
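The trap-5 definition translates directly into code. The sketch below assumes the string is a list of 0/1 values whose length is a multiple of 5, with consecutive positions forming each trap partition. The function names are illustrative.

```python
def trap5(u):
    """Trap of order 5 applied to u, the number of ones in a 5-bit partition."""
    return 5 if u == 5 else 4 - u


def f_trap5(bits):
    """Sum trap5 over consecutive, non-overlapping 5-bit partitions."""
    if len(bits) % 5 != 0:
        raise ValueError("string length must be a multiple of 5")
    return sum(trap5(sum(bits[i:i + 5])) for i in range(0, len(bits), 5))
```

All ones is the global optimum, but within each partition every step away from all zeros lowers the fitness until all five bits are set. A model that misses the dependencies inside a partition is therefore led towards all zeros, which makes trap-5 a standard example of the cost of missing dependencies.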
7. Onemax Model of Spurious Dependencies
Onemax is the sum of bits in the binary string
\[ \mathrm{onemax}(X_1, \ldots, X_n) = \sum_{i=1}^{n} X_i \]
Perfect and spurious models for onemax
Perfect model assumes no dependence at all.
Spurious model assumes linkage groups of order kspurious > 1.
Parameter kspurious controls order of spurious dependencies.
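As a sketch of the two model structures, the helper below partitions the string into linkage groups of a given order. Grouping consecutive positions is an assumption made for illustration, as are the function names. The second helper counts the probability-table entries the model has to estimate, which shows why larger groups need more data.

```python
def onemax(bits):
    """Onemax: the sum of bits in the binary string."""
    return sum(bits)


def spurious_groups(n, k_spurious):
    """Partition positions 0..n-1 into consecutive groups of k_spurious bits.

    k_spurious = 1 gives the perfect model for onemax (no dependencies).
    """
    if n % k_spurious != 0:
        raise ValueError("n must be divisible by k_spurious")
    return [tuple(range(start, start + k_spurious))
            for start in range(0, n, k_spurious)]


def table_entries(n, k_spurious):
    """Total number of probability-table entries over all groups."""
    return (n // k_spurious) * 2 ** k_spurious
```

For example, `spurious_groups(6, 3)` yields `[(0, 1, 2), (3, 4, 5)]`.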
8. Effects of Spurious Models on EDA Performance
Two main effects of spurious dependencies
Population size.
Number of generations.
Population sizing decomposition
Population size requirements should increase.
Effects depend on learning, but sometimes substantial.
Number of generations
Number of generations may decrease due to weaker variation.
Effects are not expected to be substantial.
9. EDA Population Sizing and Spurious Dependencies
Population sizing decomposition
Initial supply
Initial population is random.
Ensure sufficient supply of partial solutions for each group.
Decision making
Decision making between partial solutions is stochastic.
Ensure that best partial solution wins in each group.
Model building
Ensure accurate enough models to find the optimum.
The reason for spurious dependencies, not the effect.
Focus in this work
Initial supply.
Decision making.
10. Population Sizing: Initial Supply
Initial supply for perfect model (Goldberg et al., 2001)
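\[ N = 2 \ln(2m) \]
where m is the number of linkage groups (m = n for the perfect onemax model).
Initial supply for arbitrary kspurious
\[ N = 2^{k_{\mathrm{spurious}}} \left( k_{\mathrm{spurious}} \ln 2 + \ln \frac{n}{k_{\mathrm{spurious}}} \right) \]
Initial-supply population increase factor
\[ \gamma_{is} = 2^{k_{\mathrm{spurious}} - 1} \, \frac{k_{\mathrm{spurious}} \ln 2 + \ln \frac{n}{k_{\mathrm{spurious}}}}{\ln 2 + \ln n} \]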
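The initial-supply bounds can be evaluated numerically as sketched below. The sketch reads the perfect-model bound as 2 ln(2n), that is, with one linkage group per bit, which is the reading consistent with the denominator of the increase factor. The function names are illustrative.

```python
import math


def initial_supply_perfect(n):
    """Initial-supply bound for the perfect onemax model: 2 ln(2n)."""
    return 2 * math.log(2 * n)


def initial_supply_spurious(n, k):
    """Initial-supply bound for linkage groups of order k: 2^k (k ln 2 + ln(n/k))."""
    return 2 ** k * (k * math.log(2) + math.log(n / k))


def gamma_is(n, k):
    """Population increase factor: spurious bound divided by perfect bound."""
    return initial_supply_spurious(n, k) / initial_supply_perfect(n)
```

For k = 1 the factor is exactly 1. For larger k it is dominated by the 2^(k-1) term, since the logarithmic terms in the numerator and denominator change slowly with k.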
11. Population Sizing: Decision Making
Decision making for perfect model (Harik et al., 1997)
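\[ N = -\frac{1}{2} \ln \alpha \, \sqrt{\pi (n - 1)} \]
Decision making for arbitrary kspurious
\[ N = -2^{k_{\mathrm{spurious}} - 2} \ln \alpha \, \sqrt{\pi (n - 1)} \]
Decision-making population increase factor
\[ \gamma_{dm} = 2^{k_{\mathrm{spurious}} - 1} \]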
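A numerical sketch of the decision-making bounds follows. It assumes the square root over π(n − 1) that the gambler's ruin model carries, and it takes α to be the tolerated probability of error, which the slide does not spell out. The increase factor is computed as the ratio of the two bounds, which gives 2^(k − 1). The function names are illustrative.

```python
import math


def decision_making_perfect(n, alpha):
    """Gambler's-ruin bound for the perfect onemax model."""
    return -0.5 * math.log(alpha) * math.sqrt(math.pi * (n - 1))


def decision_making_spurious(n, k, alpha):
    """Gambler's-ruin bound for linkage groups of order k."""
    return -(2 ** (k - 2)) * math.log(alpha) * math.sqrt(math.pi * (n - 1))


def gamma_dm(k):
    """Ratio of the spurious bound to the perfect bound."""
    return 2 ** (k - 1)
```

The ratio does not depend on n or α, so under this model the decision-making requirement doubles with every additional bit in a spurious group.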
12. Number of Generations
Effects of spurious dependencies on number of generations
Spurious dependencies weaken the mixing.
This reduces the effects of variation.
This should reduce the number of generations until
convergence (assuming a large enough population).
No theoretical model as of now.
13. Description of Experiments
Operators
Binary tournament selection without replacement.
Three replacement types
Full replacement.
Elitist replacement (50% worst are replaced).
Restricted tournament replacement (niching).
Models with various levels of spurious linkage.
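Restricted tournament replacement, the niching operator listed above, can be sketched as follows. Each new candidate competes against the most similar member of a random window of the population. The window size, the use of Hamming distance, and the function names are assumptions of this sketch. The slides do not state the settings used in the experiments.

```python
import random


def hamming(a, b):
    """Number of positions in which two bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def rtr_insert(population, fitness, candidate, window, rng):
    """Insert candidate via restricted tournament replacement (in place).

    Returns True if the candidate replaced a population member.
    """
    # Pick a random window of distinct population members.
    indices = rng.sample(range(len(population)), window)
    # Find the window member most similar to the candidate.
    nearest = min(indices, key=lambda i: hamming(population[i], candidate))
    # Replace it only if the candidate is strictly better.
    if fitness(candidate) > fitness(population[nearest]):
        population[nearest] = candidate
        return True
    return False


pop = [[0, 0, 0, 0], [1, 1, 1, 0]]
replaced = rtr_insert(pop, sum, [1, 1, 1, 1], 2, random.Random(0))
rejected = rtr_insert(pop, sum, [0, 0, 0, 0], 2, random.Random(0))
```

Because a candidate only ever displaces a similar solution, diversity is preserved for longer than under full or elitist replacement.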
Parameters
Optimal population size obtained by bisection.
Runs stop when a solution close enough to the optimum is
reached (allow one linkage group to end up incorrect).
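The bisection procedure for the population size can be sketched as below. The predicate `succeeds(N)` is hypothetical: it stands for "the EDA with population size N meets the stopping criterion in the required number of independent runs". The starting size and the 10% tolerance are likewise assumptions.

```python
def bisect_population_size(succeeds, start=10, tolerance=0.1):
    """Approximate the smallest population size N for which succeeds(N) holds.

    Assumes success is monotone in N, which is only roughly true for a
    stochastic algorithm.
    """
    low, high = start, start
    # Phase 1: double the population size until the runs succeed.
    while not succeeds(high):
        low, high = high, high * 2
    # Phase 2: bisect between a failing size (low) and a succeeding size (high).
    while high - low > 1 and (high - low) / low > tolerance:
        mid = (low + high) // 2
        if succeeds(mid):
            high = mid
        else:
            low = mid
    return high


# Deterministic stand-in for the EDA runs, used only to exercise the search.
result = bisect_population_size(lambda n: n >= 137)
```

With a real EDA the predicate is noisy, so the whole bisection is typically repeated several times and the results averaged.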
14. Population Size (Full Replacement)
[Figure 2: Growth of the population size with respect to the group size for a problem of 300 bits. (a) Population size and (b) population size ratio versus spurious linkage group size, comparing the gambler's ruin model, the initial-supply model, and experiment. The left-hand side shows the actual population sizes compared to the theoretical model, whereas the right-hand side shows the ratio of the population sizes with spurious linkage and the population sizes with no spurious linkage.]
Increase of population size with kspurious is exponential.
Theory provides a conservative bound.
15. Population Size (All Replacement Strategies)
[Figure 3: Growth of the population size with respect to the spurious linkage group size. Panels: (a) full replacement, (b) elitist replacement, (c) RTR. x-axis: group size (bits per group); y-axis: population size; one curve per problem size (60, 120, 180, 240, 300).]
Increase of population size with kspurious similar in all cases.
Figure 1(a) shows the average number of spurious linkage groups (groups of size at least 2) for each problem size. The results indicate that the number of such groups increases approximately linearly with problem size. Figure 1(b) shows the average size of spurious linkage groups. For each problem size, the average size of spurious linkage groups is close to two, indicating that larger linkage groups were created [...]
16. Number of Generations (All Replacement Strategies)
[Figure 4: Growth of the number of generations with respect to the spurious linkage group size. Panels: (a) full replacement, (b) elitist replacement, (c) RTR. x-axis: group size (bits per group); y-axis: number of generations; one curve per problem size (60, 120, 180, 240, 300).]
Full and elitist replacement
Number of generations slightly decreases with kspurious.
Niching (restricted tournament replacement)
Number of generations dramatically increases!
17. Spurious Linkage in Multivariate EDAs
Experiment
Use optimal population size in ECGA.
Observe spurious dependencies in actual models.
[Figure 1, three panels plotted against problem size (number of bits) for RTR, elitist, and full replacement: (a) number of spurious linkage groups, (b) average size of spurious linkage groups, (c) average linkage group size.]
Figure 1: The average number of spurious linkage groups (groups of size ≥ 2), the average size of linkage groups of size ≥ 2, and the average linkage group size (including all linkage groups) for ECGA on onemax. Three replacement strategies are considered: full replacement, elitist replacement and RTR. For each problem size and replacement strategy, the results represent an average over 100 runs (10 bisections of 10 runs each).
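The three statistics reported in Figure 1 can be computed from a learned model as sketched below. The sketch assumes the model is given as a list of linkage groups, each a tuple of string positions. The function name and the returned keys are illustrative. On onemax every group of size 2 or more is spurious, because the perfect model has no dependencies.

```python
def spurious_linkage_stats(groups):
    """Summarise spurious linkage in a marginal product model for onemax."""
    sizes = [len(g) for g in groups]
    # On onemax, any group with two or more bits encodes a spurious dependency.
    spurious = [s for s in sizes if s >= 2]
    return {
        "num_spurious": len(spurious),
        "avg_spurious_size": sum(spurious) / len(spurious) if spurious else 0.0,
        "avg_group_size": sum(sizes) / len(sizes),
    }


stats = spurious_linkage_stats([(0,), (1, 2), (3,), (4, 5, 6)])
```

Averaging these values over the final models of many runs gives curves of the kind plotted in the three panels.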
18. Conclusions and Future Work
Conclusions
Population size increases exponentially with kspurious.
Number of generations mostly unaffected.
But for niching, the number of generations skyrockets!
Spurious dependencies should not be ignored.
Future work
From our model to multivariate EDAs
In most EDAs, population sizing is driven by model building.
Almost always the models contain spurious dependencies.
How do the models interact?
Dramatic increase in the number of generations with niching
Explain why.
Propose ways to deal with it.
19. Acknowledgments
Acknowledgments
NSF; NSF CAREER grant ECS-0547013.
University of Missouri; High Performance Computing
Collaboratory sponsored by Information Technology Services;
Research Award; Research Board.