For many optimization problems it is possible to define a distance metric between problem variables that correlates with the likelihood and strength of interactions between the variables. For example, one may define a metric so that the dependencies between variables that are closer to each other with respect to the metric are expected to be stronger than the dependencies between variables that are further apart. The purpose of this paper is to describe a method that combines such a problem-specific distance metric with information mined from probabilistic models obtained in previous runs of estimation of distribution algorithms with the goal of solving future problem instances of similar type with increased speed, accuracy and reliability. While the focus of the paper is on additively decomposable problems and the hierarchical Bayesian optimization algorithm, it should be straightforward to generalize the approach to other model-directed optimization techniques and other problem classes. Compared to other techniques for learning from experience put forward in the past, the proposed technique is both more practical and more broadly applicable.
Distance-based bias in model-directed optimization of additively decomposable problems
1. Distance-Based Bias in Model-Directed Optimization of Additively Decomposable Problems
Martin Pelikan and Mark W. Hauschild
Missouri Estimation of Distribution Algorithms Laboratory
Department of Mathematics and Computer Science
University of Missouri, St. Louis, MO
E-mail: martin@martinpelikan.net
WWW: http://martinpelikan.net/
2. Background
• Model-directed optimizers (MDOs) learn and use models in optimization to solve difficult optimization problems scalably and reliably.
• MDOs often provide more than the solution; they provide a set of models that reveal information about the problem.
• Learning from experience: Use models from prior runs of MDOs to introduce bias when solving problems of similar type in the future.
3. Purpose
• Combine prior models with a problem-specific distance metric to solve new problem instances with increased speed, accuracy, and reliability.
• Demonstrate significant speedups across a broad array of problem domains.
• Focus on hBOA and additively decomposable functions, although the approach can be generalized to other MDOs and other problem classes.
4. Outline
1. Hierarchical BOA (hBOA).
2. Distance metric for ADFs.
3. Learning from experience via distance-based bias.
4. Experiments.
5. Summary and conclusions.
5. Hierarchical Bayesian Optimization Algorithm (hBOA)
[Figure: hBOA cycle with the labels Current population, Selection, Bayesian network, New population]
[Pelikan, Goldberg, & Cantu-Paz, 2001]
6. Decision Trees Represent Dependencies
[Figure panels: Dependency (X1, X2, X3, X4); Decision tree; Probability table (more efficient)]
7. Learning from Experience (Transfer Learning)
• Motivation
– When solving a problem, hBOA provides the user with a set of probabilistic models.
– Each model encodes information about the problem, such as dependencies between variables.
– Why not use this information when solving new problem instances of similar type?
• Example: hBOA solves 99 scheduling problems; why not use the knowledge obtained when solving the 100th instance?
8. How to Make it Work?
• It is straightforward to keep statistics from past hBOA runs, for example, capturing the number of dependencies between any pair of variables.
• In hBOA, this can be done by looking at the number of “splits” on variable Xi in a decision tree storing dependencies for variable Xj.
• But it is important to ensure that the statistics are meaningful with respect to the problem being solved, so that the statistics help us solve future problem instances faster and better.
9. Learning from Experience via Distance-Based Bias: Basic Idea
• Learning from experience using distance-based bias
– Define distances between problem variables.
– Mine probabilistic models from previous runs for model regularities with respect to distances.
• Mine models to estimate how strongly variables influence each other depending on their distance.
– This should work whenever the strength of dependencies is correlated with distance.
• Apply the idea to hBOA and additively decomposable functions.
10. Additively Decomposable Functions
• Additively decomposable function (ADF): a sum of subfunctions fi, each defined over a subset Si of the variables.
– {Si} are subsets of variables.
– {fi} are functions defining overall solution quality.
• Additively decomposable functions are often difficult to solve! Many NP-complete problems are ADFs with subproblems of 2 or 3 variables.
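The definition above can be written compactly. This is a sketch of the usual form, assuming n variables and m subproblems (the symbols n and m are introduced here for illustration; only Si and fi appear on the slide):

```latex
f(X_1, \ldots, X_n) = \sum_{i=1}^{m} f_i(S_i),
\qquad S_i \subseteq \{X_1, \ldots, X_n\}
```

For example, MAXSAT over 3-CNF formulas fits this form: each Si holds the three variables of one clause, and fi returns 1 if that clause is satisfied and 0 otherwise.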
11. Define Distance Metric for ADFs Using Dependency Graph
• Create a dependency graph where variables in the same subset Si are connected.
• Define the distance between two variables as the length of the shortest path between them in the dependency graph.
• If there exists no such path, set the distance to the number of variables (any existing path is shorter).
[Hauschild et al., 2008]
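The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes variables are indexed 0..n-1 and that the subsets Si are given as tuples of indices, and the function names are hypothetical.

```python
from collections import deque

def dependency_graph(n_vars, subsets):
    """Adjacency sets: two variables are connected if they share a subset S_i."""
    adj = [set() for _ in range(n_vars)]
    for subset in subsets:
        for a in subset:
            for b in subset:
                if a != b:
                    adj[a].add(b)
    return adj

def distances_from(source, adj):
    """Shortest-path distance from `source` to every variable (breadth-first search).

    Unreachable variables keep the distance n (the number of variables),
    which is larger than the length of any existing path (at most n - 1).
    """
    n = len(adj)
    dist = [n] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == n:  # n doubles as the "not visited yet" marker
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def distance_matrix(n_vars, subsets):
    """Distance between every pair of variables in the dependency graph."""
    adj = dependency_graph(n_vars, subsets)
    return [distances_from(i, adj) for i in range(n_vars)]
```

For instance, `distance_matrix(6, [(0, 1), (1, 2), (3, 4)])` puts variables 0 and 2 at distance 2 (via variable 1), while variables 0 and 3 lie in different components and get distance 6, the number of variables.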
12. Define Distance Metric for ADFs Using Dependency Graph: Example
[Hauschild et al., 2008]
13. Motivating Example
• Proportions of splits for variables at various distances show an evident correlation between the two:
[Figure panels: NK landscapes; 2D spin glass]
14. Details of the Approach
• Denote by M the set of models from prior runs.
• Record the number of splits on any variable Xi in any decision tree for Xj in model m such that the distance between Xi and Xj is d.
• Compute the probability of a kth split on variable Xi in any decision tree for Xj such that the distance between Xi and Xj is d, given that (k-1) such splits have already been made.
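The counting step can be illustrated with a small Python sketch. The exact estimator is not spelled out in the text above, so the ratio used here is an assumption: the probability of a kth split at distance d is taken as the number of decision trees with at least k such splits divided by the number with at least k-1 such splits (all trees for k = 1, a further simplification). The model encoding, a dictionary from a tree's variable index j to the list of variables it splits on, is also hypothetical.

```python
from collections import Counter

def split_counts(model, dist):
    """For one model, map each tree j to Counter({d: splits at distance d}).

    `model` maps a tree's variable index j to the list of variable indices i
    that the decision tree for X_j splits on (hypothetical encoding).
    `dist` is a precomputed matrix of distances between variables.
    """
    return {j: Counter(dist[i][j] for i in splits)
            for j, splits in model.items()}

def kth_split_probabilities(models, dist, max_k):
    """Estimate P_k(d) for every distance d and every k up to max_k.

    Assumed estimator (not taken from the source):
    (# trees with >= k splits at distance d) / (# trees with >= k-1 such splits),
    pooled over all trees of all models; for k = 1 the denominator is all trees.
    """
    at_least = Counter()  # (d, k) -> number of trees with >= k splits at distance d
    n_trees = 0
    for model in models:
        for counts in split_counts(model, dist).values():
            n_trees += 1
            for d, c in counts.items():
                for k in range(1, min(c, max_k) + 1):
                    at_least[(d, k)] += 1
    probs = {}
    for (d, k), num in at_least.items():
        den = n_trees if k == 1 else at_least[(d, k - 1)]
        probs[(d, k)] = num / den
    return probs
```

Pairs (d, k) that never occur in the prior models are absent from the returned dictionary, so a caller has to decide how to treat them, for example with a small floor probability.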
15. Details of the Approach
• Set the prior probability of a network structure based on the learned probabilities (kappa denotes the strength of the bias).
• Evaluate each network using a Bayesian metric.
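One way to turn learned split probabilities into a structure prior is sketched below. The functional form is an assumption made for illustration: the prior is taken as a product over all splits of the learned probability raised to the power kappa, computed in log space and left unnormalized. The `probs` dictionary keyed by (distance, k), the model encoding, and the `floor` parameter for unseen cases are all hypothetical.

```python
import math
from collections import Counter

def log_structure_prior(model, dist, probs, kappa, floor=1e-6):
    """Unnormalized log prior of one candidate network structure.

    Each split contributes kappa * log P_k(d), where d is the distance between
    the split variable and the tree's variable, and k is the rank of that split
    among the splits at distance d in the same tree. With kappa = 0 every
    structure gets the same prior (no bias); larger kappa strengthens the bias.
    """
    total = 0.0
    for j, splits in model.items():
        seen = Counter()  # distance -> splits at that distance so far in tree j
        for i in splits:
            d = dist[i][j]
            seen[d] += 1
            # (d, k) pairs never observed in prior models fall back to `floor`.
            p = max(probs.get((d, seen[d]), 0.0), floor)
            total += kappa * math.log(p)
    return total
```

In a Bayesian metric the posterior of a structure is proportional to its prior times the marginal likelihood of the data, so a value like this would be added to the log marginal likelihood when candidate networks are compared.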
16. Test Problems
• Included in this paper
– NK landscapes with nearest-neighbor interactions.
– 2D spin glass.
• Done later on
– 3D spin glass.
– Minimum vertex cover for random graphs.
– MAXSAT for 3-CNF formulas.
• Large number of different instances for each problem class (100s to 1000s each).
17. Experimental Methodology
• 10-fold cross-validation
– Divide instances into 10 sets.
– Build the bias from models on 9 sets and test it on the remaining set; repeat for every set.
– Bottom line: No problem instance is ever used for both creating the bias and testing it.
• Bisection for getting population sizes, 10 runs for each problem instance.
• Focus on multiplicative speedups
– How many times faster with the use of bias?
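The cross-validation protocol and the speedup measure can be sketched as follows. This is a minimal illustration with hypothetical function names; how the instances are assigned to folds and which cost is measured (for example, number of evaluations or running time) are assumptions.

```python
def ten_fold_splits(instances, n_folds=10):
    """Yield (training, testing) pairs; each instance is tested exactly once.

    Instances are dealt into folds round-robin (an arbitrary choice). Models
    from the training instances would be used to build the bias, which is then
    tested only on the held-out fold.
    """
    folds = [instances[f::n_folds] for f in range(n_folds)]
    for t in range(n_folds):
        training = [x for f, fold in enumerate(folds) if f != t for x in fold]
        yield training, folds[t]

def multiplicative_speedup(baseline_cost, biased_cost):
    """How many times faster the biased run is (values above 1 mean a speedup)."""
    return baseline_cost / biased_cost
```

Because the training and testing lists of each pair are disjoint, no instance contributes to the bias that is evaluated on it.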
23. More Results to be Published Soon
• Nearly identical speedups if bias is based on problems of smaller size.
• Significant speedups even if bias is based on another class of ADFs (e.g. models from NK landscapes used to solve MVC).
• Nearly multiplicative speedups in combination with other efficiency enhancements (e.g. sporadic model building).
• So far not a single problem class for which the bias does not yield significant speedups.
24. Results Applicable in Other Contexts
• Approach can be applied to other model-directed optimizers, such as ECGA, LTGA, or mGA.
• Approach can be applied to other problem classes for which a distance metric can be defined, such as QAP or scheduling problems.
• This work demonstrates the potential, but more work remains to be done in the future.
25. Summary and Conclusions
• Proposed a practical approach to using models from prior runs of model-directed optimizers to bias optimization of future problem instances.
• Demonstrated significant speedups across a number of problem domains and settings, including a number of scenarios that are not possible with related techniques proposed in the past.
• Approach is ready to be applied in a different context.
26. Acknowledgments
• Support was provided by
– NSF grants ECS-0547013 and IIS-1115352.
– ITS at the University of Missouri in St. Louis.
– University of Missouri Bioinformatics Consortium.
• Get the papers at
http://medal-lab.org/files/2012001.pdf
http://medal-lab.org/files/2012004.pdf