2. Course programme
9.30-10.00 Introduction (Andrew Blake)
10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)
11.00-11.15 Coffee break
11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)
12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)
13.00-14.00 Lunch break
14.00-15.00 Transformation and move-making methods (Pushmeet Kohli)
15.00-15.30 Speed and Efficiency (Pushmeet Kohli)
15.30-15.45 Coffee break
15.45-16.15 Comparison of Methods (Carsten Rother)
16.30-17.30 Recent Advances: Dual-decomposition, higher-order, etc.
(Carsten Rother + Pawan Kumar)
All course material will be available online (after the conference):
http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/
3. E(x) = ∑i fi(xi) + ∑ij gij(xi,xj) + ∑c hc(xc)
          unary       pairwise         higher order
Example (image segmentation):
E(x) = ∑i ci xi + ∑i,j dij |xi−xj|
E: {0,1}^n → ℝ, n = number of pixels
4. Space of Problems (n = number of variables)
[Figure: where does the segmentation energy sit? NP-hard problems in general
(e.g. MAXCUT, pairwise CSP); submodular functions, minimizable in O(n^6);
tree-structured problems, solvable in O(n^3).]
7. A pseudo-boolean function f: {0,1}^n → ℝ is submodular if
f(A) + f(B) ≥ f(A∨B) + f(A∧B) for all A, B ∈ {0,1}^n
(∨ = element-wise OR, ∧ = element-wise AND)
Example: n = 2, A = [1,0], B = [0,1]:
f([1,0]) + f([0,1]) ≥ f([1,1]) + f([0,0])
Property: a sum of submodular functions is submodular.
The binary image segmentation energy is submodular (for dij ≥ 0):
E(x) = ∑i ci xi + ∑i,j dij |xi−xj|
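The definition can be checked mechanically for small n. A minimal Python sketch (the weights c and d below are arbitrary example values, not from the slides):

```python
from itertools import product

def energy(x, c, d):
    """Binary segmentation energy E(x) = sum_i c_i x_i + sum_ij d_ij |x_i - x_j|."""
    unary = sum(ci * xi for ci, xi in zip(c, x))
    pairwise = sum(dij * abs(x[i] - x[j]) for (i, j), dij in d.items())
    return unary + pairwise

def is_submodular(f, n):
    """Check f(A) + f(B) >= f(A or B) + f(A and B) for all A, B in {0,1}^n."""
    for A in product([0, 1], repeat=n):
        for B in product([0, 1], repeat=n):
            union = tuple(a | b for a, b in zip(A, B))
            inter = tuple(a & b for a, b in zip(A, B))
            if f(A) + f(B) < f(union) + f(inter):
                return False
    return True

# Three pixels in a chain: arbitrary unary costs, non-negative pairwise weights
c = [2.0, -1.0, 3.0]
d = {(0, 1): 1.5, (1, 2): 0.5}
print(is_submodular(lambda x: energy(x, c, d), 3))  # True
```

With a negative pairwise weight the check fails, which is why the slides require dij ≥ 0.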
8. Discrete Analogues of Concave Functions [Lovász, '83]
Widely applied in Operations Research
Applications in Machine Learning:
MAP inference in Markov Random Fields
Clustering [Narasimhan, Jojic & Bilmes, NIPS 2005]
Structure learning [Narasimhan & Bilmes, NIPS 2006]
Maximizing the spread of influence through a social network
[Kempe, Kleinberg & Tardos, KDD 2003]
9. Polynomial time algorithms
Ellipsoid algorithm: [Grötschel, Lovász & Schrijver '81]
First strongly polynomial algorithms: [Iwata et al. '00] [A. Schrijver '00]
Current best: O(n^5 Q + n^6), where Q is the function evaluation time [Orlin '07]
Symmetric functions, E(x) = E(1−x): can be minimized in O(n^3)
Minimizing pairwise submodular functions
E(x) = ∑i fi(xi) + ∑ij gij(xi,xj)
can be transformed to st-mincut/max-flow [Hammer, 1965]
Very low empirical running time, ~O(n)
10. Graph (V, E, C)
Vertices V = {v1, v2, ..., vn}
Edges E = {(v1, v2), ...}
Costs C = {c(1, 2), ...}
[Figure: example graph; source and sink connected through v1 and v2 with edge costs 2, 9, 2, 1, 5, 4.]
11. What is an st-cut?
[Figure: the example graph from slide 10.]
12. What is an st-cut?
An st-cut (S, T) partitions the nodes between source and sink.
What is the cost of an st-cut? The sum of the costs of all edges going from S to T.
[Figure: a cut of the example graph with cost 5 + 1 + 9 = 15.]
13. What is the st-mincut?
The st-cut with the minimum cost.
[Figure: the minimum cut of the example graph, with cost 2 + 2 + 4 = 8.]
14. Construct a graph such that:
1. any st-cut corresponds to an assignment of x, and
2. the cost of the cut equals the energy of x: E(x).
Then the st-mincut gives the minimum-energy assignment.
[Hammer, 1965] [Kolmogorov and Zabih, 2002]
15. E(x) = ∑i θi(xi) + ∑i,j θij(xi,xj)
is (pairwise) submodular if, for all ij,
θij(0,1) + θij(1,0) ≥ θij(0,0) + θij(1,1)
Equivalent (transformable) to
E(x) = ∑i ci xi + ∑i,j cij xi(1−xj), with cij ≥ 0
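One way to see this transformation: any pairwise potential θij can be written as a constant plus unary terms plus cij·xi(1−xj), and cij ≥ 0 exactly when θij is submodular. A small check of this decomposition (the particular θij values are made-up examples):

```python
def decompose(t00, t01, t10, t11):
    """Write theta(xi, xj) as const + a*xi + b*xj + c*xi*(1 - xj).
    The coefficient c is >= 0 exactly when theta is submodular."""
    const = t00
    b = t01 - t00
    a = t11 - t01
    c = t10 + t01 - t00 - t11  # the submodularity slack
    return const, a, b, c

# A submodular example: theta(0,1) + theta(1,0) = 9 >= 5 = theta(0,0) + theta(1,1)
theta = {(0, 0): 2.0, (0, 1): 5.0, (1, 0): 4.0, (1, 1): 3.0}
const, a, b, c = decompose(theta[0, 0], theta[0, 1], theta[1, 0], theta[1, 1])
assert c >= 0
for (xi, xj), val in theta.items():
    assert const + a * xi + b * xj + c * xi * (1 - xj) == val
print(c)  # 4.0
```

In the slide's form, the linear terms a·xi and b·xj are folded into the unary coefficients ci.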
25. Solve the dual maximum-flow problem:
compute the maximum flow between source and sink such that
Edges: flow ≤ capacity
Nodes: flow in = flow out
Min-cut/Max-flow theorem: in every network with non-negative capacities,
the maximum flow equals the cost of the st-mincut.
[Figure: the example graph.]
27. Augmenting-path based algorithms
1. Find a path from source to sink with positive residual capacity.
2. Push the maximum possible flow through this path.
3. Repeat until no augmenting path can be found.
[Animation, slides 27-37, on the example graph:
flow = 0 + 2: push 2 along source→v1→sink;
flow = 2 + 4: push 4 along source→v2→sink;
flow = 6 + 2: push 2 along source→v2→v1→sink;
flow = 8: no augmenting path remains, matching the st-mincut cost of 8.]
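The three steps can be sketched as a plain Edmonds-Karp max-flow (BFS picks the augmenting paths; a simple stand-in, not the BK algorithm of the later slides), run here on the example graph whose st-mincut cost is 8:

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: 1. find an augmenting path by BFS, 2. push the
    maximum possible flow through it, 3. repeat until no path remains."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    nodes = set(capacity) | {v for nbrs in capacity.values() for v in nbrs}
    for u in nodes:
        residual.setdefault(u, {})
    for u in list(residual):
        for v in residual[u]:
            residual[v].setdefault(u, 0)  # reverse (residual) edges
    flow = 0
    while True:
        # 1. BFS for a source-sink path with positive residual capacity
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # 3. no augmenting path left: flow is maximal
        # 2. push the bottleneck capacity along the path
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        flow += push

# The example graph of the slides; its st-mincut cost is 2 + 2 + 4 = 8.
capacity = {
    's': {'v1': 2, 'v2': 9},
    'v1': {'v2': 1, 't': 5},
    'v2': {'v1': 2, 't': 4},
}
print(max_flow(capacity, 's', 't'))  # 8
```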
56. Ford-Fulkerson: choose any augmenting path (n = #nodes, m = #edges)
[Figure: the classic worst case; source→a1, source→a2, a1→sink and a2→sink
all have capacity 1000, while the middle edge a1-a2 has capacity 1. If every
augmenting path is routed through the middle edge, each augmentation pushes
only 1 unit.]
We will have to perform 2000 augmentations!
Worst-case complexity: O(m × total_flow)
(a pseudo-polynomial bound: it depends on the flow value)
58. Specialized algorithms for vision problems
Grid graphs, low connectivity (m ~ O(n))
Dual search tree augmenting path algorithm [Boykov and Kolmogorov PAMI 2004]
• Finds approximate shortest augmenting paths efficiently
• High worst-case time complexity
• Empirically outperforms other algorithms on vision problems
Efficient code available on the web:
http://www.adastral.ucl.ac.uk/~vladkolm/software.html
59. E(x) = ∑i ci xi + ∑i,j dij |xi−xj|
E: {0,1}^n → ℝ, 0 → fg, 1 → bg, n = number of pixels
x* = arg minx E(x)
How to minimize E(x)?
60. Graph *g;
for all pixels p
    /* add a node to the graph */
    nodeID(p) = g->add_node();
    /* set the costs of the terminal edges */
    set_weights(nodeID(p), fgCost(p), bgCost(p));
end
for all adjacent pixels p, q
    /* set the cost of the n-edge */
    add_weights(nodeID(p), nodeID(q), cost(p,q));
end
g->compute_maxflow();
for all pixels p
    /* the label of pixel p (0 or 1) */
    label(p) = g->is_connected_to_source(nodeID(p));
end
[Figure: graph with terminal nodes Source (0) and Sink (1).]
61.-63. [Animation: the same code as slide 60, with the graph built up step
by step; terminal edges with weights bgCost(a1), bgCost(a2) from the source
and fgCost(a1), fgCost(a2) to the sink, the n-edge with weight cost(p,q),
and the resulting labelling a1 = bg, a2 = fg.]
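A runnable Python analogue of this construction (a sketch, not the actual maxflow library API; the helper `segment` and its tiny 1-D "image" with its costs are invented for illustration). Terminal edges carry bgCost on the source side and fgCost on the sink side, n-edges carry cost(p,q); after max-flow, pixels still reachable from the source take label 0 (fg):

```python
from collections import deque

def segment(fg_cost, bg_cost, pair_cost):
    """Build the segmentation graph of the slides and solve it with a small
    BFS-based max-flow. source->p has weight bgCost(p) (paid if p ends up bg),
    p->sink has weight fgCost(p) (paid if p ends up fg); n-edges cost(p,q)."""
    n = len(fg_cost)
    S, T = n, n + 1  # ids of the source and sink nodes
    cap = [[0] * (n + 2) for _ in range(n + 2)]  # residual capacity matrix
    for p in range(n):
        cap[S][p] = bg_cost[p]
        cap[p][T] = fg_cost[p]
    for (p, q), w in pair_cost.items():
        cap[p][q] += w
        cap[q][p] += w
    while True:  # Edmonds-Karp on the residual matrix
        parent = {S: None}
        queue = deque([S])
        while queue and T not in parent:
            u = queue.popleft()
            for v in range(n + 2):
                if cap[u][v] > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if T not in parent:
            break
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
    # pixels still reachable from the source are on the source side: label 0 (fg)
    reachable, queue = {S}, deque([S])
    while queue:
        u = queue.popleft()
        for v in range(n + 2):
            if cap[u][v] > 0 and v not in reachable:
                reachable.add(v)
                queue.append(v)
    return [0 if p in reachable else 1 for p in range(n)]

# A 5-pixel "image": fg is cheap on the left, bg on the right,
# with a smoothness cost of 2 between neighbouring pixels.
fg = [1, 1, 3, 5, 5]   # fgCost(p): cost of labelling p as fg (0)
bg = [5, 5, 4, 1, 1]   # bgCost(p): cost of labelling p as bg (1)
smooth = {(p, p + 1): 2 for p in range(4)}
print(segment(fg, bg, smooth))  # [0, 0, 0, 1, 1]
```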
65. Advances in Markov Random Fields for Computer Vision
MIT Press, summer 2010
Topics of this course and much, much more
Contributors: the usual suspects (lecturers on this course plus Boykov,
Kolmogorov, Weiss, Freeman, ...)
One for the office and one for home!
www.research.microsoft.com/vision/MRFbook
69. E(x) = ∑i θi(xi) + ∑i,j θij(xi,xj)
is non-submodular if θij(0,1) + θij(1,0) < θij(0,0) + θij(1,1) for some ij.
Minimizing general non-submodular functions is NP-hard.
A commonly used approach is to solve a relaxation of the problem.
[Boros and Hammer, '02]
75. QPBO gives a partial labelling of the variables p, q, r, s, t
(some labelled 0, the rest left unlabelled, "?").
Probe node p: fix p = 0 and p = 1 in turn and run QPBO on each restricted
problem. [Figure: the two resulting partial labellings of q, r, s, t.]
What can we say about the variables?
• r is always 0
• s is always equal to q
• t is 0 when q = 1
76. Probe nodes in some order until the energy no longer changes.
The simplified energy preserves global optimality
and (sometimes) yields the global minimum.
The result depends slightly on the probing order.
79. [Figure: image and its colour-appearance-based segmentation result,
motivating the need for a more human-like segmentation.]
[Kumar et al, 05] [Kohli et al, 06,08]
80. x – binary image segmentation (xi ∈ {0,1})
ω – non-local parameter (lives in some large set Ω)
E(x,ω) = C(ω) + ∑i θi(ω, xi) + ∑i,j θij(ω, xi, xj)
(constant + unary potentials + pairwise potentials, the latter ≥ 0)
[Figure: pose-specific prior; a stickman model ω induces a rough shape prior θi(ω, xi).]
[Kohli et al, 06,08]
81. x – binary image segmentation (xi ∈ {0,1})
ω – non-local parameter (lives in some large set Ω)
E(x,ω) = C(ω) + ∑i θi(ω, xi) + ∑i,j θij(ω, xi, xj)
(constant + unary potentials + pairwise potentials, the latter ≥ 0)
[Figure: here ω is a template with position, scale and orientation.]
[Kohli et al, 06,08] [Lempitsky et al, 08]
82. x – binary image segmentation (xi ∈ {0,1})
ω – non-local parameter (lives in some large set Ω)
E(x,ω) = C(ω) + ∑i θi(ω, xi) + ∑i,j θij(ω, xi, xj)
(constant + unary potentials + pairwise potentials, the latter ≥ 0)
{x*, ω*} = arg minx,ω E(x,ω)
• Standard "graph cut" energy if ω is fixed
[Kohli et al, 06,08] [Lempitsky et al, 08]
83. Local method: gradient descent over ω
ω* = arg minω minx E(x,ω)
The inner minimization over x is submodular.
[Figure: descent trajectory towards ω*.]
[Kohli et al, 06,08]
84. Local method: gradient descent over ω
ω* = arg minω minx E(x,ω)
Successive energies E(x,ω1) and E(x,ω2) are similar,
so dynamic graph cuts can reuse computation:
15-20 times speedup!
[Kohli et al, 06,08]
89. 30,000,000 shapes
Exhaustive search: 30,000,000 mincuts
Branch-and-Mincut: 12,000 mincuts
Speed-up: 2,500 times (30 seconds per 312×272 image)
[Lempitsky et al, 08]
90. Left ventricle epicardium tracking (work in progress)
[Figure: original sequence; Branch & Bound segmentation with a shape prior
learned from other sequences, compared to the result without a shape prior.]
5,200,000 templates, ≈20 seconds per frame, speed-up 1,150 times
Data courtesy: Dr Harald Becher, Department of Cardiovascular Medicine, University of Oxford
[Lempitsky et al, 08]
92. Pairwise functions have limited expressive power:
they cannot incorporate region-based likelihoods and priors.
Field of Experts model [Roth & Black, CVPR 2005] [Potetz, CVPR 2007]
Minimizing curvature [Woodford et al., CVPR 2008]
Other examples:
[Rother, Kolmogorov, Minka & Blake, CVPR 2006]
[Komodakis and Paragios, CVPR 2009]
[Rother, Kohli, Feng, Jia, CVPR 2009]
[Ishikawa, CVPR 2009]
and many others ...
93. E(x) = ∑i ci xi + ∑i,j dij |xi−xj|
E: {0,1}^n → ℝ, 0 → fg, 1 → bg, n = number of pixels
[Figure: image, unary cost, segmentation.]
[Boykov and Jolly '01] [Blake et al. '04] [Rother, Kolmogorov and Blake '04]
94. Patch dictionary (tree)
h(xp) = { C1 if xi = 0 for all i ∈ p
        { Cmax otherwise
[Figure: patch p; costs C1 and Cmax.]
[Kohli et al. '07]
95. E(x) = ∑i ci xi + ∑i,j dij |xi−xj| + ∑p hp(xp)
E: {0,1}^n → ℝ, 0 → fg, 1 → bg, n = number of pixels
hp(xp) = { C1 if xi = 0 for all i ∈ p
         { Cmax otherwise
[Kohli et al. '07]
96. E(x) = ∑i ci xi + ∑i,j dij |xi−xj| + ∑p hp(xp)
E: {0,1}^n → ℝ, 0 → fg, 1 → bg, n = number of pixels
[Figure: image, pairwise segmentation, final segmentation.]
[Kohli et al. '07]
97. Exact transformation:
higher-order submodular function → pairwise submodular function → st-mincut
Billionnet and Minoux [DAM 1985]
Kolmogorov & Zabih [PAMI 2004]
Freedman & Drineas [CVPR 2005]
Kohli, Kumar, Torr [CVPR 2007, PAMI 2008]
Kohli, Ladicky, Torr [CVPR 2008, IJCV 2009]
Ramalingam, Kohli, Alahari, Torr [CVPR 2008]
Zivny et al. [CP 2008]
98. Higher-order function → pairwise submodular function
Identified transformable families of higher-order functions such that:
1. only a constant or polynomial number of auxiliary variables (a) is added
2. all resulting pairwise functions (g) are submodular
99. Example: H(x) = F(∑i xi), with F concave
[Figure: concave H(x) plotted against ∑i xi.]
[Kohli et al. '08]
100. Simple example using auxiliary variables:
f(x) = { 0 if all xi = 0
       { C1 otherwise
x ∈ L = {0,1}^n
minx f(x) = minx, a∈{0,1} C1 a + C1 ā ∑i xi
(higher-order submodular function → quadratic submodular function)
∑xi < 1 → a = 0 (ā = 1) → f(x) = 0
∑xi ≥ 1 → a = 1 (ā = 0) → f(x) = C1
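The construction is easy to verify exhaustively for small n; a minimal check (C1 is an arbitrary example value):

```python
from itertools import product

C1 = 3.0

def f(x):
    """Higher-order potential: 0 if all x_i are 0, C1 otherwise."""
    return 0.0 if not any(x) else C1

def f_aux(x):
    """The same function via a binary auxiliary variable a:
    f(x) = min over a in {0,1} of C1*a + C1*(1-a)*sum(x_i),
    which is quadratic (and submodular) in (x, a)."""
    return min(C1 * a + C1 * (1 - a) * sum(x) for a in (0, 1))

for x in product([0, 1], repeat=4):
    assert f(x) == f_aux(x)
print("equivalent on all 2^4 assignments")
```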
101. minx f(x) = minx, a∈{0,1} C1 a + C1 ā ∑i xi
(higher-order submodular function → quadratic submodular function)
[Figure: the function C1 ∑ xi plotted against ∑ xi, with the constant level C1.]
102. [Figure: the two branches, C1 ∑ xi for a = 0 and the constant C1 for
a = 1; minimizing over a takes their lower envelope, and the lower envelope
of concave functions is concave.]
103. minx f(x) = minx, a∈{0,1} f1(x) a + f2(x) ā
(higher-order submodular function → quadratic submodular function)
[Figure: the general case; two concave functions f1(x) and f2(x) of ∑ xi,
whose lower envelope is again concave.]
104. [Figure: as slide 103, with the regions where a = 0 and a = 1 attain
the lower envelope marked.]
105. Transforming Potentials with 3 variables
[Woodford, Fitzgibbon, Reid, Torr, CVPR 2008]
Transforming general “sparse” higher order functions
[Rother, Kohli, Feng, Jia, CVPR 2009]
[Ishikawa, CVPR 2009]
[Komodakis and Paragios, CVPR 2009]
106. [Figure: a pairwise energy P(x) is learned from a training image; a
test image with 60% noise is restored by minimizing P(x) with st-mincut or
max-product message passing.]
107. [Figure: the same experiment; the pairwise result does not preserve the
higher-order structure of the training image.]
108. Minimize: E(x) = P(x) + ∑c hc(xc)
where hc: {0,1}^|c| → ℝ is a higher-order function on 10×10 patches
(|c| = 100), assigning a cost to each of the 2^100 possible labellings!
Exploit the structure of the function to transform it to a pairwise function.
[Figure: patterns p1, p2, p3.]
109. [Figure: training image and learned patterns; on the test image (60%
noise), the higher-order result is much cleaner than the pairwise result.]
[Joint work with Carsten Rother]
111. miny E(y) = ∑i fi(yi) + ∑i,j gij(yi,yj)
y ∈ Labels L = {l1, l2, …, lk}
Two approaches:
Exact transformation to QPBF
[Roy and Cox '98] [Ishikawa '03] [Schlesinger & Flach '06]
[Ramalingam, Alahari, Kohli, and Torr '08]
Move making algorithms
112. So what is the problem?
Multi-label problem: Em(y1, y2, ..., yn), yi ∈ L = {l1, l2, …, lk}
Binary-label problem: Eb(x1, x2, ..., xm), xi ∈ {0,1}
such that, with Y and X the sets of feasible solutions:
1. there is a one-to-one encoding function T: X → Y
2. arg min Em(y) = T(arg min Eb(x))
113. • Popular encoding scheme
[Roy and Cox '98, Ishikawa '03, Schlesinger & Flach '06]
# nodes = n × k, # pairwise terms = m × k²
114. • Popular encoding scheme
[Roy and Cox '98, Ishikawa '03, Schlesinger & Flach '06]
Ishikawa's result:
E(y) = ∑i θi(yi) + ∑i,j θij(yi,yj)
y ∈ Labels L = {l1, l2, …, lk}
is exactly solvable if θij(yi,yj) = g(|yi−yj|) with g convex.
# nodes = n × k, # pairwise terms = m × k²
[Figure: a convex cost g(|yi−yj|).]
115. • Popular encoding scheme
[Roy and Cox '98, Ishikawa '03, Schlesinger & Flach '06]
Schlesinger & Flach '06:
E(y) = ∑i θi(yi) + ∑i,j θij(yi,yj)
y ∈ Labels L = {l1, l2, …, lk}
is exactly solvable if, for all neighbouring labels li, li+1 and lj, lj+1,
θij(li+1, lj) + θij(li, lj+1) ≥ θij(li, lj) + θij(li+1, lj+1)
# nodes = n × k, # pairwise terms = m × k²
116. [Figure: image, MAP solution, scanline algorithm.]
[Roy and Cox, 98]
117. Applicability:
θij(yi,yj) = g(|yi−yj|) with g convex cannot represent truncated costs,
so it is non-robust (no discontinuity-preserving potentials,
Blake & Zisserman '83, '87).
Computational cost:
Very high, since the problem size is |variables| × |labels|.
Grey-level denoising of a 1 Mpixel image needs ~2.5 × 10^8 graph nodes.
118. Other "less known" algorithms:
Ishikawa transformation [03]: arbitrary unary, convex and symmetric pairwise, T(nk, mk²)
Schlesinger transformation [06]: arbitrary unary, submodular pairwise, T(nk, mk²)
Hochbaum [01]: linear unary, convex and symmetric pairwise, T(n, m) + n log k
Hochbaum [01]: convex unary, convex and symmetric pairwise, O(mn log n log nk)
T(a,b) = complexity of max-flow with a nodes and b edges
119. miny E(y) = ∑i fi(yi) + ∑i,j gij(yi,yj)
y ∈ Labels L = {l1, l2, …, lk}
Exact transformation to QPBF
Move making algorithms
[Boykov, Veksler and Zabih 2001] [Woodford, Fitzgibbon, Reid, Torr, 2008]
[Lempitsky, Rother, Blake, 2008] [Veksler, 2008] [Kohli, Ladicky, Torr 2008]
121. [Figure: energy over the solution space, with the current solution, its
search neighbourhood, and the optimal move within that neighbourhood.]
122. [Figure: energy over the solution space; the move space (t) maps to a
search neighbourhood around the current solution xc.]
Key property: a bigger move space means
• better solutions
• but finding the optimal move becomes hard
123. Minimizing pairwise functions [Boykov, Veksler and Zabih, PAMI 2001]
• Series of locally optimal moves
• Each move reduces the energy
• The optimal move is found by minimizing a submodular function
Move space (t): 2^n, space of solutions (x): L^n
(n = number of variables, L = number of labels)
Extensions to minimize higher-order functions: Kohli et al. '07, '08, '09
124. x = t x1 + (1−t) x2
(new solution from the current solution x1 and a second solution x2)
Move energy: Em(t) = E(t x1 + (1−t) x2); minimize over the move variables t.
For certain x1 and x2, the move energy is a submodular QPBF.
[Boykov, Veksler and Zabih 2001]
125. αβ-swap: variables labelled α or β can swap their labels.
[Boykov, Veksler and Zabih 2001]
126. αβ-swap: variables labelled α or β can swap their labels.
[Figure: swapping Sky and House in a labelling with Tree, Ground, House, Sky.]
[Boykov, Veksler and Zabih 2001]
127. The swap move energy is submodular if:
Unary potentials: arbitrary
Pairwise potentials: semi-metric, i.e.
θij(la,lb) ≥ 0 and θij(la,lb) = 0 ⇔ a = b
Examples: Potts model, truncated convex
[Boykov, Veksler and Zabih 2001]
128. α-expansion: variables either take the label α or retain their current label.
[Boykov, Veksler and Zabih 2001]
129. α-expansion: variables either take the label α or retain their current label.
[Figure: initialized with Tree; status: expanding Ground on a labelling that
also contains House and Sky.]
[Boykov, Veksler and Zabih 2001]
130. The expansion move energy is submodular if:
Unary potentials: arbitrary
Pairwise potentials: metric (semi-metric + triangle inequality), i.e.
θij(la,lb) + θij(lb,lc) ≥ θij(la,lc)
Examples: Potts model, truncated linear
(truncated quadratic is not a metric, so it cannot be handled)
[Boykov, Veksler and Zabih 2001]
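For intuition, the α-expansion loop can be sketched with the binary move solved by brute force instead of graph cuts (only viable for toy sizes; the chain Potts instance below is invented for illustration):

```python
from itertools import product

def energy(y, unary, lam):
    """Potts energy on a chain: sum_i unary[i][y_i] + lam * [y_i != y_{i+1}]."""
    e = sum(unary[i][yi] for i, yi in enumerate(y))
    e += lam * sum(y[i] != y[i + 1] for i in range(len(y) - 1))
    return e

def alpha_expansion(unary, lam, labels, sweeps=3):
    """Move-making: each expansion lets every variable either keep its label
    or switch to alpha. The binary move is solved by brute force over
    t in {0,1}^n here; graph cuts would be used in practice."""
    n = len(unary)
    y = [0] * n  # initial labelling
    for _ in range(sweeps):
        for alpha in labels:
            best_y, best_e = y, energy(y, unary, lam)
            for t in product([0, 1], repeat=n):
                move = [alpha if ti else yi for yi, ti in zip(y, t)]
                e = energy(move, unary, lam)
                if e < best_e:
                    best_y, best_e = move, e
            y = best_y  # each accepted move reduces the energy
    return y

# 5 variables, 3 labels: unary costs favour labels 0, 0, 1, 1, 2.
unary = [[0, 5, 5], [0, 5, 5], [5, 0, 5], [5, 0, 5], [5, 5, 0]]
print(alpha_expansion(unary, lam=1, labels=[0, 1, 2]))  # [0, 0, 1, 1, 2]
```

The Potts pairwise cost is a metric, so on this instance the move energies would also be submodular for a graph-cut solver.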
131. Expansion and swap can be derived as a primal-dual scheme:
the dual solution gives a lower bound on the energy of the solution.
Weak guarantee: E(x) ≤ 2 (dmax/dmin) E(x*) for θij(li,lj) = g(|li−lj|)
[Figure: dmax and dmin on the pairwise cost curve.]
[Komodakis et al 05, 07]
132. x = t x1 + (1−t) x2 (new solution from a first and a second solution);
minimize over the move variables t.
Expansion move: first solution = old solution, second solution = all-α;
guarantee for metric potentials.
Fusion move: first and second solutions arbitrary; no guarantee.
Move functions can be non-submodular!
133. x = t x1 + (1−t) x2, where x1 and x2 can be continuous.
Optical flow example: solutions x1 and x2 from two different methods are
fused into the final solution x.
[Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008]
134. Move variables can be multi-label:
x = (t==1) x1 + (t==2) x2 + … + (t==k) xk
The optimal move is found using the Ishikawa transform.
Useful for minimizing energies with truncated convex pairwise potentials:
θij(yi,yj) = min(|yi−yj|², T)
[Veksler, 2007]
135. [Figure: image denoising; noisy input and the result of range
expansion moves.]
[Veksler, 2008]
142. Reuse computation across similar problems.
Frame 1: minimize EA → solution SA.
Frame 2: EB differs only slightly from EA, so the differences between A and B
yield a simpler problem EB* that is fast to minimize → solution SB.
3-100,000 times speedup!
Key idea: reparametrization.
[Kohli & Torr, ICCV05, PAMI07] [Komodakis & Paragios, CVPR07]
143. Original energy:
E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2
Reparametrized energy (same value on every assignment):
E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2
New energy:
E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 7a1ā2 + ā1a2
New reparametrized energy:
E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2 + 5a1ā2
[Kohli & Torr, ICCV05, PAMI07] [Komodakis & Paragios, CVPR07]
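The equivalence of the original and reparametrized energies can be checked on all four assignments (writing ā = 1 − a):

```python
from itertools import product

def E_orig(a1, a2):
    """Original energy from the slide (b = 1 - a denotes the complement)."""
    b1, b2 = 1 - a1, 1 - a2
    return 2*a1 + 5*b1 + 9*a2 + 4*b2 + 2*a1*b2 + b1*a2

def E_repar(a1, a2):
    """Reparametrized energy: identical on every assignment; the constant 8
    (the flow already pushed) is made explicit."""
    b1, b2 = 1 - a1, 1 - a2
    return 8 + b1 + 3*a2 + 3*b1*a2

for a1, a2 in product([0, 1], repeat=2):
    assert E_orig(a1, a2) == E_repar(a1, a2)
print("reparametrization preserves the energy on all assignments")
```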
144. [Diagram: original problem (large) → approximation algorithm (slow) →
approximate solution.]
[Alahari, Kohli & Torr CVPR '08]
145. [Diagram: original problem (large) → fast partially optimal algorithm
[Kovtun '03] [Kohli et al. '09] → solved part (global optima) + reduced
problem → approximation algorithm (fast) → approximate solution.]
[Alahari, Kohli & Torr CVPR '08]
146. 3-100 times speed-up.
Tree-reweighted message passing on the original problem (large): 9.89 sec.
With the fast partially optimal algorithm [Kovtun '03] [Kohli et al. '09],
part of the problem is solved to global optimality and tree-reweighted
message passing runs on the reduced problem: total time 0.30 sec.
[Figure: both pipelines produce the same labelling with sky, building,
airplane and grass.]
[Alahari, Kohli & Torr CVPR '08]
147. Minimization with complex higher-order functions:
connectivity, counting constraints
Hybrid algorithms
Connections between message passing algorithms (BP, TRW) and graph cuts
148. Advances in Markov Random Fields for Computer Vision,
MIT Press, summer 2010 (see slide 65):
www.research.microsoft.com/vision/MRFbook