Patch Matching with Polynomial Exponential Families and Projective Divergences
Frank Nielsen¹, Richard Nock²
¹École Polytechnique & Sony CSL
²Data61 & Australian National University
9th International Conference on Similarity Search and Applications (SISAP)
The “Patch Matching” problem
Given a query patch, patch matching = finding “similar” patches
in a target image.
Handle noise, symmetry [16, 19], zoom, smooth deformation, etc.
Patch Matching: Baseline (naïve) algorithm
Sum of squared differences (SSD) of the pixel intensities:

  D(Is, It(x0)) = Σ_{x∈[1,ws]×[1,hs]} (Is(x) − It(x + x0))²

(= squared Euclidean distance on vectorized patch intensities),
but a small image shift can cause a large squared ℓ2 distance [18].
The naïve brute-force algorithm in O(wt ht ws hs) is too time-consuming!
→ extend to multiple color channels with a sum/max of per-channel distances
Alternative method: the Fourier phase correlation method [11, 8]
in O(wh log(wh)), but limited to translations
Factorize computations using Nearest Neighbor Fields [3, 4]
(+ GPU)
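The baseline SSD search above can be sketched in a few lines (a minimal NumPy sketch on a single grayscale channel; the function name is illustrative):

```python
import numpy as np

def ssd_match(image, patch):
    """Brute-force patch matching by sum of squared differences (SSD).

    Slides over every valid offset x0 = (x0, y0) and returns the one
    minimizing sum_x (patch(x) - image(x + x0))^2.  O(wt*ht*ws*hs) time.
    """
    ht, wt = image.shape
    hs, ws = patch.shape
    best, best_xy = np.inf, None
    for y0 in range(ht - hs + 1):
        for x0 in range(wt - ws + 1):
            d = image[y0:y0 + hs, x0:x0 + ws] - patch
            ssd = float(np.sum(d * d))
            if ssd < best:
                best, best_xy = ssd, (x0, y0)
    return best_xy, best
```

A patch cropped from the image itself is recovered at its own offset with SSD 0, illustrating why the quadratic-in-both-sizes cost dominates for large patches.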
Large patch PM application: Zoom stitching
Nested image mosaicing: one image is a patch fully contained
inside another image
Experiments: brute-force baseline algorithm
Image of size 1280, patch of size 120 × 120:
  SSD         30.7 s
  SSDRGBSum   96.7 s
  SSDRGBMax  208.7 s
(HP EliteBook 840 G1, i7-4600U CPU @ 2.1 GHz, 8 GB RAM)
PEF-PM: Polynomial Exponential Family-PM
Polynomial exponential family:

  p(x; θ) = exp( Σ_{i=1}^D θi x^i − F(θ) )

t(x) = (x^1, ..., x^D) = sufficient statistics [5, 1], θ = natural
parameters, ⟨x, y⟩ = x⊤y ∈ ℝ = Euclidean inner product.

  F(θ) = log ∫ exp(⟨θ, t(x)⟩) dx = log-normalizer

→ However, F is not available in closed form!
Estimate θ using the Score Matching Estimator [10]
(SME; instead of the Maximum Likelihood Estimator, MLE)
Define a projective distance D between PEFs to cope with
unnormalized distributions: D(λq, λ′q′) = D(q, q′) for any λ, λ′ > 0
Accelerate batched SMEs using integral images
= Summed Area Tables (SATs)
Score Matching Estimator [10] (SME) for PEFs
Divergence estimator: minimize the Fisher divergence

  J(p : q) = (1/2) ∫ Σ_{i=1}^d ( ∂ log p(x)/∂xi − ∂ log q(x)/∂xi )² p(x) dx,

with J(p : q) = 0 iff p = λq for some λ > 0. The Fisher divergence is a
one-sided projective divergence: J(p : λq) = J(p : q).
For univariate polynomial exponential families, the SME is

  θ_SM(s) = − (Es[A(x)])⁻¹ (Es[b(x)]),

where A(x) = [ti′(x) tj′(x)] (a D × D symmetric matrix) and
b(x) = [t1″(x) ... tD″(x)] (a D-dimensional column vector). For
PEFs, ti′(x) = i x^{i−1} and ti″(x) = i(i − 1) x^{i−2}, and therefore
A = Es[A(x)] = [aij = ij M_{i+j−2}(s)] and
b = Es[b(x)] = [bj = j(j − 1) M_{j−2}(s)], where
M_l(s) = (1/s) Σ_i x_i^l is the l-th sample raw moment.
Accelerating batched SMEs for PEFs using SATs
To perform score matching in O_D(1), use SATs:
Summed Area Tables [7] (SATs), also called integral images:

  ∀(x, y),  F(x, y) = Σ_{x′≤x, y′≤y} f(x′, y′)

Single-pass computation in preprocessing.
Then the patch PEF SME (f = t′, t″; a rectangle with bottom-left
corner (x0, y0) and top-right corner (x1, y1)) in O(1) time:

  Σ_{x0≤x≤x1, y0≤y≤y1} f(x, y) = F(x1, y1) + F(x0, y0) − F(x0, y1) − F(x1, y0)
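The SAT trick can be sketched in a few lines (names illustrative; this follows the slide's four-corner identity, in which the bottom-left corner is exclusive):

```python
import numpy as np

def build_sat(f):
    """Summed area table: F[y, x] = sum of f over all (x', y') <= (x, y).
    Single pass over the image via two cumulative sums."""
    return np.cumsum(np.cumsum(f, axis=0), axis=1)

def rect_sum(F, x0, y0, x1, y1):
    """Sum of f over the rectangle x0 < x <= x1, y0 < y <= y1 in O(1),
    via F(x1,y1) + F(x0,y0) - F(x0,y1) - F(x1,y0)."""
    return F[y1, x1] + F[y0, x0] - F[y1, x0] - F[y0, x1]
```

Building one SAT per monomial statistic x^l (l = 0, ..., 2D−2) lets every patch's moments, and hence its SME, be read off in O(1) per patch.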
Similarity between PEFs: Projective divergences
Relative entropy, called the Kullback-Leibler divergence:

  KL(p(x; θs) : p(x; θt)) = ∫_{x∈ℝ+} p(x; θs) log( p(x; θs) / p(x; θt) ) dx

amounts to a Bregman divergence for exponential families [2], but
needs F in closed form.
Projective γ-divergence [9, 15, 6] (for γ > 0):
Dγ(λq, λ′q′) = Dγ(q, q′) for any λ, λ′ > 0

  Dγ(p, q) = 1/(γ(γ − 1)) log [ ( ∫ p^γ(x) q^{1−γ}(x) dx ) / ( (∫ p(x) dx)^γ (∫ q(x) dx)^{1−γ} ) ]

  Dγ(p, q) = 1/(γ(1 + γ)) log Iγ(p, p) − (1/γ) log Iγ(p, q) + 1/(1 + γ) log Iγ(q, q),

where Iγ(p, q) = ∫_{x∈X} p(x) q(x)^γ dx.
Projective γ-divergences for exponential families
Use the symmetrized γ-divergence Sγ for patch similarity.
When γ → 0, Dγ(p, q) → KL(p, q) (a Bregman divergence for EFs).

  Iγ(θp, θq) = exp( F(θp + γθq) − F(θp) − γF(θq) ),

provided that θp + γθq ∈ Θ. The condition is always satisfied
when the natural parameter space [12, 14, 13] is a cone (since
γ > 0), as for the multivariate Gaussian, multinomial, and
Wishart distributions, to name a few.
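When F is available in closed form, Dγ and Sγ follow directly from the Iγ expression above. A sketch for the univariate exponential distribution p(x; θ) = exp(θx + log(−θ)), whose natural parameter space Θ = (−∞, 0) is a cone so that θp + γθq ∈ Θ always holds (function names are illustrative):

```python
import math

def F(theta):
    # Log-normalizer of the exponential distribution: F(theta) = -log(-theta), theta < 0.
    assert theta < 0
    return -math.log(-theta)

def I_gamma(tp, tq, gamma):
    # I_gamma(p, q) = exp(F(tp + gamma*tq) - F(tp) - gamma*F(tq)) for exponential families.
    return math.exp(F(tp + gamma * tq) - F(tp) - gamma * F(tq))

def D_gamma(tp, tq, gamma):
    # gamma-divergence from the three I_gamma terms.
    return (math.log(I_gamma(tp, tp, gamma)) / (gamma * (1 + gamma))
            - math.log(I_gamma(tp, tq, gamma)) / gamma
            + math.log(I_gamma(tq, tq, gamma)) / (1 + gamma))

def S_gamma(tp, tq, gamma):
    # Symmetrized gamma-divergence used as the patch similarity.
    return 0.5 * (D_gamma(tp, tq, gamma) + D_gamma(tq, tp, gamma))
```

D_gamma vanishes when the two parameters coincide and is positive otherwise, and S_gamma is symmetric by construction.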
Monte-Carlo stochastic estimation of Sγ
Although Dγ is projective, its exact value depends on the
intractable log-normalizer!
Approximate Iγ(p, q) by discretizing the integral using quadrature
rules, or by Monte-Carlo stochastic integration (e.g., γ = 0.1, m = 100000):

  Iγ(p, q) = ∫_{x∈X} p(x) q(x)^γ dx ≈ (1/m) Σ_{i=1}^m q(xi)^γ,

with x1, ..., xm ∼ p(x) (iid).
→ The sample variance yields a confidence interval (CI) for the
Monte-Carlo estimator.
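The Monte-Carlo step is one line once a sampler for p is available; crucially, q may be left unnormalized, which is the point of using a projective divergence. A minimal sketch (names illustrative):

```python
import numpy as np

def mc_I_gamma(sampler, q_unnorm, gamma, m=100000, rng=None):
    """Monte-Carlo estimate of I_gamma(p, q) = ∫ p(x) q(x)^gamma dx,
    using m iid draws x_1, ..., x_m from p:  (1/m) sum_i q(x_i)^gamma.
    q_unnorm may be an unnormalized density."""
    rng = rng or np.random.default_rng()
    x = sampler(rng, m)
    return float(np.mean(q_unnorm(x) ** gamma))
```

As a sanity check, with p the unit exponential density and the unnormalized q(x) = exp(−2x), the exact value is ∫ e^{−x} e^{−γ·2x} dx = 1/(1 + 2γ).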
Results of PEF-PM
Impact of the polynomial degree of PEFs, the order D of the
exponential family
[Figure: matching results for PEF orders D = 3, 4, 5, 6]
Image size 1280 × 853, patch size 100 × 100
PEFs handle patch symmetries (→ bag of pixels)
Patch matching with symmetry (reflection) detected by PEF-PM of
order 6
patch size 250 × 250, image size 960 × 640
Summary of contributions
Model patch colors using Polynomial Exponential Families
(PEFs), universal smooth (positive) densities
Estimate PEFs using Score Matching Estimator, and
accelerate the batch estimator using Summed Area Tables
(SATs)
Use a statistical projective divergence: the symmetrized
γ-divergence.
Closed-form expression for EFs with log-normalizer F
Monte-Carlo estimation without explicit log-normalizer
Handle noise robustly (γ-divergence), symmetries (color
distributions), etc. (see paper)
References I
S.-i. Amari.
Information Geometry and Its Applications.
Applied Mathematical Sciences. Springer Japan, 2016.
Arindam Banerjee, Srujana Merugu, Inderjit S Dhillon, and Joydeep Ghosh.
Clustering with Bregman divergences.
The Journal of Machine Learning Research, 6:1705–1749, 2005.
Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B Goldman.
PatchMatch: a randomized correspondence algorithm for structural image editing.
ACM Transactions on Graphics (TOG), 28(3):24, 2009.
Connelly Barnes, Eli Shechtman, Dan B Goldman, and Adam Finkelstein.
The generalized patchmatch correspondence algorithm.
In European Conference on Computer Vision (ECCV), pages 29–43. Springer, 2010.
Lawrence D Brown.
Fundamentals of statistical exponential families with applications in statistical decision theory.
Lecture Notes-monograph series, 9:i–279, 1986.
Ting-Li Chen, Dai-Ni Hsieh, Hung Hung, I-Ping Tu, Pei-Shien Wu, Yi-Ming Wu, Wei-Hau Chang,
Su-Yun Huang, et al.
γ-sup: A clustering algorithm for cryo-electron microscopy images of asymmetric particles.
The Annals of Applied Statistics, 8(1):259–285, 2014.
Franklin C Crow.
Summed-area tables for texture mapping.
ACM SIGGRAPH computer graphics, 18(3):207–212, 1984.
References II
Hassan Foroosh, Josiane B Zerubia, and Marc Berthod.
Extension of phase correlation to subpixel registration.
IEEE Transactions on Image Processing, 11(3):188–200, 2002.
Hironori Fujisawa and Shinto Eguchi.
Robust parameter estimation with a small bias against heavy contamination.
Journal of Multivariate Analysis, 99(9):2053–2081, 2008.
Aapo Hyvärinen.
Estimation of non-normalized statistical models by score matching.
Journal of Machine Learning Research, 6(Apr):695–709, 2005.
CD Kuglin.
The phase correlation image alignment method.
In Proc. Int. Conf. on Cybernetics and Society, 1975, pages 163–165, 1975.
Frank Nielsen.
Closed-form information-theoretic divergences for statistical mixtures.
In Pattern Recognition (ICPR), 21st International Conference on, pages 1723–1726. IEEE, 2012.
Frank Nielsen and Richard Nock.
A closed-form expression for the Sharma-Mittal entropy of exponential families.
Journal of Physics A: Mathematical and Theoretical, 45(3):032003, 2011.
Frank Nielsen and Richard Nock.
On the chi square and higher-order chi distances for approximating f-divergences.
IEEE Signal Processing Letters, 21(1):10–13, 2014.
Akifumi Notsu, Osamu Komori, and Shinto Eguchi.
Spontaneous clustering via minimum gamma-divergence.
Neural computation, 26(2):421–448, 2014.
References III
Viorica Patraucean, Rafael Gioi, and Maks Ovsjanikov.
Detection of mirror-symmetric image patches.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops,
pages 211–216, 2013.
Carsten Rother, Vladimir Kolmogorov, and Andrew Blake.
GrabCut: Interactive foreground extraction using iterated graph cuts.
ACM transactions on graphics (TOG), 23(3):309–314, 2004.
Lucas Theis, Aäron van den Oord, and Matthias Bethge.
A note on the evaluation of generative models.
arXiv preprint arXiv:1511.01844, 2015.
ICLR 2016.
Z. Wang, Z. Tang, and X. Zhang.
Reflection symmetry detection using locally affine invariant edge correspondence.
IEEE Transactions on Image Processing, 24(4):1297–1301, April 2015.
Perspectives and future work
Model selection of the polynomial exponential family according
to the query patch
Foreground/background detection in patches (say, using
Grabcut [17]) and matching only the foreground statistical
distributions to improve the accuracy of patch matching.
Modeling by distributions allows us to discard rectangular
windows.
Multivariate PEFs to bypass sum/max of univariate PEF color
channel distances
Handle clamped support [0, 255] instead of [0, +∞)
Qualitative results of PEF-PM
patch size 150 × 150 and image size 960 × 640
[Figure: aligned patches compared with pixel-based SSD (intensity, summed RGB) vs. PEF (D = 4) with Sγ]
Acceptance-rejection sampling for PEFs
f(x) ∝ e^{Pθ(x)} with polynomial Pθ(x) = Σ_{i=1}^D θi x^i,
θ = (θ1, ..., θD), defined on the support X = [0, 255].
Computationally intractable normalizer Z(θ) = ∫_{x∈X} e^{Pθ(x)} dx.
Proposal distribution g(y) = 1/255, the uniform density on X.
f(x) ≤ Mθ g(x) for Mθ = 255 exp(max_{x∈X} Pθ(x)).
max_{x∈X} Pθ(x) ≤ Σ_{i=1}^D max_{x∈X} θi x^i = Σ_{i=1}^D (θi 255^i)⁺.
In particular, Mθ = 255 when θi ≤ 0 for all i.
Draw u ∼ Unif(0, 1) and y ∼ Unif(0, 255). If u < f(y)/(Mθ g(y)),
accept y as a variate of f(x); otherwise reject it.
Expected Mθ iterations for drawing a variate of f(x).
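The rejection sampler above can be sketched as follows (a minimal sketch; the function name is illustrative, and the slide's upper bound Σ(θi 255^i)⁺ is used as the envelope exponent so the acceptance test u < f(y)/(Mθ g(y)) simplifies to u < exp(Pθ(y) − bound)):

```python
import numpy as np

def sample_pef(theta, n, rng=None):
    """Acceptance-rejection sampling from f(x) ∝ exp(P_theta(x)) on X = [0, 255],
    with uniform proposal g(y) = 1/255 and envelope constant
    M_theta = 255 * exp(bound), bound = sum_i (theta_i * 255^i)^+."""
    rng = rng or np.random.default_rng()
    theta = list(theta)

    def P(x):
        # P_theta(x) = sum_{i=1}^D theta_i x^i
        return sum(t * x ** (i + 1) for i, t in enumerate(theta))

    bound = sum(max(t * 255.0 ** (i + 1), 0.0) for i, t in enumerate(theta))
    out = []
    while len(out) < n:
        y = rng.uniform(0.0, 255.0)
        # accept with probability f(y) / (M_theta g(y)) = exp(P(y) - bound)
        if rng.uniform(0.0, 1.0) < np.exp(P(y) - bound):
            out.append(y)
    return np.array(out)
```

For θ = (−0.01,), all coefficients are non-positive, so bound = 0 and Mθ = 255: the sampler draws from a truncated exponential on [0, 255].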