Classification with mixtures of
curved Mahalanobis metrics
— or LMNN in Cayley-Klein geometries —
arXiv:1609.07082
Frank Nielsen1,2 Boris Muzellec1 Richard Nock3,4,5
1 Ecole Polytechnique, France  2 Sony CSL, Japan  3 Data61, Australia  4 ANU, Australia  5 The University of Sydney, Australia
26th September 2016
Mahalanobis distances
For Q ≻ 0, a symmetric positive-definite matrix like a covariance matrix, define the Mahalanobis distance:

D_Q(p, q) = √((p − q)ᵀ Q (p − q))

Metric distance (identity of indiscernibles / symmetry / triangle inequality)
E.g., Q = precision matrix Σ⁻¹, where Σ = covariance matrix
Generalizes the Euclidean distance when Q = I: D_I(p, q) = ‖p − q‖
Mahalanobis distance interpreted as a Euclidean distance after the Cholesky decomposition Q = LᵀL and the affine transformation x ← Lx:

D_Q(p, q) = D_I(Lp, Lq) = ‖Lp − Lq‖
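The two expressions above agree; a minimal NumPy sketch (the matrix Q and the points below are arbitrary illustrations):

```python
import numpy as np

def mahalanobis(p, q, Q):
    """D_Q(p, q) = sqrt((p - q)^T Q (p - q)) for a symmetric positive-definite Q."""
    d = p - q
    return float(np.sqrt(d @ Q @ d))

def mahalanobis_via_cholesky(p, q, Q):
    """Same distance, as a Euclidean distance after the affine map x <- L x,
    where Q = L^T L comes from the Cholesky factorization."""
    # np.linalg.cholesky returns lower-triangular C with Q = C C^T, so L = C^T
    L = np.linalg.cholesky(Q).T
    return float(np.linalg.norm(L @ p - L @ q))

Q = np.array([[2.0, 0.5], [0.5, 1.0]])
p = np.array([1.0, 2.0])
q = np.array([3.0, 0.0])
assert abs(mahalanobis(p, q, Q) - mahalanobis_via_cholesky(p, q, Q)) < 1e-12
```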
Generalizing Mahalanobis distances with Cayley-Klein projective geometries
+
Learning in Cayley-Klein spaces
Cayley-Klein geometry: Projective geometry [7, 3]
Real projective space RP^d: (λx, λ) ∼ (x, 1)
Homogeneous coordinates x → x̃ = (x, w = 1), and dehomogenization by "perspective division" x̃ → x/w
The cross-ratio measure is invariant under projectivities (homographies/collineations):

(p, q; P, Q) = (|p − P| |q − Q|) / (|p − Q| |q − P|)

where p, q, P, Q are collinear
(Figure: four collinear points P, p, q, Q on a line)
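The invariance can be checked numerically; a sketch with four collinear sample points and an arbitrary invertible homography H (all values below are hypothetical illustrations):

```python
import numpy as np

def cross_ratio(p, q, P, Q):
    """(p, q; P, Q) = (|p-P| |q-Q|) / (|p-Q| |q-P|) for collinear points."""
    n = np.linalg.norm
    return (n(p - P) * n(q - Q)) / (n(p - Q) * n(q - P))

def apply_homography(H, x):
    """Map a point through a projectivity in homogeneous coordinates."""
    xh = H @ np.append(x, 1.0)
    return xh[:-1] / xh[-1]            # perspective division

# four collinear points on the line t -> (t, 2t + 1)
pts = [np.array([t, 2.0 * t + 1.0]) for t in (0.0, 1.0, 3.0, 6.0)]
H = np.array([[1.0, 0.3, 0.2],
              [0.1, 1.2, -0.4],
              [0.05, 0.02, 1.0]])      # an arbitrary invertible 3x3 homography
before = cross_ratio(*pts)
after = cross_ratio(*[apply_homography(H, x) for x in pts])
assert abs(before - after) < 1e-9      # the cross-ratio survives the homography
```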
Definition of Cayley-Klein geometries
A Cayley-Klein geometry is K = (F, c_dist, c_angle):
1. A fundamental conic F
2. A constant unit c_dist ∈ ℂ for measuring distances
3. A constant unit c_angle ∈ ℂ for measuring angles
See the monograph [7]
Distance in Cayley-Klein geometries

dist(p, q) = c_dist Log((p, q; P, Q))

where P and Q are the intersection points of the line l = (pq) (l̃ = p̃ × q̃ in 2D) with the conic.
Log is the principal complex logarithm (modulo 2πi)
(Figure: the line l through p and q meets the conic F at P and Q)
Key properties of Cayley-Klein distances
dist(p, p) = 0 (law of indiscernibles)
Signed distances: dist(p, q) = −dist(q, p)
When p, q, r are collinear: dist(p, q) = dist(p, r) + dist(r, q)
Geodesics in Cayley-Klein geometries are straight lines (possibly clipped within the conic domain)
The logarithm transfers the multiplicative property of the cross-ratio to the additive property of Cayley-Klein distances. When r is collinear with p, q, P, Q:

(p, q; P, Q) = (p, r; P, Q) · (r, q; P, Q)
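The transfer from multiplicative cross-ratios to additive distances can be checked in one dimension, taking the conic points P = −1, Q = +1 of the unit-curvature Klein model (the 1/2 factor is that model's normalization):

```python
import math

def cross_ratio_1d(p, q, P, Q):
    """(p, q; P, Q) on the real line (signed)."""
    return ((p - P) * (q - Q)) / ((p - Q) * (q - P))

def klein_dist(p, q):
    """Klein-model hyperbolic distance on (-1, 1): (1/2) |log (p, q; -1, +1)|."""
    return 0.5 * abs(math.log(cross_ratio_1d(p, q, -1.0, 1.0)))

p, r, q = -0.5, 0.1, 0.8   # three collinear points, r between p and q
# multiplicativity of the cross-ratio ...
lhs = cross_ratio_1d(p, q, -1.0, 1.0)
rhs = cross_ratio_1d(p, r, -1.0, 1.0) * cross_ratio_1d(r, q, -1.0, 1.0)
assert abs(lhs - rhs) < 1e-12
# ... becomes additivity of the distance along the geodesic
assert abs(klein_dist(p, q) - (klein_dist(p, r) + klein_dist(r, q))) < 1e-12
```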
Dual conics
In projective geometry, points and lines are dual concepts
Dual parameterizations of the fundamental conic F = (A, A^∆)
Quadratic form Q_A(x) = x̃ᵀ A x̃
Primal conic = set of border points: C_A = {p̃ : Q_A(p̃) = 0}
Dual conic = set of tangent hyperplanes: C*_A = {l̃ : Q_{A^∆}(l̃) = 0}
A^∆ = |A| A⁻¹ is the adjoint (adjugate) matrix
The adjoint can be computed even when A is not invertible (|A| = 0)
Taxonomy
Signature of a matrix = signs of the eigenvalues in its eigendecomposition

Type                   A          A^∆        Conic
Elliptic               (+, +, +)  (+, +, +)  non-degenerate complex conic
Hyperbolic             (+, +, −)  (+, +, −)  non-degenerate real conic
Dual Euclidean         (+, +, 0)  (+, +, 0)  two complex lines with a real intersection point
Dual pseudo-Euclidean  (+, −, 0)  (+, 0, 0)  two real lines with a double real intersection point
Euclidean              (+, 0, 0)  (+, +, 0)  two complex points with a double real line passing through them
Pseudo-Euclidean       (+, 0, 0)  (+, −, 0)  two real points with a double real line passing through them
Galilean               (+, 0, 0)  (+, 0, 0)  a double real line with a real intersection point

Degenerate cases are obtained as limits of non-degenerate cases.
Measurements can be elliptic, hyperbolic or parabolic (degenerate case).
Real CK distances without cross-ratio expressions
For real Cayley-Klein measures, we choose the constants (κ is the curvature):
Elliptic (κ > 0): c_dist = κ/(2i)
Hyperbolic (κ < 0): c_dist = −κ/2
Bilinear form: S_pq = p̃ᵀ S q̃ with p̃ = (p, 1)
Get rid of the cross-ratio using:

(p, q; P, Q) = (S_pq + √(S_pq² − S_pp S_qq)) / (S_pq − √(S_pq² − S_pp S_qq))
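A sanity check of this cross-ratio-free expression in one dimension, with the conic x² − 1 = 0 (so S = diag(1, −1)); note that swapping the labels of P and Q yields the reciprocal value, which only flips the sign of the log:

```python
import math

def S_bilinear(x, y):
    """S_xy = x~^T S y~ for S = diag(1, -1), i.e. the 1D conic x^2 - 1 = 0."""
    return x * y - 1.0

def cross_ratio_from_S(p, q):
    """(p, q; P, Q) without computing the conic intersections P, Q."""
    Spq, Spp, Sqq = S_bilinear(p, q), S_bilinear(p, p), S_bilinear(q, q)
    disc = math.sqrt(Spq * Spq - Spp * Sqq)
    return (Spq + disc) / (Spq - disc)

def cross_ratio_direct(p, q, P, Q):
    return ((p - P) * (q - Q)) / ((p - Q) * (q - P))

p, q = -0.5, 0.8
# the conic x^2 - 1 = 0 meets the line at P = -1, Q = +1
assert abs(cross_ratio_from_S(p, q) - cross_ratio_direct(p, q, -1.0, 1.0)) < 1e-12
```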
Elliptic Cayley-Klein metric distance

d_E(p, q) = (κ/(2i)) Log( (S_pq + √(S_pq² − S_pp S_qq)) / (S_pq − √(S_pq² − S_pp S_qq)) )

d_E(p, q) = κ arccos( S_pq / √(S_pp S_qq) )

Notice that d_E(p, q) < κπ; the domain is D_S = R^d in the elliptic case.
(Figure: gnomonic projection mapping x, y to sphere points x′, y′; d_E(x, y) = κ · arccos(⟨x′, y′⟩))
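A sketch of the closed form (clamping the arccos argument against round-off); with S = I the distance is the angle between the lifted points, matching the gnomonic-projection picture:

```python
import math
import numpy as np

def elliptic_ck_dist(p, q, S, kappa=1.0):
    """d_E(p, q) = kappa * arccos( S_pq / sqrt(S_pp S_qq) ), S positive definite."""
    pt, qt = np.append(p, 1.0), np.append(q, 1.0)     # homogeneous lift
    Spq, Spp, Sqq = pt @ S @ qt, pt @ S @ pt, qt @ S @ qt
    c = Spq / math.sqrt(Spp * Sqq)
    return kappa * math.acos(max(-1.0, min(1.0, c)))  # clamp for numerical safety

# with S = I the distance is the spherical angle between the lifted points:
# (1, 0, 1) and (0, 1, 1) have cos = 1/2, hence distance pi/3 (and pi/3 < kappa*pi)
S = np.eye(3)
p, q = np.array([1.0, 0.0]), np.array([0.0, 1.0])
assert abs(elliptic_ck_dist(p, q, S) - math.pi / 3) < 1e-12
```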
Hyperbolic Cayley-Klein distance
When p, q ∈ D_S := {p : S_pp < 0}, the hyperbolic domain:

d_H(p, q) = −(κ/2) log( (S_pq + √(S_pq² − S_pp S_qq)) / (S_pq − √(S_pq² − S_pp S_qq)) )

d_H(p, q) = −κ arctanh( √(1 − S_pp S_qq / S_pq²) )

d_H(p, q) = −κ arccosh( S_pq / √(S_pp S_qq) )

with arccosh(x) = log(x + √(x² − 1)) and arctanh(x) = (1/2) log((1 + x)/(1 − x)).
Curvature κ < 0
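A sketch of the arccosh form in the canonical case S = diag(1, ..., 1, −1), so that S_pq = ⟨p, q⟩ − 1 and the domain is the open unit ball; it is cross-checked in 1D against the log-of-cross-ratio definition with c_dist = 1/2:

```python
import math

def hyperbolic_ck_dist(p, q):
    """Canonical hyperbolic Cayley-Klein distance for S = diag(1, ..., 1, -1),
    valid inside the open unit ball (where S_pp < 0)."""
    dot = sum(a * b for a, b in zip(p, q))
    pp = sum(a * a for a in p)
    qq = sum(a * a for a in q)
    arg = (1.0 - dot) / math.sqrt((1.0 - pp) * (1.0 - qq))
    return math.acosh(max(1.0, arg))   # clamp guards against round-off below 1

# cross-check against dist = c_dist * log(cross-ratio) in 1D, where the conic
# x^2 - 1 = 0 meets the line at P = -1, Q = +1
p, q = [0.0], [0.6]
cr = ((p[0] + 1.0) * (q[0] - 1.0)) / ((p[0] - 1.0) * (q[0] + 1.0))
assert abs(hyperbolic_ck_dist(p, q) - 0.5 * abs(math.log(cr))) < 1e-12
```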
Decomposition of the bilinear form [1]
Write S = [Σ a; aᵀ b] = S_{Σ,a,b} with Σ ≻ 0.

S_pq = p̃ᵀ S q̃ = pᵀΣq + pᵀa + aᵀq + b

Let µ = −Σ⁻¹a ∈ R^d (a = −Σµ) and b = µᵀΣµ + sign(κ)/κ², with

κ = (b − µᵀΣµ)^(−1/2) if b > µᵀΣµ,   κ = −(µᵀΣµ − b)^(−1/2) if b < µᵀΣµ

Then the bilinear form writes as:

S(p, q) = S_{Σ,µ,κ}(p, q) = (p − µ)ᵀ Σ (q − µ) + sign(κ)/κ²
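The equality of the two forms of the bilinear form can be verified numerically (Σ, a, b below are arbitrary test values):

```python
import numpy as np

def bilinear_S(p, q, Sigma, a, b):
    """S(p, q) = p^T Sigma q + p^T a + a^T q + b (homogeneous bilinear form)."""
    return p @ Sigma @ q + p @ a + a @ q + b

def centered_form(p, q, Sigma, mu, kappa):
    """Equivalent centered form S_{Sigma,mu,kappa}(p, q)."""
    return (p - mu) @ Sigma @ (q - mu) + np.sign(kappa) / kappa**2

Sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
a = np.array([0.4, -0.2])
b = 1.5
mu = -np.linalg.solve(Sigma, a)                  # mu = -Sigma^{-1} a
c = b - mu @ Sigma @ mu                          # sign(kappa) / kappa^2
kappa = c ** (-0.5) if c > 0 else -((-c) ** (-0.5))
p, q = np.array([0.7, -1.2]), np.array([0.1, 0.9])
assert abs(bilinear_S(p, q, Sigma, a, b) - centered_form(p, q, Sigma, mu, kappa)) < 1e-10
```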
Curved Mahalanobis metric distances
We have [1]:

lim_{κ→0+} D_{Σ,µ,κ}(p, q) = lim_{κ→0−} D_{Σ,µ,κ}(p, q) = D_Σ(p, q)

Mahalanobis distance: D_Σ(p, q) = D_{Σ,0,0}(p, q)
Thus hyperbolic/elliptic Cayley-Klein distances can be interpreted as curved Mahalanobis distances, or κ-Mahalanobis distances
When S = diag(1, 1, ..., 1, −1), we recover the canonical hyperbolic distance [5] in the Cayley-Klein model:

D_h(p, q) = arccosh( (1 − ⟨p, q⟩) / (√(1 − ⟨p, p⟩) √(1 − ⟨q, q⟩)) )

defined inside the interior of the unit ball.
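The flat limit can be observed numerically. Caveat: the sketch below assumes a 1/κ prefactor on the arccos, which is the normalization that makes the limit come out; prefactor conventions differ across write-ups:

```python
import math
import numpy as np

def curved_mahalanobis(p, q, Sigma, mu, kappa):
    """Elliptic (kappa > 0) curved Mahalanobis distance built from
    S(x, y) = (x - mu)^T Sigma (y - mu) + 1/kappa^2.
    NOTE: the 1/kappa scaling is an assumed normalization chosen so that
    the flat limit below holds."""
    M = lambda x, y: (x - mu) @ Sigma @ (y - mu) + 1.0 / kappa**2
    c = M(p, q) / math.sqrt(M(p, p) * M(q, q))
    return math.acos(max(-1.0, min(1.0, c))) / kappa

Sigma = np.eye(2)
mu = np.zeros(2)
p, q = np.array([1.0, 0.0]), np.array([0.0, 1.0])
flat = math.sqrt((p - q) @ Sigma @ (p - q))       # plain Mahalanobis distance
# as kappa -> 0 the curved distance approaches the flat one
assert abs(curved_mahalanobis(p, q, Sigma, mu, 1e-3) - flat) < 1e-4
```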
Cayley-Klein bisectors are affine
Bisector Bi(p, q):

Bi(p, q) = {x ∈ D_S : dist_S(p, x) = dist_S(x, q)}

Since arccos and arccosh are monotone (one-to-one), equal distances amount to:

S(p, x)/√|S(p, p)| = S(q, x)/√|S(q, q)|

⟨x, √|S(p, p)| Σq − √|S(q, q)| Σp⟩ + √|S(p, p)| (aᵀ(q + x) + b) − √|S(q, q)| (aᵀ(p + x) + b) = 0

Hyperplanes (restricted to the domain)
Cayley-Klein Voronoi diagrams are affine
Can be computed from equivalent (clipped) power diagrams [2, 5]
https://www.youtube.com/watch?v=YHJLq3-RL58
Cayley-Klein balls
(Figure legend: blue: Mahalanobis, red: elliptic, green: hyperbolic)
Cayley-Klein balls have Mahalanobis ball shapes with displaced centers
Learning curved
Mahalanobis metrics
Large Margin Nearest Neighbors [8], LMNN
Learn a Mahalanobis distance M = LᵀL ≻ 0 for a given input data-set P
Distances of each point to its target neighbors shrink, pull(L):
S = {(x_i, x_j) : y_i = y_j and x_j ∈ N(x_i)}
Keep a distance margin of each point to its impostors, push(L):
R = {(x_i, x_j, x_l) : (x_i, x_j) ∈ S and y_i ≠ y_l}
http://www.cs.cornell.edu/~kilian/code/lmnn/lmnn.html
LMNN: Cost function and optimization
Objective cost function [8]: convex and piecewise linear (SDP)

pull(L) = Σ_{i,i→j} ‖L(x_i − x_j)‖²
push(L) = Σ_{i,i→j} Σ_l (1 − y_il) [1 + ‖L(x_i − x_j)‖² − ‖L(x_i − x_l)‖²]₊
ℓ(L) = (1 − µ) pull(L) + µ push(L)

i → j: x_j is a target neighbor of x_i
y_il = 1 iff x_i and x_l have the same label, y_il = 0 otherwise.
µ set by cross-validation
Optimize by gradient descent: L_{t+1} = L_t − γ ∂ℓ(L_t)/∂L

∂ℓ/∂L = (1 − µ) Σ_{i,i→j} C_ij + µ Σ_{(i,j,l)∈R_t} (C_ij − C_il)

where C_ij = (x_i − x_j)(x_i − x_j)ᵀ
Easy: no projection mechanism needed, unlike for Mahalanobis Metric for Clustering (MMC) [9]
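The pull/push objective can be sketched directly with a naive O(n²) impostor scan (`targets`, a per-point list of target-neighbor indices, is an assumed input format):

```python
import numpy as np

def lmnn_cost(L, X, y, targets, mu=0.5):
    """(1 - mu) * pull(L) + mu * push(L) for the squared-Mahalanobis LMNN cost.
    targets[i] lists the target-neighbor indices of x_i (same class)."""
    pull, push = 0.0, 0.0
    for i in range(len(X)):
        for j in targets[i]:
            dij = float(np.sum((L @ (X[i] - X[j])) ** 2))
            pull += dij
            for l in range(len(X)):            # impostors: differently labeled points
                if y[l] != y[i]:
                    dil = float(np.sum((L @ (X[i] - X[l])) ** 2))
                    push += max(0.0, 1.0 + dij - dil)   # hinge on the unit margin
    return (1.0 - mu) * pull + mu * push

# toy 1D example: two close same-class points and one far impostor;
# the impostor is beyond the margin, so only the pull term contributes
X = np.array([[0.0], [0.1], [5.0]])
y = [0, 0, 1]
targets = {0: [1], 1: [0], 2: []}
assert abs(lmnn_cost(np.eye(1), X, y, targets) - 0.5 * 2 * 0.1**2) < 1e-12
```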
Elliptic Cayley-Klein LMNN [1], CVPR 2015

ℓ(L) = (1 − µ) Σ_{i,i→j} d_E(x_i, x_j) + µ Σ_{i,i→j} Σ_l (1 − y_il) ζ_ijl

with ζ_ijl = [1 + d_E(x_i, x_j) − d_E(x_i, x_l)]₊ (hinge loss)

∂ℓ(L)/∂L = (1 − µ) Σ_{i,i→j} ∂d_E(x_i, x_j)/∂L + µ Σ_{i,i→j} Σ_l (1 − y_il) ∂ζ_ijl/∂L

C_ij = (x_i, 1)(x_j, 1)ᵀ

∂d_E(x_i, x_j)/∂L = (k / √(S_ii S_jj − S_ij²)) L ( (S_ij/S_ii) C_ii + (S_ij/S_jj) C_jj − (C_ij + C_ji) )

∂ζ_ijl/∂L = ∂d_E(x_i, x_j)/∂L − ∂d_E(x_i, x_l)/∂L if ζ_ijl ≥ 0, and 0 otherwise.
Hyperbolic Cayley-Klein LMNN (new case)
To ensure S keeps the correct signature (1, d, 0) during the LMNN gradient descent, we decompose S = Lᵀ D L (with L ≻ 0) and perform a gradient descent on L with the following gradient:

∂d_H(x_i, x_j)/∂L = (k / √(S_ij² − S_ii S_jj)) D L ( (S_ij/S_ii) C_ii + (S_ij/S_jj) C_jj − (C_ij + C_ji) )

Recall two difficulties of the hyperbolic case compared to the elliptic case:
The hyperbolic Cayley-Klein distance may be very large (unbounded vs. < κπ for the elliptic case)
The data-set should be contained inside the compact domain D_S
HCK-LMNN: Initialization and learning rate
Initialize L = [L′ 0; 0 1] (block-diagonal) and D so that P ⊂ D_S, with Σ⁻¹ = L′ᵀL′ (e.g., the precision matrix of P):

D = diag(−1, ..., −1, κ max_x ‖L′x‖²) with κ > 1.

At iteration t, it may happen that P ⊄ D_{S_t} since we do not know the optimal learning rate γ. When this happens, we reduce γ ← γ/2; otherwise we let γ ← 1.01γ.
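The step-size rule is simple enough to state as code (a sketch; `dataset_in_domain` stands for the check P ⊂ D_S after the update):

```python
def adapt_learning_rate(gamma, dataset_in_domain):
    """Step-size rule from the slide: halve gamma when the update pushed some
    point of P outside the domain D_S, otherwise grow it gently by 1%."""
    return gamma / 2.0 if not dataset_in_domain else 1.01 * gamma

gamma = 0.1
gamma = adapt_learning_rate(gamma, dataset_in_domain=False)  # a point left D_S
assert abs(gamma - 0.05) < 1e-15
gamma = adapt_learning_rate(gamma, dataset_in_domain=True)   # all points inside
assert abs(gamma - 0.0505) < 1e-15
```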
Curved Mahalanobis learning: Results
Experimental results on some UCI data-sets (classification accuracy):

k   Data-set   Elliptic   Hyperbolic   Mahalanobis
1   wine       0.989      0.865        0.984
    vowel      0.832      0.797        0.827
    balance    0.924      0.891        0.846
    pima       0.726      0.706        0.709
3   wine       0.983      0.871        0.984
    vowel      0.828      0.782        0.827
    balance    0.917      0.911        0.846
    pima       0.706      0.695        0.709
5   wine       0.983                   0.984
    vowel      0.826      0.805        0.827
    balance    0.907      0.895        0.846
    pima       0.714      0.712        0.709
11  wine       0.994      0.983        0.984
    vowel      0.839      0.767        0.827
    balance    0.874      0.897        0.846
    pima       0.713      0.698        0.709

For classification, it is enough to consider κ ∈ {−1, 0, +1}
Spectral decomposition and fast proximity queries
Avoid computing d_E or d_H for arbitrary S
Apply a spectral decomposition (elliptic case S = LᵀL, hyperbolic case S = LᵀDL) and perform changes of coordinates so that we consider the canonical metric distances:

d_E(x′, y′) = arccos( ⟨x′, y′⟩ / (‖x′‖ ‖y′‖) )
d_H(x′, y′) = arccosh( (1 − ⟨x′, y′⟩) / (√(1 − ⟨x′, x′⟩) √(1 − ⟨y′, y′⟩)) )

Proximity queries: e.g., vantage-point tree data structures [10, 6] (with metric pruning).
Mixed curved Mahalanobis distance

d(x, y) = α d_E(x, y) + (1 − α) d_H(x, y)

1. A sum of Riemannian metric distances is a metric ("blending" positive with negative constant curvatures)
2. A mix of a bounded distance (elliptic CK) with an unbounded distance (hyperbolic CK); the hyperparameter α is tuned

Dataset   Mahalanobis   Elliptic   Hyperbolic   Mixed   α       β = 1 − α
Wine      0.993         0.984      0.893        0.986   0.741   0.259
Sonar     0.733         0.788      0.640        0.802   0.794   0.206
Balance   0.846         0.910      0.904        0.920   0.440   0.560
Pima      0.709         0.712      0.699        0.720   0.584   0.416
Vowel     0.827         0.825      0.816        0.841   0.407   0.593

Although the mixed CK distance is a Riemannian metric distance, it is not of constant curvature.
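A sketch of the mixture in one dimension, using the canonical elliptic and hyperbolic distances on their common domain (−1, 1), with a spot-check of the metric axioms:

```python
import math

def d_ell(p, q):
    """Canonical 1D elliptic CK distance (angle between the lifted points)."""
    c = (p * q + 1.0) / math.sqrt((p * p + 1.0) * (q * q + 1.0))
    return math.acos(max(-1.0, min(1.0, c)))   # clamp against round-off

def d_hyp(p, q):
    """Canonical 1D hyperbolic CK distance inside (-1, 1)."""
    c = (1.0 - p * q) / math.sqrt((1.0 - p * p) * (1.0 - q * q))
    return math.acosh(max(1.0, c))             # clamp against round-off

def d_mixed(p, q, alpha=0.4):
    """alpha * d_E + (1 - alpha) * d_H: a sum of metric distances is a metric."""
    return alpha * d_ell(p, q) + (1.0 - alpha) * d_hyp(p, q)

# spot-check symmetry and the triangle inequality on the common domain (-1, 1)
pts = [-0.7, -0.2, 0.3, 0.8]
for a in pts:
    for b in pts:
        assert abs(d_mixed(a, b) - d_mixed(b, a)) < 1e-12
        for c in pts:
            assert d_mixed(a, b) <= d_mixed(a, c) + d_mixed(c, b) + 1e-12
```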
Conclusion
Contributions and perspectives
Study of Cayley-Klein elliptic/hyperbolic geometries:
Affine bisector, Voronoi diagrams from (clipped) power
diagrams, Cayley-Klein balls (Mahalanobis shapes with
displaced centers), etc.
Classification with Large Margin Nearest Neighbor (LMNN) in
Cayley-Klein elliptic/hyperbolic geometries
(hyperbolic geometry: compact domain & unbounded
distance)
Experiments on mixed Cayley-Klein distances
Ongoing work:
Extensions of Cayley-Klein geometries to Machine Learning
Thank you!
https://www.lix.polytechnique.fr/~nielsen/CayleyKlein/
10.1109/ICIP.2016.7532355
arXiv:1609.07082
Yanhong Bi, Bin Fan, and Fuchao Wu.
Beyond Mahalanobis metric: Cayley-Klein metric learning.
In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
Jean-Daniel Boissonnat, Frank Nielsen, and Richard Nock.
Bregman Voronoi diagrams.
Discrete & Computational Geometry, 44(2):281–307, 2010.
C. Gunn.
Geometry, Kinematics, and Rigid Body Mechanics in Cayley-Klein Geometries.
PhD thesis, Technische Universität Berlin, 2011.
Frank Nielsen, Boris Muzellec, and Richard Nock.
Classification with mixtures of curved Mahalanobis metrics.
In IEEE International Conference on Image Processing (ICIP), pages 241–245, Sept 2016.
Frank Nielsen and Richard Nock.
Hyperbolic Voronoi diagrams made easy.
In IEEE International Conference on Computational Science and Its Applications (ICCSA), pages
74–80, 2010.
Frank Nielsen, Paolo Piro, and Michel Barlaud.
Bregman vantage point trees for efficient nearest neighbor queries.
In 2009 IEEE International Conference on Multimedia and Expo, pages 878–881. IEEE, 2009.
Jürgen Richter-Gebert.
Perspectives on Projective Geometry: A Guided Tour Through Real and Complex Geometry.
Springer Publishing Company, Incorporated, 1st edition, 2011.
Kilian Q. Weinberger, John Blitzer, and Lawrence K. Saul.
Distance metric learning for large margin nearest neighbor classification.
In Advances in Neural Information Processing Systems (NIPS). MIT Press, 2006.
Eric P. Xing, Andrew Y. Ng, Michael I. Jordan, and Stuart Russell.
Distance metric learning, with application to clustering with side-information.
In Advances in Neural Information Processing Systems 15, pages 505–512. MIT Press, 2003.
Peter N. Yianilos.
Data structures and algorithms for nearest neighbor search in general metric spaces.
In Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’93,
pages 311–321, Philadelphia, PA, USA, 1993. Society for Industrial and Applied Mathematics.
Overview
Review of Mahalanobis distances
Basics of Cayley-Klein geometry
Distance from cross-ratio measures
Distance expressions
Dual conics
Cayley-Klein distances as curved Malahanobis distances
Computational geometry in Cayley-Klein geometries
Learning curved Mahalanobis metrics
Large Margin Nearest Neighbors (LMNN)
Elliptic Cayley-Klein LMNN
Hyperbolic Cayley-Klein LMNN
Experimental results
Nearest-neighbor classification in Cayley-Klein geometries
Mixed curved Mahalanobis distance
Contributions and perspectives
Bibliography
Supplemental information
Properties of the cross-ratio
(p, p; P, Q) = 1
(p, q; Q, P) = 1/(p, q; P, Q)
(p, q; P, Q) = (p, r; P, Q) · (r, q; P, Q) when r is collinear with p, q, P, Q
(Figure: collinear points P, p, r, q, Q on a line)
Measuring angles in Cayley-Klein geometries

angle(l, m) = c_angle log((l, m; L, M))

where L and M are the tangent lines to A passing through the intersection point p (p̃ = l̃ × m̃ in 2D) of l and m.
(Figure: lines l and m meet at p; L and M are the tangents to F through p)
Interpretation of hyperbolic Cayley-Klein distance

d_H(x, y) = κ arccosh( ⟨x′, y′⟩ / (‖x′‖ ‖y′‖) )

(Figure: points x, y lifted to x′, y′)
Cayley-Klein Voronoi diagrams from (clipped) power diagrams

c_i = (Σp_i + a) / (2 S_{p_i p_i})

r_i² = ‖Σp_i + a‖² / (4 S_{p_i p_i}²) + (aᵀp_i + b) / S_{p_i p_i}
Cayley-Klein balls have Mahalanobis ball shapes
Elliptic Cayley-Klein ball case:

Σ′ = r̃² Σ − a′a′ᵀ
c′ = Σ′⁻¹ (b′ a′ − r̃² a)
r′² = b′² − r̃² b + ⟨c′, c′⟩_{Σ′}

with

r̃ = √(S_{c,c}) cos(r)
a′ = Σc + a
b′ = aᵀc + b
Cayley-Klein balls have Mahalanobis ball shapes
Hyperbolic Cayley-Klein ball case:

Σ′ = a′a′ᵀ − r̃² Σ
c′ = Σ′⁻¹ (r̃² a − b′ a′)
r′² = r̃² b − b′² + ⟨c′, c′⟩_{Σ′}

with

r̃ = √(S_{c,c}) cosh(r)
a′ = Σc + a
b′ = aᵀc + b

... and drawing a Mahalanobis ball amounts to drawing a Euclidean ball after the affine transformation x ← Lx.
Spectral decomposition and signature
Eigenvalue decomposition: S = O Λ Oᵀ, with Λ = diag(Λ_{1,1}, ..., Λ_{d+1,d+1})
Canonical decomposition:

S = O D^{1/2} [I 0; 0 λ] D^{1/2} Oᵀ

where λ ∈ {−1, 1} and O is an orthogonal matrix (O⁻¹ = Oᵀ)
The diagonal matrix D has all positive entries, with D_{i,i} = Λ_{i,i} and D_{d+1,d+1} = |Λ_{d+1,d+1}|
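Both the taxonomy and this canonical decomposition start from the signature; a minimal sketch reading it off the eigenvalues:

```python
import numpy as np

def signature(S, tol=1e-10):
    """Signature (n_plus, n_minus, n_zero) of a symmetric matrix, read off the
    signs of the eigenvalues of its eigendecomposition."""
    w = np.linalg.eigvalsh(S)
    return (int(np.sum(w > tol)), int(np.sum(w < -tol)), int(np.sum(np.abs(w) <= tol)))

# the signatures driving the taxonomy: elliptic, hyperbolic, and a degenerate case
assert signature(np.diag([1.0, 2.0, 3.0])) == (3, 0, 0)   # elliptic
assert signature(np.diag([1.0, 1.0, -1.0])) == (2, 1, 0)  # hyperbolic
assert signature(np.diag([1.0, 1.0, 0.0])) == (2, 0, 1)   # degenerate
```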
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptMAESTRELLAMesa2
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 

Dernier (20)

Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
G9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.pptG9 Science Q4- Week 1-2 Projectile Motion.ppt
G9 Science Q4- Week 1-2 Projectile Motion.ppt
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 

Classification with mixtures of curved Mahalanobis metrics

(possibly clipped to the conic domain)
The logarithm transfers the multiplicative property of the cross-ratio to
the additive property of Cayley-Klein distances: when r lies on the line
through p, q, P, Q,
(p, q; P, Q) = (p, r; P, Q) · (r, q; P, Q)
7
Dual conics
In projective geometry, points and lines are dual concepts.
Dual parameterizations of the fundamental conic: F = (A, A∆)
Quadratic form: QA(x) = ˜xᵀ A ˜x
Primal conic = set of border points: CA = {˜p : QA(˜p) = 0}
Dual conic = set of tangent hyperplanes: C∗A = {˜l : QA∆(˜l) = 0}
A∆ = A⁻¹|A| is the adjoint matrix.
The adjoint can be computed even when A is not invertible (|A| = 0).
8
Taxonomy
Signature of a matrix = signs of the eigenvalues in its eigendecomposition.

Type                   A          A∆         Conic
Elliptic               (+, +, +)  (+, +, +)  non-degenerate complex conic
Hyperbolic             (+, +, −)  (+, +, −)  non-degenerate real conic
Dual Euclidean         (+, +, 0)  (+, 0, 0)  two complex lines with a real intersection point
Dual pseudo-Euclidean  (+, −, 0)  (+, 0, 0)  two real lines with a double real intersection point
Euclidean              (+, 0, 0)  (+, +, 0)  two complex points with a double real line passing through them
Pseudo-Euclidean       (+, 0, 0)  (+, −, 0)  two real points with a double real line passing through them
Galilean               (+, 0, 0)  (+, 0, 0)  double real line with a real intersection point

Degenerate cases are obtained as limits of non-degenerate cases.
Measurements can be elliptic, hyperbolic or parabolic (degenerate case).
9
Real CK distances without cross-ratio expressions
For real Cayley-Klein measures, we choose the constants (κ is the curvature):
Elliptic (κ > 0): cdist = κ/(2i)
Hyperbolic (κ < 0): cdist = −κ/2
Bilinear form: Spq = ˜pᵀ S ˜q, with ˜p = (p, 1)
Get rid of the cross-ratio using:
(p, q; P, Q) = (Spq + √(Spq² − Spp Sqq)) / (Spq − √(Spq² − Spp Sqq))
10
Elliptic Cayley-Klein metric distance
dE(p, q) = (κ/(2i)) Log((Spq + √(Spq² − Spp Sqq)) / (Spq − √(Spq² − Spp Sqq)))
dE(p, q) = κ arccos(Spq / √(Spp Sqq))
Notice that dE(p, q) < κπ; the domain is DS = Rd in the elliptic case.
Gnomonic projection: dE(x, y) = κ · arccos(⟨x′, y′⟩)
11
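As a quick sketch (pure Python; the matrix S and the sample points below are illustrative choices, not values from the slides), the elliptic Cayley-Klein distance can be evaluated directly from the bilinear form:

```python
import math

def bilinear(S, p, q):
    """S_pq = p~^T S q~ on homogeneous coordinates p~ = (p, 1)."""
    ph, qh = list(p) + [1.0], list(q) + [1.0]
    return sum(ph[i] * S[i][j] * qh[j]
               for i in range(len(ph)) for j in range(len(qh)))

def elliptic_ck(S, p, q, kappa=1.0):
    """d_E(p, q) = kappa * arccos(S_pq / sqrt(S_pp S_qq))."""
    r = bilinear(S, p, q) / math.sqrt(bilinear(S, p, p) * bilinear(S, q, q))
    return kappa * math.acos(max(-1.0, min(1.0, r)))  # clamp for round-off

# An arbitrary positive-definite S (3x3 homogeneous matrix for 2D points)
S = [[2.0, 0.3, 0.0],
     [0.3, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
p, q = [0.3, 0.1], [-0.2, 0.4]
```

The clamp guards `acos` against floating-point round-off when p = q; the distance is symmetric, vanishes on identical points, and stays below κπ as stated above.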
Hyperbolic Cayley-Klein distance
When p, q ∈ DS := {p : Spp < 0}, the hyperbolic domain:
dH(p, q) = −(κ/2) log((Spq + √(Spq² − Spp Sqq)) / (Spq − √(Spq² − Spp Sqq)))
dH(p, q) = −κ arctanh(√(1 − Spp Sqq / Spq²))
dH(p, q) = −κ arccosh(Spq / √(Spp Sqq))
with arccosh(x) = log(x + √(x² − 1)) and arctanh(x) = (1/2) log((1 + x)/(1 − x)).
Curvature κ < 0.
12
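A small numerical check, as a sketch, that the arctanh and arccosh expressions agree. Here S = diag(1, 1, −1) puts the domain at the open unit disk, and we take |S_pq| (an assumed convention) so that arccosh's argument is ≥ 1:

```python
import math

def bilinear(S, p, q):
    ph, qh = list(p) + [1.0], list(q) + [1.0]
    return sum(ph[i] * S[i][j] * qh[j]
               for i in range(len(ph)) for j in range(len(qh)))

# Hyperbolic type: domain D_S = open unit disk (S_pp < 0 there)
S = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, -1.0]]

def d_h_acosh(p, q, kappa=-1.0):
    spq, spp, sqq = bilinear(S, p, q), bilinear(S, p, p), bilinear(S, q, q)
    # spp, sqq < 0 inside the domain, so their product is positive
    return -kappa * math.acosh(max(1.0, abs(spq) / math.sqrt(spp * sqq)))

def d_h_atanh(p, q, kappa=-1.0):
    spq, spp, sqq = bilinear(S, p, q), bilinear(S, p, p), bilinear(S, q, q)
    return -kappa * math.atanh(math.sqrt(max(0.0, 1.0 - spp * sqq / spq ** 2)))

p, q = [0.1, 0.2], [-0.4, 0.3]
```

The two expressions coincide because arccosh(c) = arctanh(√(1 − 1/c²)) for c ≥ 1.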
Decomposition of the bilinear form [1]
Write S = [[Σ, a], [aᵀ, b]] = SΣ,a,b with Σ ≻ 0.
Sp,q = ˜pᵀ S ˜q = pᵀΣq + pᵀa + aᵀq + b
Let µ = −Σ⁻¹a ∈ Rd (so a = −Σµ) and b = µᵀΣµ + sign(κ)/κ², with
κ = 1/√(b − µᵀΣµ) if b > µᵀΣµ, and κ = −1/√(µᵀΣµ − b) if b < µᵀΣµ.
Then the bilinear form writes as:
S(p, q) = SΣ,µ,κ(p, q) = (p − µ)ᵀΣ(q − µ) + sign(κ)/κ²
13
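The decomposition can be checked numerically. A sketch, assuming Σ, µ, κ are given (illustrative values) and building the block matrix S_{Σ,a,b} explicitly:

```python
import math

# Illustrative parameters (not values from the slides)
Sigma = [[2.0, 0.3], [0.3, 1.0]]   # positive definite
mu, kappa = [0.5, -0.2], 0.8       # sign(kappa) = +1

# a = -Sigma mu,  b = mu^T Sigma mu + sign(kappa)/kappa^2
a = [-(Sigma[i][0] * mu[0] + Sigma[i][1] * mu[1]) for i in range(2)]
mu_Sigma_mu = sum(mu[i] * Sigma[i][j] * mu[j] for i in range(2) for j in range(2))
b = mu_Sigma_mu + math.copysign(1.0, kappa) / kappa ** 2

# Block matrix S = [[Sigma, a], [a^T, b]]
S = [[Sigma[0][0], Sigma[0][1], a[0]],
     [Sigma[1][0], Sigma[1][1], a[1]],
     [a[0], a[1], b]]

def S_homog(p, q):
    """p~^T S q~ with p~ = (p, 1)."""
    ph, qh = list(p) + [1.0], list(q) + [1.0]
    return sum(ph[i] * S[i][j] * qh[j] for i in range(3) for j in range(3))

def S_centered(p, q):
    """(p - mu)^T Sigma (q - mu) + sign(kappa)/kappa^2."""
    u = [p[i] - mu[i] for i in range(2)]
    v = [q[i] - mu[i] for i in range(2)]
    quad = sum(u[i] * Sigma[i][j] * v[j] for i in range(2) for j in range(2))
    return quad + math.copysign(1.0, kappa) / kappa ** 2

p, q = [0.3, 0.1], [-0.2, 0.4]
```

Expanding ˜pᵀS˜q with a = −Σµ makes the cross terms cancel, leaving the centered quadratic form plus the constant, which is what the test verifies.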
Curved Mahalanobis metric distances
We have [1]:
lim κ→0⁺ DΣ,µ,κ(p, q) = lim κ→0⁻ DΣ,µ,κ(p, q) = DΣ(p, q)
Mahalanobis distance: DΣ(p, q) = DΣ,0,0(p, q)
Thus hyperbolic/elliptic Cayley-Klein distances can be interpreted as curved
Mahalanobis distances, or κ-Mahalanobis distances.
When S = diag(1, 1, ..., 1, −1), we recover the canonical hyperbolic
distance [5] in the Cayley-Klein model:
Dh(p, q) = arccosh((1 − ⟨p, q⟩) / √((1 − ⟨p, p⟩)(1 − ⟨q, q⟩)))
defined inside the interior of the unit ball.
14
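The canonical case S = diag(1, ..., 1, −1) gives the Klein-ball distance above, which can be sketched and sanity-checked as a metric on a few points of the unit disk (the sample points are illustrative):

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def klein_distance(p, q):
    """D_h(p, q) = arccosh((1 - <p,q>) / sqrt((1 - <p,p>)(1 - <q,q>))),
    defined in the interior of the unit ball."""
    num = 1.0 - dot(p, q)
    den = math.sqrt((1.0 - dot(p, p)) * (1.0 - dot(q, q)))
    return math.acosh(max(1.0, num / den))  # clamp for round-off when p = q

p, q, r = [0.1, 0.2], [-0.3, 0.4], [0.5, 0.0]
```

Symmetry, identity of indiscernibles, and the triangle inequality all hold on these samples, as expected of the hyperbolic metric.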
Cayley-Klein bisectors are affine
Bisector: Bi(p, q) = {x ∈ DS : distS(p, x) = distS(x, q)}
Equivalently, S(p, x)/√|S(p, p)| = S(q, x)/√|S(q, q)|,
since arccos is monotonically decreasing and arccosh is monotonically
increasing, so equal distances amount to equal arguments.
Expanding gives a linear equation in x:
⟨x, √|S(p, p)| Σq − √|S(q, q)| Σp⟩ + √|S(p, p)| (aᵀ(q + x) + b) − √|S(q, q)| (aᵀ(p + x) + b) = 0
Bisectors are hyperplanes (restricted to the domain).
15
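A sketch (elliptic case with S = identity; the points are illustrative) checking that a point solving the linear bisector condition S(p,x)/√S(p,p) = S(q,x)/√S(q,q) is indeed equidistant from p and q:

```python
import math

def bilinear(S, p, q):
    ph, qh = list(p) + [1.0], list(q) + [1.0]
    return sum(ph[i] * S[i][j] * qh[j]
               for i in range(len(ph)) for j in range(len(qh)))

S = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # elliptic (identity)

def d_e(p, q):
    r = bilinear(S, p, q) / math.sqrt(bilinear(S, p, p) * bilinear(S, q, q))
    return math.acos(max(-1.0, min(1.0, r)))

p, q = [0.3, 0.1], [0.2, 0.4]
wp = 1.0 / math.sqrt(bilinear(S, p, p))
wq = 1.0 / math.sqrt(bilinear(S, q, q))

# Bisector hyperplane h^T x~ = 0 with h = S (p~/sqrt(S_pp) - q~/sqrt(S_qq))
ph, qh = list(p) + [1.0], list(q) + [1.0]
h = [sum(S[i][j] * (wp * ph[j] - wq * qh[j]) for j in range(3)) for i in range(3)]

# Pick a point on the bisector: fix x1 and solve h0*x0 + h1*x1 + h2 = 0
# (for these sample points, h[0] != 0)
x1 = 0.2
x = [-(h[1] * x1 + h[2]) / h[0], x1]
```

The factor √S(x,x) common to both sides cancels, which is why the condition is linear in the homogeneous coordinates of x.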
Cayley-Klein Voronoi diagrams are affine
They can be computed from equivalent (clipped) power diagrams [2, 5].
https://www.youtube.com/watch?v=YHJLq3-RL58
16
Cayley-Klein balls
Blue: Mahalanobis; red: elliptic; green: hyperbolic.
Cayley-Klein balls have Mahalanobis ball shapes with displaced centers.
17
Large Margin Nearest Neighbors (LMNN) [8]
Learn a Mahalanobis distance M = LᵀL ≻ 0 for a given input data-set P:
Shrink the distance of each point to its target neighbors, pull(L):
S = {(xi, xj) : yi = yj and xj ∈ N(xi)}
Keep a distance margin of each point to its impostors, push(L):
R = {(xi, xj, xl) : (xi, xj) ∈ S and yl ≠ yi}
http://www.cs.cornell.edu/~kilian/code/lmnn/lmnn.html
19
LMNN: Cost function and optimization
Objective cost function [8], convex and piecewise linear (SDP):
pull(L) = Σ_{i, i→j} ‖L(xi − xj)‖²
push(L) = Σ_{i, i→j} Σ_l (1 − yil) [1 + ‖L(xi − xj)‖² − ‖L(xi − xl)‖²]₊
ε(L) = (1 − µ) pull(L) + µ push(L)
i → j: xj is a target neighbor of xi.
yil = 1 iff xi and xl have the same label, yil = 0 otherwise.
µ is set by cross-validation.
Optimize by gradient descent: Lt+1 = Lt − γ ∂ε(Lt)/∂L, with
∂ε/∂L = (1 − µ) Σ_{i, i→j} Cij + µ Σ_{(i,j,l)∈Rt} (Cij − Cil)
where Cij = (xi − xj)(xi − xj)ᵀ.
Easy: no projection mechanism is needed, unlike for Mahalanobis Metric for
Clustering (MMC) [9].
20
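To make the cost and gradient concrete, here is a toy sketch of the flat (Mahalanobis) LMNN objective for one target pair and one impostor triplet, with the analytic gradient in terms of the outer products C_ij. The data points and weight µ are illustrative, not the authors' setup; the test checks the gradient against finite differences:

```python
# Toy data: target pair (x0, x1), impostor triplet (x0, x1, x2)
x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.2]]
mu = 0.5

def d2(M, a, b):
    """Squared Mahalanobis distance (a - b)^T M (a - b)."""
    v = [a[k] - b[k] for k in range(2)]
    return sum(v[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

def loss(M):
    pull = d2(M, x[0], x[1])
    push = max(0.0, 1.0 + d2(M, x[0], x[1]) - d2(M, x[0], x[2]))
    return (1.0 - mu) * pull + mu * push

def outer(a, b):
    v = [a[k] - b[k] for k in range(2)]
    return [[v[i] * v[j] for j in range(2)] for i in range(2)]

def grad(M):
    """(1 - mu) C_01 + mu (C_01 - C_02) when the hinge is active."""
    C01, C02 = outer(x[0], x[1]), outer(x[0], x[2])
    active = 1.0 + d2(M, x[0], x[1]) - d2(M, x[0], x[2]) > 0.0
    g = [[(1.0 - mu) * C01[i][j] for j in range(2)] for i in range(2)]
    if active:
        for i in range(2):
            for j in range(2):
                g[i][j] += mu * (C01[i][j] - C02[i][j])
    return g

M = [[1.0, 0.0], [0.0, 1.0]]
```

Since the loss is linear in the entries of M for a fixed hinge activity pattern, the analytic and finite-difference gradients agree up to floating point.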
Elliptic Cayley-Klein LMNN [1], CVPR 2015
ε(L) = (1 − µ) Σ_{i, i→j} dE(xi, xj) + µ Σ_{i, i→j} Σ_l (1 − yil) ζijl
with the hinge loss ζijl = [1 + dE(xi, xj) − dE(xi, xl)]₊
∂ε(L)/∂L = (1 − µ) Σ_{i, i→j} ∂dE(xi, xj)/∂L + µ Σ_{i, i→j} Σ_l (1 − yil) ∂ζijl/∂L
Cij = (xi, 1)(xj, 1)ᵀ
∂dE(xi, xj)/∂L = k/√(Sii Sjj − Sij²) L [ (Sij/Sii) Cii + (Sij/Sjj) Cjj − (Cij + Cji) ]
∂ζijl/∂L = ∂dE(xi, xj)/∂L − ∂dE(xi, xl)/∂L if ζijl ≥ 0, and 0 otherwise.
21
Hyperbolic Cayley-Klein LMNN (new case)
To ensure that S keeps the correct signature (1, d, 0) during the LMNN
gradient descent, we decompose S = LᵀDL (with L ≻ 0) and perform a gradient
descent on L with the following gradient:
∂dH(xi, xj)/∂L = k/√(Sij² − Sii Sjj) DL [ (Sij/Sii) Cii + (Sij/Sjj) Cjj − (Cij + Cji) ]
Recall two difficulties of the hyperbolic case compared to the elliptic case:
1. the hyperbolic Cayley-Klein distance may be very large (unbounded, vs. < κπ in the elliptic case);
2. the data-set should be contained inside the compact domain DS.
22
HCK-LMNN: Initialization and learning rate
Initialize L = L1 and D so that P ⊂ DS, with Σ⁻¹ = LᵀL (e.g., the precision
matrix of P):
D = diag(−1, ..., −1, κ max_{x∈P} ‖Lᵀx‖²) with κ > 1.
At iteration t it may happen that P ⊄ DSt, since we do not know the optimal
learning rate γ. When this happens we reduce γ ← γ/2; otherwise we let
γ ← 1.01 γ.
23
Curved Mahalanobis learning: Results
Experimental results (classification accuracy) on some UCI data-sets:

k   Data-set  Elliptic  Hyperbolic  Mahalanobis
1   wine      0.989     0.865       0.984
    vowel     0.832     0.797       0.827
    balance   0.924     0.891       0.846
    pima      0.726     0.706       0.709
3   wine      0.983     0.871       0.984
    vowel     0.828     0.782       0.827
    balance   0.917     0.911       0.846
    pima      0.706     0.695       0.709
5   wine      0.983     —           0.984
    vowel     0.826     0.805       0.827
    balance   0.907     0.895       0.846
    pima      0.714     0.712       0.709
11  wine      0.994     0.983       0.984
    vowel     0.839     0.767       0.827
    balance   0.874     0.897       0.846
    pima      0.713     0.698       0.709

For classification, it is enough to consider κ ∈ {−1, 0, +1}.
24
Spectral decomposition and fast proximity queries
Avoid computing dE or dH for an arbitrary S:
apply a spectral decomposition (elliptic case S = LᵀL, hyperbolic case
S = LᵀDL) and perform a change of coordinates so that we consider the
canonical metric distances:
dE(x′, y′) = arccos(⟨x′, y′⟩ / (‖x′‖ ‖y′‖))
dH(x′, y′) = arccosh((1 − ⟨x′, y′⟩) / √((1 − ⟨x′, x′⟩)(1 − ⟨y′, y′⟩)))
Proximity queries: e.g., vantage point tree data structures [10, 6] (with
metric pruning).
25
Mixed curved Mahalanobis distance
d(x, y) = α dE(x, y) + (1 − α) dH(x, y)
1. A sum of Riemannian metric distances is a metric (“blending” positive with
negative constant curvatures).
2. A mix of a bounded distance (elliptic CK) with an unbounded distance
(hyperbolic CK); the hyperparameter α is tuned.

Data-set  Mahalanobis  Elliptic  Hyperbolic  Mixed   α      β = 1 − α
Wine      0.993        0.984     0.893       0.986   0.741  0.259
Sonar     0.733        0.788     0.640       0.802   0.794  0.206
Balance   0.846        0.910     0.904       0.920   0.440  0.560
Pima      0.709        0.712     0.699       0.720   0.584  0.416
Vowel     0.827        0.825     0.816       0.841   0.407  0.593

Although the mixed CK distance is a Riemannian metric distance, it is not of
constant curvature.
26
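A sketch of the mixed distance on the unit disk, using the canonical elliptic and hyperbolic (Klein) formulas from the spectral-decomposition slide; the weight α and sample points are illustrative. Since each term is a metric, the triangle inequality survives the blend:

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def d_elliptic(x, y):
    """Canonical elliptic distance via the homogeneous lift (S = I)."""
    num = dot(x, y) + 1.0
    den = math.sqrt((dot(x, x) + 1.0) * (dot(y, y) + 1.0))
    return math.acos(max(-1.0, min(1.0, num / den)))

def d_hyperbolic(x, y):
    """Canonical Klein-ball distance (S = diag(1, ..., 1, -1))."""
    num = 1.0 - dot(x, y)
    den = math.sqrt((1.0 - dot(x, x)) * (1.0 - dot(y, y)))
    return math.acosh(max(1.0, num / den))

def d_mixed(x, y, alpha=0.44):
    return alpha * d_elliptic(x, y) + (1.0 - alpha) * d_hyperbolic(x, y)

p, q, r = [0.1, 0.2], [-0.3, 0.4], [0.5, 0.0]
```

In practice α would be tuned by cross-validation, as the accuracy table above suggests.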
Contributions and perspectives
Study of Cayley-Klein elliptic/hyperbolic geometries: affine bisectors,
Voronoi diagrams from (clipped) power diagrams, Cayley-Klein balls
(Mahalanobis shapes with displaced centers), etc.
Classification with Large Margin Nearest Neighbor (LMNN) in Cayley-Klein
elliptic/hyperbolic geometries (hyperbolic geometry: compact domain and
unbounded distance).
Experiments on mixed Cayley-Klein distances.
Ongoing work: extensions of Cayley-Klein geometries to machine learning.
28
References
[1] Yanhong Bi, Bin Fan, and Fuchao Wu. Beyond Mahalanobis metric: Cayley-Klein metric learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
[2] Jean-Daniel Boissonnat, Frank Nielsen, and Richard Nock. Bregman Voronoi diagrams. Discrete & Computational Geometry, 44(2):281–307, 2010.
[3] C. Gunn. Geometry, Kinematics, and Rigid Body Mechanics in Cayley-Klein Geometries. PhD thesis, Technische Universität Berlin, 2011.
[4] Frank Nielsen, Boris Muzellec, and Richard Nock. Classification with mixtures of curved Mahalanobis metrics. In IEEE International Conference on Image Processing (ICIP), pages 241–245, September 2016.
[5] Frank Nielsen and Richard Nock. Hyperbolic Voronoi diagrams made easy. In IEEE International Conference on Computational Science and Its Applications (ICCSA), pages 74–80, 2010.
[6] Frank Nielsen, Paolo Piro, and Michel Barlaud. Bregman vantage point trees for efficient nearest neighbor queries. In IEEE International Conference on Multimedia and Expo, pages 878–881, 2009.
[7] Jürgen Richter-Gebert. Perspectives on Projective Geometry: A Guided Tour Through Real and Complex Geometry. Springer, 1st edition, 2011.
[8] Kilian Q. Weinberger, John Blitzer, and Lawrence K. Saul. Distance metric learning for large margin nearest neighbor classification. In Advances in Neural Information Processing Systems (NIPS), 2006.
[9] Eric P. Xing, Andrew Y. Ng, Michael I. Jordan, and Stuart Russell. Distance metric learning, with application to clustering with side-information. In Advances in Neural Information Processing Systems 15, pages 505–512, 2003.
[10] Peter N. Yianilos. Data structures and algorithms for nearest neighbor search in general metric spaces. In Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 311–321, 1993.
31
Overview
Review of Mahalanobis distances
Basics of Cayley-Klein geometry: distance from cross-ratio measures,
distance expressions, dual conics
Cayley-Klein distances as curved Mahalanobis distances
Computational geometry in Cayley-Klein geometries
Learning curved Mahalanobis metrics: Large Margin Nearest Neighbors (LMNN),
elliptic Cayley-Klein LMNN, hyperbolic Cayley-Klein LMNN, experimental results
Nearest-neighbor classification in Cayley-Klein geometries: mixed curved
Mahalanobis distance
Contributions and perspectives
Bibliography
Supplemental information
32
Properties of the cross-ratio
(p, p; P, P) = 1
(p, q; Q, P) = 1/(p, q; P, Q)
(p, q; P, Q) = (p, r; P, Q) · (r, q; P, Q) when r is collinear with p, q, P, Q
33
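For collinear points we can parameterize the line by a real coordinate, and the properties become one-line checks (the coordinates are illustrative; signed differences are used rather than absolute values, the standard convention that makes the multiplicative property exact):

```python
def cross_ratio(p, q, P, Q):
    """(p, q; P, Q) on a line parameterized by a real coordinate."""
    return ((p - P) * (q - Q)) / ((p - Q) * (q - P))

# Five collinear points given by their line parameters
p, r, q, P, Q = 0.2, 0.5, 0.8, -1.0, 2.0
```

The inversion and multiplicativity properties follow directly from the factorization of the defining ratio.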
Measuring angles in Cayley-Klein geometries
angle(l, m) = cangle Log((l, m; L, M))
where L and M are the tangent lines to A passing through the intersection
point p of l and m (p = l × m in 2D).
34
Interpretation of hyperbolic Cayley-Klein distance
dH(x, y) = κ arccosh(⟨x′, y′⟩ / (‖x′‖ ‖y′‖))
(x′, y′: lifted points, analogous to the gnomonic-projection picture of the
elliptic case)
35
Cayley-Klein Voronoi diagrams from (clipped) power diagrams
ci = (Σpi + a) / (2 Spi,pi)
ri² = ‖Σpi + a‖² / (4 Spi,pi²) + (aᵀpi + b) / Spi,pi
36
Cayley-Klein balls have Mahalanobis ball shapes
Elliptic Cayley-Klein ball case:
Σ′ = r̃² Σ − a′a′ᵀ
c′ = Σ′⁻¹ (b′a′ − r̃² a)
r′² = b′² − r̃² b + ⟨c′, c′⟩Σ′
with r̃ = Sc,c cos(r), a′ = Σc + a, b′ = aᵀc + b
37
Cayley-Klein balls have Mahalanobis ball shapes
Hyperbolic Cayley-Klein ball case:
Σ′ = a′a′ᵀ − r̃² Σ
c′ = Σ′⁻¹ (r̃² a − b′a′)
r′² = r̃² b − b′² + ⟨c′, c′⟩Σ′
with r̃ = Sc,c cosh(r), a′ = Σc + a, b′ = aᵀc + b
... and drawing a Mahalanobis ball amounts to drawing a Euclidean ball after
the affine transformation x ← Lᵀx.
38
Spectral decomposition and signature
Eigenvalue decomposition: S = OΛOᵀ with Λ = diag(Λ1,1, ..., Λd+1,d+1).
Canonical decomposition:
S = O D^{1/2} diag(1, ..., 1, λ) D^{1/2} Oᵀ
where λ ∈ {−1, 1} and O is an orthogonal matrix (O⁻¹ = Oᵀ).
The diagonal matrix D has all-positive entries, with Di,i = Λi,i and
Dd+1,d+1 = |Λd+1,d+1|.
39
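For a 2×2 symmetric S the decomposition can be carried out in closed form. A sketch (assuming a hyperbolic-type S whose first eigenvalue is positive and second negative; the matrix is an illustrative example) reconstructing S = O D^{1/2} diag(1, λ) D^{1/2} Oᵀ:

```python
import math

S = [[2.0, 1.0], [1.0, -1.0]]  # symmetric, indefinite (hyperbolic-type)

# Closed-form eigendecomposition of a symmetric 2x2 matrix
tr = S[0][0] + S[1][1]
det = S[0][0] * S[1][1] - S[0][1] ** 2
disc = math.sqrt(tr * tr - 4.0 * det)
l1, l2 = (tr + disc) / 2.0, (tr - disc) / 2.0   # here l1 > 0 > l2

def unit(v):
    n = math.hypot(v[0], v[1])
    return [v[0] / n, v[1] / n]

# Eigenvectors (valid since the off-diagonal entry is nonzero)
v1 = unit([S[0][1], l1 - S[0][0]])
v2 = unit([S[0][1], l2 - S[0][0]])
O = [[v1[0], v2[0]], [v1[1], v2[1]]]            # columns = eigenvectors

# Canonical pieces: D = diag(l1, |l2|), lambda = sign(l2) = -1
D_half = [math.sqrt(l1), math.sqrt(abs(l2))]
lam = -1.0
middle = [D_half[0] ** 2, lam * D_half[1] ** 2]  # equals (l1, l2)

# Reconstruct O D^{1/2} diag(1, lam) D^{1/2} O^T entrywise
R = [[sum(O[i][k] * middle[k] * O[j][k] for k in range(2)) for j in range(2)]
     for i in range(2)]
```

Absorbing the eigenvalue magnitudes into D^{1/2} leaves only the signature factor diag(1, λ) in the middle, which is what makes the signature-preserving parameterization of the hyperbolic LMNN gradient descent possible.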