Lecture10 outilier l0_svdd

•

0 j'aime•766 vues

This lecture discusses robust outlier detection using L0-SVDD (support vector data description). It presents the SVDD method for outlier detection and its limitations. It then introduces the L0 norm as an approximation of the L0 norm to make SVDD robust to outliers. The algorithm uses DC (difference of convex functions) programming to iteratively solve an adaptive version of SVDD, building up a sequence of solutions. At each iteration, it solves the dual quadratic program to obtain the center c and radius R, and updates weights for the next iteration. This provides a robust outlier detection method using L0-SVDD.

Lecture 10: Robust outlier detection with L0-SVDD
Stéphane Canu
stephane.canu@litislab.eu
Sao Paulo 2014
February 28, 2014

Roadmap
1 Robust outlier detection with L0-SVDD
L0 SVDD
4 iterations of Adaptive L0 SVDD

Recall SVDD



min
R,c,ξ
R + C
n
i=1
ξi
with xi − c 2 ≤ R + ξi , i = 1, . . . , n
and ξi ≥ 0, i = 1, . . . , n
(1)
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 3 / 11

SVDD + outlier
C =1/16 C =1/8 C =1/4 C = 1/2 ( )
Figure: Example of SVDD solutions with diﬀerent C values, m = 0 (red) and
m = 5 (magenta). The circled data points represent support vectors for both m.
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 4 / 11

The L0 norm
ξ 0 ≤ t



min
c∈IRp
,R∈IR,ξ∈IRn
R + C ξ 0
with xi − c 2 ≤ R+ξi
ξi ≥ 0 i = 1, n
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 5 / 11

L0 relaxations
p norm
exponenetial
piecewise linear
log



min
c∈IRp
,R∈IR,ξ∈IRn
R + C
n
i=1
log(γ + ξi )
with xi − c 2 ≤ R+ξi
ξi ≥ 0 i = 1, n .
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 6 / 11

DC programing
log(γ + t) = f (t) − g(t) with f (t) = t and g(t) = t − log(γ + t),
both functions f and g being convex. The DC framework consists in
minimizing iteratively (R plus a sum of) the following convex term:
f (ξ) − g (ξ)ξ = ξ − 1 −
1
γ + ξold
ξ =
ξ
γ + ξold
,
where ξold
i denotes the solution at the previous iteration.
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 7 / 11

The DC idea applied to our L0 SVDD approximation consists in building a
sequence of solutions of the following adaptive SVDD:



min
c∈IRp
,R∈IR,ξ∈IRn
R + C
n
i=1
wi ξi
with xi − c 2 ≤ R+ξi
ξi ≥ 0 i = 1, n
with wi =
1
γ + ξold
i
.
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 8 / 11

Stationary conditions of the KKT give: c = n
i=1 αi xi and n
i=1 αi = 1
where the αi are the Lagrange multipliers associated with the inequality
constraints xi − c 2 ≤ R+ξi . The dual of this problem is
min
α∈IRn
α XX α − α diag(XX )
with n
i=1 αi = 1 0 ≤ αi ≤ Cwi i = 1, n
(2)
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 9 / 11

Algorithm 1 L0 SVDD for the linear kernel
Data: X, y, C , γ
Result: R , c, ξ , α
wi = 1; i = 1, n
while not converged do
(α, λ) ← solve_QP(X, C, w) % solve problem (2)
c ← X α
R ← λ + c c
ξi ← max(0, xi − c 2 − R) i = 1, n
wi ← 1/(γ + ξi ) i = 1, n
end
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 10 / 11

Bibliography
Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 11 / 11

Contenu connexe

Tendances

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...The Statistical and Applied Mathematical Sciences Institute

Lecture 2: linear SVM in the dualStéphane Canu

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...The Statistical and Applied Mathematical Sciences Institute

Hierarchical matrices for approximating large covariance matries and computin...Alexander Litvinenko

Introduction to inverse problemsDelta Pi Systems

Lecture8 multi class_svmStéphane Canu

Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...Beniamino Murgante

Mesh Processing Course : Active ContoursGabriel Peyré

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...The Statistical and Applied Mathematical Sciences Institute

Deflection 2anashalim

Andreas EberleBigMC

Patch Matching with Polynomial Exponential Families and Projective DivergencesFrank Nielsen

Low Complexity Regularization of Inverse ProblemsGabriel Peyré

Learning Sparse RepresentationGabriel Peyré

Classification with mixtures of curved Mahalanobis metricsFrank Nielsen

Proximal Splitting and Optimal TransportGabriel Peyré

From planar maps to spatial topology change in 2d gravityTimothy Budd

QMC: Operator Splitting Workshop, A New (More Intuitive?) Interpretation of I...The Statistical and Applied Mathematical Sciences Institute

Treewidth and ApplicationsASPAK2014

The dual geometry of Shannon informationFrank Nielsen

Tendances (20)

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...

Lecture 2: linear SVM in the dual

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...

Hierarchical matrices for approximating large covariance matries and computin...

Introduction to inverse problems

Lecture8 multi class_svm

Kernel based models for geo- and environmental sciences- Alexei Pozdnoukhov –...

Mesh Processing Course : Active Contours

Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...

Deflection 2

Andreas Eberle

Patch Matching with Polynomial Exponential Families and Projective Divergences

Low Complexity Regularization of Inverse Problems

Learning Sparse Representation

Classification with mixtures of curved Mahalanobis metrics

Proximal Splitting and Optimal Transport

From planar maps to spatial topology change in 2d gravity

QMC: Operator Splitting Workshop, A New (More Intuitive?) Interpretation of I...

Treewidth and Applications

The dual geometry of Shannon information

Similaire à Lecture10 outilier l0_svdd

Micro to macro passage in traffic models including multi-anticipation effectGuillaume Costeseque

WISS 2015 - Machine Learning lecture by Ludovic Samper Antidot

A multi-objective optimization framework for a second order traffic flow mode...Guillaume Costeseque

MVPA with SpaceNet: sparse structured priorsElvis DOHMATOB

Road junction modeling using a scheme based on Hamilton-Jacobi equationsGuillaume Costeseque

StatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdfAnna Carbone

Triggering patterns of topology changes in dynamic attributed graphsINSA Lyon - L'Institut National des Sciences Appliquées de Lyon

NIPS paper review 2014: A Differential Equation for Modeling Nesterov’s Accel...Kai-Wen Zhao

Linear Discriminant Analysis (LDA) Under f-Divergence MeasuresAnmol Dwivedi

Slides ensae-2016-9Arthur Charpentier

SVM for Regressiontrieuminhtien

QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...The Statistical and Applied Mathematical Sciences Institute

NTU_papercoty huang

litvinenko_Intrusion_Bari_2023.pdfAlexander Litvinenko

Digital Signal Processing[ECEG-3171]-Ch1_L02Rediet Moges

Talk iccf 19_ben_hammoudaChiheb Ben Hammouda

Jan Picek, Martin Schindler, Jan Kyselý, Romana Beranová: Statistical aspects...Jiří Šmída

1EasyStudy3

Low-rank tensor methods for stochastic forward and inverse problemsAlexander Litvinenko

reportSirui Zhang

Similaire à Lecture10 outilier l0_svdd (20)

Micro to macro passage in traffic models including multi-anticipation effect

WISS 2015 - Machine Learning lecture by Ludovic Samper

A multi-objective optimization framework for a second order traffic flow mode...

MVPA with SpaceNet: sparse structured priors

Road junction modeling using a scheme based on Hamilton-Jacobi equations

StatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdf

Triggering patterns of topology changes in dynamic attributed graphs

NIPS paper review 2014: A Differential Equation for Modeling Nesterov’s Accel...

Linear Discriminant Analysis (LDA) Under f-Divergence Measures

Slides ensae-2016-9

SVM for Regression

QMC: Transition Workshop - Density Estimation by Randomized Quasi-Monte Carlo...

NTU_paper

litvinenko_Intrusion_Bari_2023.pdf

Digital Signal Processing[ECEG-3171]-Ch1_L02

Talk iccf 19_ben_hammouda

Jan Picek, Martin Schindler, Jan Kyselý, Romana Beranová: Statistical aspects...

Low-rank tensor methods for stochastic forward and inverse problems

report

Lecture10 outilier l0_svdd

1. Lecture 10: Robust outlier detection with L0-SVDD Stéphane Canu stephane.canu@litislab.eu Sao Paulo 2014 February 28, 2014

2. Roadmap 1 Robust outlier detection with L0-SVDD L0 SVDD 4 iterations of Adaptive L0 SVDD

3. Recall SVDD    min R,c,ξ R + C n i=1 ξi with xi − c 2 ≤ R + ξi , i = 1, . . . , n and ξi ≥ 0, i = 1, . . . , n (1) Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 3 / 11

4. SVDD + outlier C =1/16 C =1/8 C =1/4 C = 1/2 ( ) Figure: Example of SVDD solutions with diﬀerent C values, m = 0 (red) and m = 5 (magenta). The circled data points represent support vectors for both m. Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 4 / 11

5. The L0 norm ξ 0 ≤ t    min c∈IRp ,R∈IR,ξ∈IRn R + C ξ 0 with xi − c 2 ≤ R+ξi ξi ≥ 0 i = 1, n Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 5 / 11

6. L0 relaxations p norm exponenetial piecewise linear log    min c∈IRp ,R∈IR,ξ∈IRn R + C n i=1 log(γ + ξi ) with xi − c 2 ≤ R+ξi ξi ≥ 0 i = 1, n . Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 6 / 11

7. DC programing log(γ + t) = f (t) − g(t) with f (t) = t and g(t) = t − log(γ + t), both functions f and g being convex. The DC framework consists in minimizing iteratively (R plus a sum of) the following convex term: f (ξ) − g (ξ)ξ = ξ − 1 − 1 γ + ξold ξ = ξ γ + ξold , where ξold i denotes the solution at the previous iteration. Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 7 / 11

8. The DC idea applied to our L0 SVDD approximation consists in building a sequence of solutions of the following adaptive SVDD:    min c∈IRp ,R∈IR,ξ∈IRn R + C n i=1 wi ξi with xi − c 2 ≤ R+ξi ξi ≥ 0 i = 1, n with wi = 1 γ + ξold i . Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 8 / 11

9. Stationary conditions of the KKT give: c = n i=1 αi xi and n i=1 αi = 1 where the αi are the Lagrange multipliers associated with the inequality constraints xi − c 2 ≤ R+ξi . The dual of this problem is min α∈IRn α XX α − α diag(XX ) with n i=1 αi = 1 0 ≤ αi ≤ Cwi i = 1, n (2) Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 9 / 11

10. Algorithm 1 L0 SVDD for the linear kernel Data: X, y, C , γ Result: R , c, ξ , α wi = 1; i = 1, n while not converged do (α, λ) ← solve_QP(X, C, w) % solve problem (2) c ← X α R ← λ + c c ξi ← max(0, xi − c 2 − R) i = 1, n wi ← 1/(γ + ξi ) i = 1, n end Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 10 / 11

11. Bibliography Stéphane Canu (INSA Rouen - LITIS) February 28, 2014 11 / 11

Lecture10 outilier l0_svdd

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Lecture10 outilier l0_svdd

Similaire à Lecture10 outilier l0_svdd (20)

Lecture10 outilier l0_svdd