Seminar 2
Kernels
and
Support Vector Machines
Edgar Marca
Supervisor: DSc. André M.S. Barreto
Petrópolis, Rio de Janeiro - Brazil
September 2nd, 2015
1 / 28
Kernels
Kernels
Why Kernelize?
At first sight, introducing k(x, x′) has not improved our situation. Instead of calculating ⟨Φ(xi), Φ(xj)⟩ for i, j = 1, . . . , n we have to calculate k(xi, xj), which has exactly the same values. However, there are two potential reasons why the kernelized setup can be advantageous:
▶ Speed: We might find an expression for k(xi, xj) that is faster to calculate than forming Φ(xi) and then ⟨Φ(xi), Φ(xj)⟩ (illustrated in the sketch below).
▶ Flexibility: We can construct functions k(x, x′) for which we know that they correspond to inner products after some feature mapping Φ, even though we do not know how to compute Φ explicitly.
3 / 28
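As a concrete illustration of the Speed point, the following minimal sketch (an assumption for illustration, not part of the slides) compares the explicit degree-2 polynomial feature map Φ(x) = (xixj)1≤i,j≤d with the equivalent kernel evaluation k(x, x′) = ⟨x, x′⟩²; the two agree, but the kernel never forms the d²-dimensional feature vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 200
x, xp = rng.standard_normal(d), rng.standard_normal(d)

# Explicit degree-2 feature map: Phi(x) = vector of all products x_i * x_j (d^2 entries).
def phi(v):
    return np.outer(v, v).ravel()

explicit = phi(x) @ phi(xp)          # forms two d^2-dimensional vectors
kernel   = (x @ xp) ** 2             # k(x, x') = <x, x'>^2, only O(d) work

print(np.isclose(explicit, kernel))  # True: same value, far less computation
```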
Kernels
How to use the Kernel Trick
To evaluate a decision function f(x) on an example x, one typically employs the kernel trick as follows, assuming the weight vector admits the expansion w = ∑_{i=1}^{N} αi Φ(xi):

f(x) = ⟨w, Φ(x)⟩ = ⟨ ∑_{i=1}^{N} αi Φ(xi), Φ(x) ⟩ = ∑_{i=1}^{N} αi ⟨Φ(xi), Φ(x)⟩ = ∑_{i=1}^{N} αi k(xi, x)
4 / 28
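A minimal sketch of this evaluation in Python, assuming the coefficients αi, the training points xi, and a kernel function k are already given:

```python
import numpy as np

def decision_function(x, alphas, X_train, kernel):
    """Evaluate f(x) = sum_i alpha_i * k(x_i, x) without ever forming Phi."""
    return sum(a * kernel(xi, x) for a, xi in zip(alphas, X_train))

# Example with a linear kernel k(x, x') = <x, x'>.
linear = lambda u, v: float(np.dot(u, v))
X_train = np.array([[1.0, 0.0], [0.0, 1.0]])
alphas = np.array([0.5, -0.25])
print(decision_function(np.array([2.0, 2.0]), alphas, X_train, linear))
```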
How to prove that a function is a kernel?
Kernels
Some Definitions
Definition 1.1 (Positive Definite Kernel)
Let X be a nonempty set. A function k : X × X → C is called positive definite if and only if

∑_{i=1}^{n} ∑_{j=1}^{n} ci c̄j k(xi, xj) ≥ 0 (1)

for all n ∈ N, {x1, . . . , xn} ⊆ X and {c1, . . . , cn} ⊆ C, where c̄j denotes the complex conjugate of cj.
Unfortunately, there is no uniform use of the preceding definition in the literature. Indeed, some authors call positive definite functions positive semi-definite, and strictly positive definite functions are sometimes called positive definite.
Note:
For fixed x1, x2, . . . , xn ∈ X, the n × n matrix K := [k(xi, xj)]1≤i,j≤n is often called the Gram matrix.
6 / 28
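In practice Definition 1.1 is checked on finite samples through the Gram matrix: for a real-valued kernel, the quadratic form cᵀKc is nonnegative for every c exactly when all eigenvalues of K are nonnegative. A small numerical sketch, using the Gaussian kernel as an example:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma**2))

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 3))                                # 20 sample points in R^3
K = np.array([[gaussian_kernel(a, b) for b in X] for a in X])   # Gram matrix

eigvals = np.linalg.eigvalsh(K)                                 # K is symmetric
print("min eigenvalue:", eigvals.min())                         # >= 0 up to round-off

c = rng.standard_normal(20)
print("quadratic form c^T K c:", c @ K @ c)                     # also nonnegative
```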
Kernels
Mercer Condition
Theorem 1.2
Let X = [a, b] be a compact interval and let k : [a, b] × [a, b] → C be continuous. Then k is positive definite if and only if

∫_a^b ∫_a^b c(x) c̄(y) k(x, y) dx dy ≥ 0 (2)

for each continuous function c : X → C, where c̄(y) denotes the complex conjugate of c(y).
7 / 28
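The integral condition (2) can be sanity-checked numerically: discretizing [a, b] turns the double integral into a Riemann sum, which is again a quadratic form in the values c(xi). A rough sketch for the real-valued Gaussian kernel and an arbitrary continuous test function c (both chosen here only for illustration):

```python
import numpy as np

a, b, n = -1.0, 1.0, 400
xs = np.linspace(a, b, n)
dx = xs[1] - xs[0]

K = np.exp(-(xs[:, None] - xs[None, :]) ** 2 / 2.0)        # k(x, y) evaluated on the grid
c = np.cos(3 * xs) + xs**2                                  # an arbitrary continuous test function

integral = (c[:, None] * c[None, :] * K).sum() * dx * dx    # Riemann sum approximating (2)
print(integral >= 0, integral)
```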
Kernels
Theorem 1.3 (Symmetric, positive definite functions are kernels)
A function k : X × X → R is a kernel if and only if it is symmetric and positive definite.
8 / 28
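One direction of Theorem 1.3 can be made tangible on a finite sample: a symmetric positive semi-definite Gram matrix factors as K = V Vᵀ, so the rows of V serve as explicit finite-dimensional feature vectors whose inner products reproduce k on the sample. A sketch via the eigendecomposition (the kernel below is an assumed example):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((10, 2))
K = (X @ X.T + 1.0) ** 2                     # Gram matrix of the kernel (<x, x'> + 1)^2

w, U = np.linalg.eigh(K)                     # K symmetric PSD: eigenvalues w >= 0
V = U * np.sqrt(np.clip(w, 0, None))         # rows of V play the role of Phi(x_i)

print(np.allclose(V @ V.T, K))               # <Phi(x_i), Phi(x_j)> reproduces k(x_i, x_j)
```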
Kernels
Theorem 1.4
Let k1, k2, . . . be arbitrary positive definite kernels on X × X, where X is a nonempty set.
▶ The set of positive definite kernels is a closed convex cone, that is,
1. If α1, α2 ≥ 0, then α1k1 + α2k2 is positive definite.
2. If k(x, x′) := lim_{n→∞} kn(x, x′) exists for all x, x′, then k is positive definite.
▶ The product k1k2 is a positive definite kernel.
▶ Assume that for i = 1, 2, ki is a positive definite kernel on Xi × Xi, where Xi is a nonempty set. Then the tensor product k1 ⊗ k2 and the direct sum k1 ⊕ k2 are positive definite kernels on (X1 × X2) × (X1 × X2).
▶ Suppose that Y is a nonempty set and let f : Y → X be an arbitrary function; then k(x, y) = k1(f(x), f(y)) is a positive definite kernel on Y × Y (a numerical sanity check of these rules is sketched below).
9 / 28
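A quick numerical sanity check of these closure rules (not a proof): on a random sample, conic combinations, elementwise products, and pre-composition with an arbitrary map f all preserve positive semi-definiteness of the Gram matrix. The kernels and the map f below are assumed examples.

```python
import numpy as np

def is_psd(K, tol=1e-9):
    return np.linalg.eigvalsh((K + K.T) / 2).min() >= -tol

rng = np.random.default_rng(3)
X = rng.standard_normal((15, 2))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances

K1 = np.exp(-sq / 2.0)                                # Gaussian kernel
K2 = (X @ X.T + 1.0) ** 3                             # polynomial kernel
Xf = np.tanh(X)                                       # arbitrary map f : R^2 -> R^2
K3 = np.exp(-((Xf[:, None, :] - Xf[None, :, :]) ** 2).sum(-1) / 2.0)  # k1(f(x), f(y))

print(is_psd(2.0 * K1 + 0.5 * K2),   # conic combination
      is_psd(K1 * K2),               # elementwise (Schur) product
      is_psd(K3))                    # composition with f
```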
Kernel Families
Kernels Kernel Families
Translation Invariant Kernels
Definition 1.5
A translation invariant kernel is given by

K(x, y) = k(x − y) (3)

where k is an even function on Rn, i.e., k(−x) = k(x) for all x in Rn.
11 / 28
Kernels Kernel Families
Translation Invariant Kernels
Definition 1.6
A function f : (0, ∞) → R is completely monotonic if it is C∞ and, for all r > 0 and k ≥ 0,

(−1)^k f^(k)(r) ≥ 0 (4)

Here f^(k) denotes the k-th derivative of f.
Theorem 1.7
Let X ⊂ Rn, f : (0, ∞) → R and K : X × X → R be defined by
K(x, y) = f(∥x − y∥2). If f is completely monotonic then K is positive
definite.
12 / 28
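Theorem 1.7 applies, for instance, to the Gaussian kernel: with f(r) = exp(−r/(2σ²)) we have K(x, y) = f(∥x − y∥²). The sketch below (σ = 1 assumed for the check) differentiates f symbolically and verifies (−1)^k f^(k)(r) ≥ 0 on a few sample points for the first several orders.

```python
import sympy as sp

r = sp.symbols("r", positive=True)
sigma = 1.0                             # bandwidth, an assumed value for the check
f = sp.exp(-r / (2 * sigma**2))         # radial profile of the Gaussian kernel: K(x,y) = f(||x - y||^2)

for k in range(6):
    g = sp.lambdify(r, (-1)**k * sp.diff(f, r, k), "math")
    assert all(g(t) >= 0 for t in [0.1, 0.5, 1.0, 2.0, 10.0]), k

print("(-1)^k f^(k)(r) >= 0 holds on the sample points for k = 0..5")
```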
Kernels Kernel Families
Translation Invariant Kernels
Corollary 1.8
Let c ̸= 0. Then the following kernels, defined on a compact domain X ⊂ Rn, are Mercer kernels.
▶ Gaussian Kernel, also known as Radial Basis Function (RBF) or Squared Exponential (SE) Kernel:

k(x, y) = exp( −∥x − y∥² / (2σ²) ) (5)

▶ Inverse Multiquadric Kernel:

k(x, y) = ( c² + ∥x − y∥² )^(−α), α > 0 (6)
13 / 28
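Vectorized implementations of these two kernels, written as a sketch on top of scipy's pairwise distances (σ, c and α are free parameters), together with an eigenvalue check of the resulting Gram matrices:

```python
import numpy as np
from scipy.spatial.distance import cdist

def gaussian_gram(X, Y, sigma=1.0):
    return np.exp(-cdist(X, Y, "sqeuclidean") / (2 * sigma**2))

def inverse_multiquadric_gram(X, Y, c=1.0, alpha=0.5):
    return (c**2 + cdist(X, Y, "sqeuclidean")) ** (-alpha)

rng = np.random.default_rng(4)
X = rng.standard_normal((30, 4))
for K in (gaussian_gram(X, X), inverse_multiquadric_gram(X, X)):
    print(np.linalg.eigvalsh(K).min() >= -1e-9)       # PSD up to round-off
```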
Kernels Kernel Families
Polynomial Kernels
k(x, x′) = (α⟨x, x′⟩ + c)^d, α > 0, c ≥ 0, d ∈ N (7)
14 / 28
Kernels Kernel Families
Non Mercer Kernels
Example 1.9
Let k : X × X → R be defined as

k(x, x′) = 1 if ∥x − x′∥ ≤ 1, and 0 otherwise. (8)

Suppose that k is a Mercer kernel and set x1 = 1, x2 = 2 and x3 = 3. Then the matrix Kij = k(xi, xj) for 1 ≤ i, j ≤ 3 is

K =
[ 1 1 0 ]
[ 1 1 1 ]
[ 0 1 1 ] (9)

The eigenvalues of K are λ1 = 1 + √2 > 0, λ2 = 1 and λ3 = 1 − √2 < 0. This is a contradiction, because all the eigenvalues of the Gram matrix of a Mercer kernel must be nonnegative, so we can conclude that k is not a Mercer kernel.
15 / 28
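The eigenvalue computation of Example 1.9 can be reproduced directly:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
k = lambda a, b: 1.0 if abs(a - b) <= 1 else 0.0      # the kernel of Example 1.9
K = np.array([[k(a, b) for b in x] for a in x])

print(K)
print(np.linalg.eigvalsh(K))   # approximately [1 - sqrt(2), 1, 1 + sqrt(2)]; one eigenvalue is negative
```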
Kernels Kernel Families
References for Kernels
[3] C. Berg, J. P. R. Christensen, and P. Ressel. Harmonic Analysis on Semigroups: Theory of Positive Definite and Related Functions. Springer, 1984.
[9] Felipe Cucker and Ding Xuan Zhou. Learning Theory: An Approximation Theory Viewpoint. Cambridge University Press, 2007.
[47] Ingo Steinwart and Andreas Christmann. Support Vector Machines. Springer, 2008.
16 / 28
Support Vector Machines
Applications SVM
Support Vector Machines
Figure: Linear Support Vector Machine — the separating hyperplane ⟨w, x⟩ + b = 0 with margin boundaries ⟨w, x⟩ + b = 1 and ⟨w, x⟩ + b = −1.
18 / 28
Applications SVM
Primal Problem
Theorem 3.1
The optimization program for the maximum margin classifier is

min_{w,b} (1/2)∥w∥²
s.t. yi(⟨w, xi⟩ + b) ≥ 1, ∀i, 1 ≤ i ≤ m (10)
19 / 28
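A sketch of the primal program (10) on a toy linearly separable data set, using cvxpy as a generic QP solver (cvxpy and the toy data are assumptions for illustration, not part of the slides):

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(5)
X = np.vstack([rng.standard_normal((20, 2)) + 3, rng.standard_normal((20, 2)) - 3])
y = np.hstack([np.ones(20), -np.ones(20)])            # linearly separable labels

w = cp.Variable(2)
b = cp.Variable()
objective = cp.Minimize(0.5 * cp.sum_squares(w))      # (1/2) ||w||^2
constraints = [cp.multiply(y, X @ w + b) >= 1]        # y_i (<w, x_i> + b) >= 1
cp.Problem(objective, constraints).solve()

print("w* =", w.value, " b* =", b.value)
```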
Applications SVM
Theorem 3.2
Let F be the function defined as:

F : Rm → R+, w ↦ F(w) = (1/2)∥w∥²

Then the following statements hold:
1. F is infinitely differentiable.
2. The gradient of F is ∇F(w) = w.
3. The Hessian of F is ∇²F(w) = Im×m.
4. F is strictly convex, since its Hessian ∇²F(w) is positive definite.
20 / 28
Applications SVM
Theorem 3.3 (The dual problem)
The dual optimization program of (10) is:

max_α ∑_{i=1}^{m} αi − (1/2) ∑_{i=1}^{m} ∑_{j=1}^{m} αi αj yi yj ⟨xi, xj⟩
s.t. αi ≥ 0 ∧ ∑_{i=1}^{m} αi yi = 0, ∀i, 1 ≤ i ≤ m (11)

where α = (α1, α2, . . . , αm) and the solution of this dual problem will be denoted by α∗ = (α∗_1, α∗_2, . . . , α∗_m).
21 / 28
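The dual (11) can be solved the same way on the toy data; writing the quadratic term as (1/2)∥∑_i αi yi xi∥² keeps the problem in a form cvxpy accepts directly. Again a sketch under the same assumptions as the primal snippet:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(5)
X = np.vstack([rng.standard_normal((20, 2)) + 3, rng.standard_normal((20, 2)) - 3])
y = np.hstack([np.ones(20), -np.ones(20)])

m = X.shape[0]
alpha = cp.Variable(m)

# (1/2) sum_ij alpha_i alpha_j y_i y_j <x_i, x_j>  ==  (1/2) || sum_i alpha_i y_i x_i ||^2
quad = 0.5 * cp.sum_squares(X.T @ cp.multiply(alpha, y))
objective = cp.Maximize(cp.sum(alpha) - quad)
constraints = [alpha >= 0, cp.sum(cp.multiply(alpha, y)) == 0]
cp.Problem(objective, constraints).solve()

alpha_star = alpha.value
print("number of nonzero alpha_i (support vectors):", int((alpha_star > 1e-6).sum()))
```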
Applications SVM
Proof.
The Lagrangian of the problem is

L(w, b, α) = (1/2)∥w∥² − ∑_{i=1}^{m} αi [yi(⟨w, xi⟩ + b) − 1] (12)

Because the KKT conditions hold (F is continuous and differentiable and the constraints are also continuous and differentiable), we can write the following conditions.
Stationarity:

∇w L = w − ∑_{i=1}^{m} αi yi xi = 0 ⇒ w = ∑_{i=1}^{m} αi yi xi (13)

∇b L = − ∑_{i=1}^{m} αi yi = 0 ⇒ ∑_{i=1}^{m} αi yi = 0 (14)
22 / 28
Applications SVM
Primal feasibility:
yi(⟨w, xi⟩ + b) ≥ 1, ∀i ∈ [1, m] (15)
Dual feasibility:
αi ≥ 0, ∀i ∈ [1, m] (16)
Complementary slackness:
αi [yi(⟨w, xi⟩ + b) − 1] = 0 ⇒ αi = 0 ∨ yi(⟨w, xi⟩ + b) = 1, ∀i ∈ [1, m] (17)
Substituting w = ∑_{i=1}^{m} αi yi xi into the Lagrangian gives

L(w, b, α) = (1/2)∥ ∑_{i=1}^{m} αi yi xi ∥² − ∑_{i=1}^{m} ∑_{j=1}^{m} αi αj yi yj ⟨xi, xj⟩ − b ∑_{i=1}^{m} αi yi + ∑_{i=1}^{m} αi (18)

The first two terms combine to −(1/2) ∑_{i=1}^{m} ∑_{j=1}^{m} αi αj yi yj ⟨xi, xj⟩ and the third term vanishes because ∑_{i=1}^{m} αi yi = 0, so

L(w, b, α) = ∑_{i=1}^{m} αi − (1/2) ∑_{i=1}^{m} ∑_{j=1}^{m} αi αj yi yj ⟨xi, xj⟩ (19)
23 / 28
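The stationarity condition (13) and complementary slackness (17) translate directly into code: w is recovered as ∑_i αi* yi xi, and b from any support vector (αi* > 0), for which yi(⟨w, xi⟩ + b) = 1. A sketch continuing from the dual snippet above (reusing X, y and alpha_star):

```python
import numpy as np

w_star = (alpha_star * y) @ X                 # w = sum_i alpha_i* y_i x_i   (eq. 13)

sv = alpha_star > 1e-6                        # support vectors: alpha_i* > 0
b_star = np.mean(y[sv] - X[sv] @ w_star)      # from y_i(<w, x_i> + b) = 1 and y_i^2 = 1

print("w =", w_star, " b =", b_star)
```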
Applications SVM
Theorem 3.4
Let G be the function defined as:

G : Rm → R, α ↦ G(α) = 1ᵀα − (1/2) αᵀAα

where α = (α1, α2, . . . , αm), 1 = (1, . . . , 1)ᵀ ∈ Rm and A = [yi yj ⟨xi, xj⟩]1≤i,j≤m ∈ Rm×m. Then the following statements hold:
1. The matrix A is symmetric.
2. The function G is differentiable and ∂G(α)/∂α = 1 − Aα.
3. The function G is twice differentiable and ∂²G(α)/∂α² = −A.
4. The function G is a concave function.
24 / 28
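These statements can be checked numerically: A is a label-scaled Gram matrix, hence symmetric and positive semi-definite, so the Hessian −A has no positive eigenvalues and G is concave. A small sketch on the same kind of toy data as before:

```python
import numpy as np

rng = np.random.default_rng(5)
X = np.vstack([rng.standard_normal((20, 2)) + 3, rng.standard_normal((20, 2)) - 3])
y = np.hstack([np.ones(20), -np.ones(20)])

A = (y[:, None] * y[None, :]) * (X @ X.T)     # A_ij = y_i y_j <x_i, x_j>

print(np.allclose(A, A.T))                    # 1. A is symmetric
print(np.linalg.eigvalsh(-A).max() <= 1e-9)   # Hessian -A has no positive eigenvalues, so G is concave
```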
Applications SVM
Linear Support Vector Machines
We will call the decision function defined by

f(x) = sign(⟨w, x⟩ + b) = sign( ∑_{i=1}^{m} α∗_i yi ⟨xi, x⟩ + b ) (20)

a (linear) Support Vector Machine, where
▶ m is the number of training points.
▶ α∗_i are the Lagrange multipliers of the dual problem (11).
25 / 28
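With (α∗, b) from the dual, the decision function (20) is a one-liner; on the toy training set it should classify every point correctly. A sketch reusing X, y, alpha_star and b_star from the previous snippets:

```python
import numpy as np

def predict_linear(Xnew):
    scores = (Xnew @ X.T) @ (alpha_star * y) + b_star   # sum_i alpha_i* y_i <x_i, x> + b
    return np.sign(scores)

print("training accuracy:", (predict_linear(X) == y).mean())
```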
Applications Non Linear SVM
Non Linear Support Vector Machines
We will call the decision function defined by

f(x) = sign(⟨w, Φ(x)⟩ + b) = sign( ∑_{i=1}^{m} α∗_i yi ⟨Φ(xi), Φ(x)⟩ + b ) (21)

a Non Linear Support Vector Machine, where
▶ m is the number of training points.
▶ α∗_i are the Lagrange multipliers of the dual problem (11).
26 / 28
Applications Non Linear SVM
Applying the Kernel Trick
Using the kernel trick we can replace ⟨Φ(xi), Φ(x)⟩ by a kernel k(xi, x):

f(x) = sign( ∑_{i=1}^{m} α∗_i yi k(xi, x) + b ) (22)

where
▶ m is the number of training points.
▶ α∗_i are the Lagrange multipliers of the dual problem (11).
27 / 28
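In practice one rarely solves (11) by hand; libraries such as scikit-learn expose exactly the expansion (22). The sketch below assumes scikit-learn is available and uses a large C so that the soft-margin SVC approximates the hard-margin problem of these slides; it verifies that SVC's decision function matches ∑_i α∗_i yi k(xi, x) + b over the support vectors.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.svm import SVC

rng = np.random.default_rng(6)
X = rng.standard_normal((60, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)          # a nonlinearly separable labelling

gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=1e6).fit(X, y)   # large C ~ hard margin

Xnew = rng.standard_normal((5, 2))
K = np.exp(-gamma * cdist(Xnew, clf.support_vectors_, "sqeuclidean"))
manual = K @ clf.dual_coef_.ravel() + clf.intercept_    # sum_i (alpha_i* y_i) k(x_i, x) + b

print(np.allclose(manual, clf.decision_function(Xnew)))
```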
Applications Non Linear SVM
References for Support Vector Machines
[31] Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar.
Foundations of Machine Learning. The MIT Press, 2012.
28 / 28