SlideShare a Scribd company logo
1 of 35
Download to read offline
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Statistical Pattern Recognition
Lecture4
Bayesian Learning
Dr Zohreh Azimifar
School of Electrical and Computer Engineering
Shiraz University
Fall2014
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 1 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Table of contents
1 Introduction
2 Generative Learning vs Discriminative Learning
Generative Learning and Discriminative Learning
3 Linear Discriminant Analysis
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
4 Quadratic Discriminant Analysis
Quadratic Discriminant Analysis
Analysis of QDA
5 GLAD and QDA, Another point of view
GLAD and QDA, Another point of view
6 Naive Bayes
Naive Bayes
Naive Bayes: An Example
7 Lecture Summary
Summary
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 2 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Introduction
Classification based on the theory Bayesian Learning
P(y = 0|X) ≷y=0
y=1 P(y = 1|X)
Classification involves determining P(y|X), from different
perspectives.
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 3 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Generative Learning and Discriminative Learning
Generative Learning and Discriminative Learning
1 Discriminative Learning:
Direct learning of P(y|X).
Modelling of decision boundary, to which side a new sample is
assigned.
Logistic and softmax regression are called discriminative learners.
2 Generative Learning
Explicit modelling of each class separately.
Compare new sample with each class probability, based on Bayesian
rule:
P(y|X) =
Likelihood
z }| {
P(X|y)
Prior
z }| {
P(y)
P(X)
| {z }
normalizing factor
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 4 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Model P(y) and P(X|y) for each class y:
P(y) = φ
1{y=1}
1 φ
1{y=2}
2 · · · φ1{y=c}
c
P(X|y = i) =
1
(
√
2π)n|Σ|
1
2
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))
Parameter set: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 5 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
l(θ) = log
m
Y
j=1
P(X(j)
, y(j)
) = log
m
Y
j=1
P(X(j)
|P(y(j)
))P(y(j)
)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
l(θ) = log
m
Y
j=1
P(X(j)
, y(j)
) = log
m
Y
j=1
P(X(j)
|P(y(j)
))P(y(j)
)
Take partial derivative in terms of each individual parameter:
φMLE
i =
Pm
j=1 1{y(j)
= i}
m
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
l(θ) = log
m
Y
j=1
P(X(j)
, y(j)
) = log
m
Y
j=1
P(X(j)
|P(y(j)
))P(y(j)
)
Take partial derivative in terms of each individual parameter:
φMLE
i =
Pm
j=1 1{y(j)
= i}
m
µMLE
i =
Pm
j=1 1{y(j)
= i}X(j)
Pm
j=1 1{y(j) = i}
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ}
l(θ) = log
m
Y
j=1
P(X(j)
, y(j)
) = log
m
Y
j=1
P(X(j)
|P(y(j)
))P(y(j)
)
Take partial derivative in terms of each individual parameter:
φMLE
i =
Pm
j=1 1{y(j)
= i}
m
µMLE
i =
Pm
j=1 1{y(j)
= i}X(j)
Pm
j=1 1{y(j) = i}
ΣMLE
=
1
m
m
X
j=1
(X(j)
− µy(j) )(X(j)
− µy(j) )T
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Gaussian Linear Discriminant Analysis
Determine class label of a new sample Xnew
:
ynew
= argmaxy P(y|X)
= argmaxy
P(X|y)P(y)
P(X)
= argmaxy P(X|y)P(y)
Note that P(X|y) is a class dependent density.
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 7 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary is a line, a plane, or a hyper-plane. Why?
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary is a line, a plane, or a hyper-plane. Why?
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
P(X|y = i)P(y = i) = P(X|y = j)P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary is a line, a plane, or a hyper-plane. Why?
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
P(X|y = i)P(y = i) = P(X|y = j)P(y = j)
1
(2π)
n
2 |Σ|
1
2
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i)
=
1
(2π)
n
2 |Σ|
1
2
exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary cont’ed:
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i) = exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary cont’ed:
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i) = exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
−1
2
(X − µi )T
Σ−1
(X − µi ) + log P(y = i) =
−1
2
(X − µj )T
Σ−1
(X − µj ) + log P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary cont’ed:
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i) = exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
−1
2
(X − µi )T
Σ−1
(X − µi ) + log P(y = i) =
−1
2
(X − µj )T
Σ−1
(X − µj ) + log P(y = j)
⇒ log
P(y = i)
P(y = j)
−
1
2
[XT
Σ−1
X − 2XT
Σ−1
µi + µT
i Σ−1
µi ]
+
1
2
[XT
Σ−1
X − 2XT
Σ−1
µj + µT
j Σ−1
µj ] = 0
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Boundary Decision for GLDA
Decision boundary cont’ed:
exp(
−1
2
(X − µi )T
Σ−1
(X − µi ))P(y = i) = exp(
−1
2
(X − µj )T
Σ−1
(X − µj ))P(y = j)
−1
2
(X − µi )T
Σ−1
(X − µi ) + log P(y = i) =
−1
2
(X − µj )T
Σ−1
(X − µj ) + log P(y = j)
⇒ log
P(y = i)
P(y = j)
−
1
2
[XT
Σ−1
X − 2XT
Σ−1
µi + µT
i Σ−1
µi ]
+
1
2
[XT
Σ−1
X − 2XT
Σ−1
µj + µT
j Σ−1
µj ] = 0
⇒ XT
Σ−1
(µi − µj )
| {z }
aX
+
1
2
µT
i Σ−1
µi −
1
2
µT
j Σ−1
µj + log
P(y = i)
P(y = j)
| {z }
b
= 0
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Analysis of GLDA when Σ = σ2
I
Classes are of identical distribution, but different means.
Cross-section of classes distribution is spherical.
Decision boundary is linear.
Called classifier with nearest Euclidean distance to the class mean,
when?
Σ =

1 0
0 1

Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 10 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Analysis of GLDA when Σ is not identity
Classes are of identical distribution, but different means.
Cross-section of classes distribution is ellipsoidal.
Decision boundary is linear.
Σ =

0.5 0
0 1.5

Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 11 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Analysis of GLDA with arbitrary Σ
Classes are of identical distribution, but different means.
Cross-section of classes distribution is ellipsoidal. Linear decision
boundary.
Classes are aligned with direction of covariance eigenvectors.
Σ =

0.5 0.2
0.2 1.5

Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 12 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Gaussian Linear Discriminant Analysis
Boundary Decision for GLDA
Analysis of GLDA
Analysis of GLDA with arbitrary Σ
Classes are of identical distribution, but different means.
Cross-section of classes distribution is ellipsoidal. Linear decision
boundary.
Classes are aligned with direction of covariance eigenvectors.
Σ =

0.5 0.2
0.2 1.5

Called classifier with nearest Mahalanobis distance to the class
mean, when? Dist(X, µi) = (X − µi)T
Σ−1
(X − µi)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 12 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Another generative learning model; a Bayesian classifier
Here, classes are of multinomial distribution, and likelihoods are
multivariate Gaussian with separate covariance Σi .
Decision boundary becomes non-linear, Why?
P(X|y = i) =
1
(
√
2π)n|Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 13 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary is parabolic:
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
P(X|y = i)P(y = i) = P(X|y = j)P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 14 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary is parabolic:
P(y = i|X) = P(y = j|X)
P(X|y = i)P(y = i)
P(X)
=
P(X|y = j)P(y = j)
P(X)
P(X|y = i)P(y = i) = P(X|y = j)P(y = j)
1
(2π)
n
2 |Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))P(y = i)
=
1
(2π)
n
2 |Σj |
1
2
exp(
−1
2
(X − µj )T
Σ−1
j (X − µj ))P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 14 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary cont’ed:
1
|Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))P(y = i)
=
1
|Σj |
1
2
exp(
−1
2
(X − µj )T
Σ−1
j (X − µj ))P(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary cont’ed:
1
|Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))P(y = i)
=
1
|Σj |
1
2
exp(
−1
2
(X − µj )T
Σ−1
j (X − µj ))P(y = j)
−1
2
log|Σi | −
1
2
(X − µi )T
Σ−1
i (X − µi ) + logP(y = i)
=
−1
2
log|Σj | −
1
2
(X − µj )T
Σ−1
j (X − µj ) + logP(y = j)
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary cont’ed:
1
|Σi |
1
2
exp(
−1
2
(X − µi )T
Σ−1
i (X − µi ))P(y = i)
=
1
|Σj |
1
2
exp(
−1
2
(X − µj )T
Σ−1
j (X − µj ))P(y = j)
−1
2
log|Σi | −
1
2
(X − µi )T
Σ−1
i (X − µi ) + logP(y = i)
=
−1
2
log|Σj | −
1
2
(X − µj )T
Σ−1
j (X − µj ) + logP(y = j)
⇒ log
P(y = i)
P(y = j)
−
1
2
log
|Σi |
|Σj |
−
1
2
[XT
Σ−1
i X + µT
i Σ−1
i µi −
2XT
Σ−1
i µi − XT
Σ−1
j X − µT
j Σ−1
j µj + 2XT
Σ−1
j µj ] = 0
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLAD and QDA, Another point of view
Naive Bayes
Lecture Summary
Quadratic Discriminant Analysis
Analysis of QDA
Quadratic Discriminant Analysis
Decision boundary cont’ed:
log
P(y = i)
P(y = j)
−
1
2
log
Σi

More Related Content

What's hot

RGA of Cayley graphs
RGA of Cayley graphsRGA of Cayley graphs
RGA of Cayley graphs
John Owen
 
Introduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun NarayananIntroduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun Narayanan
Arjun Narayanan
 
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence MeasuresLinear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
Anmol Dwivedi
 

What's hot (10)

A STUDY ON L-FUZZY NORMAL SUBl -GROUP
A STUDY ON L-FUZZY NORMAL SUBl -GROUPA STUDY ON L-FUZZY NORMAL SUBl -GROUP
A STUDY ON L-FUZZY NORMAL SUBl -GROUP
 
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet ProcessesBayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
 
RGA of Cayley graphs
RGA of Cayley graphsRGA of Cayley graphs
RGA of Cayley graphs
 
Introduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun NarayananIntroduction to second gradient theory of elasticity - Arjun Narayanan
Introduction to second gradient theory of elasticity - Arjun Narayanan
 
On the proof theory for Description Logics
On the proof theory for Description LogicsOn the proof theory for Description Logics
On the proof theory for Description Logics
 
A new generalized lindley distribution
A new generalized lindley distributionA new generalized lindley distribution
A new generalized lindley distribution
 
A Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian NonparametricsA Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian Nonparametrics
 
Lec 17
Lec  17Lec  17
Lec 17
 
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence MeasuresLinear Discriminant Analysis (LDA) Under f-Divergence Measures
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
 
A Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian NonparametricsA Gentle Introduction to Bayesian Nonparametrics
A Gentle Introduction to Bayesian Nonparametrics
 

Similar to Dr azimifar pattern recognition lect4

Model complexity
Model complexityModel complexity
Model complexity
Deb Roy
 
Self-organizing Network for Variable Clustering and Predictive Modeling
Self-organizing Network for Variable Clustering and Predictive ModelingSelf-organizing Network for Variable Clustering and Predictive Modeling
Self-organizing Network for Variable Clustering and Predictive Modeling
Hui Yang
 

Similar to Dr azimifar pattern recognition lect4 (19)

Model complexity
Model complexityModel complexity
Model complexity
 
Reading "Bayesian measures of model complexity and fit"
Reading "Bayesian measures of model complexity and fit"Reading "Bayesian measures of model complexity and fit"
Reading "Bayesian measures of model complexity and fit"
 
Nec 602 unit ii Random Variables and Random process
Nec 602 unit ii Random Variables and Random processNec 602 unit ii Random Variables and Random process
Nec 602 unit ii Random Variables and Random process
 
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
 
DIC
DICDIC
DIC
 
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
 
Type-based Dependency Analysis
Type-based Dependency AnalysisType-based Dependency Analysis
Type-based Dependency Analysis
 
Hypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rulesHypothesis testings on individualized treatment rules
Hypothesis testings on individualized treatment rules
 
Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
asymptotics of ABC
asymptotics of ABCasymptotics of ABC
asymptotics of ABC
 
PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...
PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...
PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...
 
Laplace's Demon: seminar #1
Laplace's Demon: seminar #1Laplace's Demon: seminar #1
Laplace's Demon: seminar #1
 
graphicascaaaaaaaaaa dsaaaaal models.pdf
graphicascaaaaaaaaaa dsaaaaal models.pdfgraphicascaaaaaaaaaa dsaaaaal models.pdf
graphicascaaaaaaaaaa dsaaaaal models.pdf
 
PAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ WarwickPAWL - GPU meeting @ Warwick
PAWL - GPU meeting @ Warwick
 
Self-organizing Network for Variable Clustering and Predictive Modeling
Self-organizing Network for Variable Clustering and Predictive ModelingSelf-organizing Network for Variable Clustering and Predictive Modeling
Self-organizing Network for Variable Clustering and Predictive Modeling
 
Gradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation GraphsGradient Estimation Using Stochastic Computation Graphs
Gradient Estimation Using Stochastic Computation Graphs
 
prior selection for mixture estimation
prior selection for mixture estimationprior selection for mixture estimation
prior selection for mixture estimation
 
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
 
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian ProcessesAccelerating Pseudo-Marginal MCMC using Gaussian Processes
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
 

More from Zahra Amini

More from Zahra Amini (12)

Dip azimifar enhancement_l05_2020
Dip azimifar enhancement_l05_2020Dip azimifar enhancement_l05_2020
Dip azimifar enhancement_l05_2020
 
Dip azimifar enhancement_l04_2020
Dip azimifar enhancement_l04_2020Dip azimifar enhancement_l04_2020
Dip azimifar enhancement_l04_2020
 
Dip azimifar enhancement_l03_2020
Dip azimifar enhancement_l03_2020Dip azimifar enhancement_l03_2020
Dip azimifar enhancement_l03_2020
 
Dip azimifar enhancement_l02_2020
Dip azimifar enhancement_l02_2020Dip azimifar enhancement_l02_2020
Dip azimifar enhancement_l02_2020
 
Dip azimifar enhancement_l01_2020
Dip azimifar enhancement_l01_2020Dip azimifar enhancement_l01_2020
Dip azimifar enhancement_l01_2020
 
Ch 1-3 nn learning 1-7
Ch 1-3 nn learning 1-7Ch 1-3 nn learning 1-7
Ch 1-3 nn learning 1-7
 
Ch 1-2 NN classifier
Ch 1-2 NN classifierCh 1-2 NN classifier
Ch 1-2 NN classifier
 
Ch 1-1 introduction
Ch 1-1 introductionCh 1-1 introduction
Ch 1-1 introduction
 
Lecture 8
Lecture 8Lecture 8
Lecture 8
 
Kernel estimation(ref)
Kernel estimation(ref)Kernel estimation(ref)
Kernel estimation(ref)
 
Dr azimifar pattern recognition lect2
Dr azimifar pattern recognition lect2Dr azimifar pattern recognition lect2
Dr azimifar pattern recognition lect2
 
Dr azimifar pattern recognition lect1
Dr azimifar pattern recognition lect1Dr azimifar pattern recognition lect1
Dr azimifar pattern recognition lect1
 

Recently uploaded

Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
Neometrix_Engineering_Pvt_Ltd
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ssuser89054b
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
mphochane1998
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
Epec Engineered Technologies
 
Verification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptxVerification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptx
chumtiyababu
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Kandungan 087776558899
 

Recently uploaded (20)

Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
Verification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptxVerification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptx
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxOrlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 

Dr azimifar pattern recognition lect4

  • 1. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Statistical Pattern Recognition Lecture4 Bayesian Learning Dr Zohreh Azimifar School of Electrical and Computer Engineering Shiraz University Fall2014 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 1 / 22
  • 2. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Table of contents 1 Introduction 2 Generative Learning vs Discriminative Learning Generative Learning and Discriminative Learning 3 Linear Discriminant Analysis Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA 4 Quadratic Discriminant Analysis Quadratic Discriminant Analysis Analysis of QDA 5 GLAD and QDA, Another point of view GLAD and QDA, Another point of view 6 Naive Bayes Naive Bayes Naive Bayes: An Example 7 Lecture Summary Summary Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 2 / 22
  • 3. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Introduction Classification based on the theory Bayesian Learning P(y = 0|X) ≷y=0 y=1 P(y = 1|X) Classification involves determining P(y|X), from different perspectives. Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 3 / 22
  • 4. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Generative Learning and Discriminative Learning Generative Learning and Discriminative Learning 1 Discriminative Learning: Direct learning of P(y|X). Modelling of decision boundary, to which side a new sample is assigned. Logistic and softmax regression are called discriminative learners. 2 Generative Learning Explicit modelling of each class separately. Compare new sample with each class probability, based on Bayesian rule: P(y|X) = Likelihood z }| { P(X|y) Prior z }| { P(y) P(X) | {z } normalizing factor Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 4 / 22
  • 5. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Model P(y) and P(X|y) for each class y: P(y) = φ 1{y=1} 1 φ 1{y=2} 2 · · · φ1{y=c} c P(X|y = i) = 1 ( √ 2π)n|Σ| 1 2 exp( −1 2 (X − µi )T Σ−1 (X − µi )) Parameter set: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 5 / 22
  • 6. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 7. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Take partial derivative in terms of each individual parameter: φMLE i = Pm j=1 1{y(j) = i} m Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 8. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Take partial derivative in terms of each individual parameter: φMLE i = Pm j=1 1{y(j) = i} m µMLE i = Pm j=1 1{y(j) = i}X(j) Pm j=1 1{y(j) = i} Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 9. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Take partial derivative in terms of each individual parameter: φMLE i = Pm j=1 1{y(j) = i} m µMLE i = Pm j=1 1{y(j) = i}X(j) Pm j=1 1{y(j) = i} ΣMLE = 1 m m X j=1 (X(j) − µy(j) )(X(j) − µy(j) )T Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 10. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Determine class label of a new sample Xnew : ynew = argmaxy P(y|X) = argmaxy P(X|y)P(y) P(X) = argmaxy P(X|y)P(y) Note that P(X|y) is a class dependent density. Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 7 / 22
  • 11. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary is a line, a plane, or a hyper-plane. Why? P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
  • 12. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary is a line, a plane, or a hyper-plane. Why? P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
  • 13. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary is a line, a plane, or a hyper-plane. Why? P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) 1 (2π) n 2 |Σ| 1 2 exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = 1 (2π) n 2 |Σ| 1 2 exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
  • 14. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary cont’ed: exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
  • 15. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary cont’ed: exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) −1 2 (X − µi )T Σ−1 (X − µi ) + log P(y = i) = −1 2 (X − µj )T Σ−1 (X − µj ) + log P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
  • 16. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary cont’ed: exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) −1 2 (X − µi )T Σ−1 (X − µi ) + log P(y = i) = −1 2 (X − µj )T Σ−1 (X − µj ) + log P(y = j) ⇒ log P(y = i) P(y = j) − 1 2 [XT Σ−1 X − 2XT Σ−1 µi + µT i Σ−1 µi ] + 1 2 [XT Σ−1 X − 2XT Σ−1 µj + µT j Σ−1 µj ] = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
  • 17. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary cont’ed: exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) −1 2 (X − µi )T Σ−1 (X − µi ) + log P(y = i) = −1 2 (X − µj )T Σ−1 (X − µj ) + log P(y = j) ⇒ log P(y = i) P(y = j) − 1 2 [XT Σ−1 X − 2XT Σ−1 µi + µT i Σ−1 µi ] + 1 2 [XT Σ−1 X − 2XT Σ−1 µj + µT j Σ−1 µj ] = 0 ⇒ XT Σ−1 (µi − µj ) | {z } aX + 1 2 µT i Σ−1 µi − 1 2 µT j Σ−1 µj + log P(y = i) P(y = j) | {z } b = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 9 / 22
  • 18. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Analysis of GLDA when Σ = σ2 I Classes are of identical distribution, but different means. Cross-section of classes distribution is spherical. Decision boundary is linear. Called classifier with nearest Euclidean distance to the class mean, when? Σ = 1 0 0 1 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 10 / 22
  • 19. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Analysis of GLDA when Σ is not identity Classes are of identical distribution, but different means. Cross-section of classes distribution is ellipsoidal. Decision boundary is linear. Σ = 0.5 0 0 1.5 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 11 / 22
  • 20. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Analysis of GLDA with arbitrary Σ Classes are of identical distribution, but different means. Cross-section of classes distribution is ellipsoidal. Linear decision boundary. Classes are aligned with direction of covariance eigenvectors. Σ = 0.5 0.2 0.2 1.5 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 12 / 22
  • 21. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Analysis of GLDA with arbitrary Σ Classes are of identical distribution, but different means. Cross-section of classes distribution is ellipsoidal. Linear decision boundary. Classes are aligned with direction of covariance eigenvectors. Σ = 0.5 0.2 0.2 1.5 Called classifier with nearest Mahalanobis distance to the class mean, when? Dist(X, µi) = (X − µi)T Σ−1 (X − µi) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 12 / 22
  • 22. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Another generative learning model; a Bayesian classifier Here, classes are of multinomial distribution, and likelihoods are multivariate Gaussian with separate covariance Σi . Decision boundary becomes non-linear, Why? P(X|y = i) = 1 ( √ 2π)n|Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi )) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 13 / 22
  • 23. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary is parabolic: P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 14 / 22
  • 24. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary is parabolic: P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) 1 (2π) n 2 |Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi ))P(y = i) = 1 (2π) n 2 |Σj | 1 2 exp( −1 2 (X − µj )T Σ−1 j (X − µj ))P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 14 / 22
  • 25. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: 1 |Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi ))P(y = i) = 1 |Σj | 1 2 exp( −1 2 (X − µj )T Σ−1 j (X − µj ))P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
  • 26. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: 1 |Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi ))P(y = i) = 1 |Σj | 1 2 exp( −1 2 (X − µj )T Σ−1 j (X − µj ))P(y = j) −1 2 log|Σi | − 1 2 (X − µi )T Σ−1 i (X − µi ) + logP(y = i) = −1 2 log|Σj | − 1 2 (X − µj )T Σ−1 j (X − µj ) + logP(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
  • 27. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: 1 |Σi | 1 2 exp( −1 2 (X − µi )T Σ−1 i (X − µi ))P(y = i) = 1 |Σj | 1 2 exp( −1 2 (X − µj )T Σ−1 j (X − µj ))P(y = j) −1 2 log|Σi | − 1 2 (X − µi )T Σ−1 i (X − µi ) + logP(y = i) = −1 2 log|Σj | − 1 2 (X − µj )T Σ−1 j (X − µj ) + logP(y = j) ⇒ log P(y = i) P(y = j) − 1 2 log |Σi | |Σj | − 1 2 [XT Σ−1 i X + µT i Σ−1 i µi − 2XT Σ−1 i µi − XT Σ−1 j X − µT j Σ−1 j µj + 2XT Σ−1 j µj ] = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 15 / 22
  • 28. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: log P(y = i) P(y = j) − 1 2 log
  • 29.
  • 30.
  • 31. Σi
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37. Σj
  • 38.
  • 39.
  • 40. − 1 2 [XT (Σ−1 i − Σ−1 j )X + µT i Σ−1 i µi − µT j Σ−1 j µj − 2XT (Σ−1 i µi − Σ−1 j µj )] = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 16 / 22
  • 41. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Quadratic Discriminant Analysis Decision boundary cont’ed: log P(y = i) P(y = j) − 1 2 log
  • 42.
  • 43.
  • 44. Σi
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50. Σj
  • 51.
  • 52.
  • 53. − 1 2 [XT (Σ−1 i − Σ−1 j )X + µT i Σ−1 i µi − µT j Σ−1 j µj − 2XT (Σ−1 i µi − Σ−1 j µj )] = 0 ⇒ XT aX + bT X + c = 0 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 16 / 22
  • 54. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Analysis of QDA when Σ = σ2 i I Classes are of different distributions and different means. Cross-sections of classes distribution are spherical but of different sizes. Σ1 = 1.5 0 0 1.5 , Σ2 = 2 0 0 2 , Σ3 = 1 0 0 1 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 17 / 22
  • 55. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Quadratic Discriminant Analysis Analysis of QDA Analysis of QDA with arbitrary Σi 6= Σj Classes are of different distributions and different means. Cross-sections of classes distribution are ellipsoidal and of different sizes. Σ1 = 1.5 0.1 0.1 0.5 , Σ2 = 1 −0.2 −0.2 2 , Σ3 = 2 −0.25 −0.25 1.5 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 18 / 22
  • 56. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary GLAD and QDA, Another point of view GLAD and QDA, Another point of view Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 19 / 22
  • 57. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Naive Bayes Naive Bayes: An Example Naive Bayes Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 20 / 22
  • 58. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Naive Bayes Naive Bayes: An Example Naive Bayes: An Example Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 21 / 22
  • 59. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Summary Summary Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 22 / 22