Speaker Recognition using Gaussian Mixture Model
GMM: Gaussian mixture models
8/15/2014
Saurab Dulal
IOE, Pulchowk Campus
Introduction to GMM
• Gaussian
“Gaussian is a characteristic symmetric "bell curve" shape that quickly falls off towards 0 (practically)”
• Mixture Model
“mixture model is a probabilistic model which assumes the underlying data to belong to a mixture distribution”
Introduction to GMM
• Mathematical Description of GMM
p(x) = w1 p1(x) + w2 p2(x) + w3 p3(x) + … + wn pn(x)
where p(x) = mixture density
w1, w2, …, wn = mixture weights (mixture coefficients), with each wi ≥ 0 and w1 + w2 + … + wn = 1
pi(x) = component density functions
Fig: Image showing best-fit Gaussian curve
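The weighted sum above can be sketched in a few lines of Python. This is only an illustration for the one-dimensional case: the function names and the two-component parameter values are made up for the example and do not come from the slides.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, weights, means, variances):
    """p(x) = sum_i w_i * p_i(x), with Gaussian component densities p_i."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

# Hypothetical two-component mixture; the weights sum to 1.
weights = [0.3, 0.7]
means = [0.0, 4.0]
variances = [1.0, 2.0]

print(mixture_pdf(1.0, weights, means, variances))
```

Because the weights are nonnegative and sum to 1, the mixture is itself a valid density (it integrates to 1).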
Introduction to GMM
“The most common mixture distribution is the Gaussian (Normal) density function, in which each of the mixture components are Gaussian distributions, each with their own mean and variance parameters.”
p(x) = w1 N(x | µ1, ∑1) + w2 N(x | µ2, ∑2) + … + wn N(x | µn, ∑n)
The µi are the means and the ∑i are the covariance matrices of the individual components (probability density functions).
Fig: Gaussian components G1 to G5 with mixture weights w1 to w5
Fig: Three pairs of plots of p(x) against x, for x from -5 to 10. In each pair the upper panel shows the component densities ("Component 1" and "Component 2"; "Component Models" in the third pair) and the lower panel shows the resulting "Mixture Model".
GMM for Speaker Recognition
Motivation
• The individual Gaussian components can be interpreted as representing some general speaker-dependent spectral shapes
• A Gaussian mixture is capable of modeling arbitrary densities
Description of SR using GMM
• Speech Analysis
• Model Description
• Model Interpretation
• Maximum Likelihood Parameter Estimation
• Speaker Identification
Speech Analysis
• Linear predictive coding (LPC)
• Mel-scale filter bank (to reduce noise)
The analysis ends with the generation of cepstral coefficients x1', x2', x3', …, xn'.
A cepstrum is the result of taking the inverse Fourier transform (IFT) of the logarithm of the estimated spectrum of a signal.
Cosine transform
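The cepstrum definition can be illustrated with a minimal sketch. It computes a plain real cepstrum of one frame and deliberately leaves out the LPC and mel-scale filter-bank stages listed on the slide. The naive O(N^2) transform, the function names, the magnitude floor and the sample frame are all assumptions made for the example; a practical front end would use an FFT library.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (an FFT would be used in practice)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of one frame."""
    spectrum = dft(frame)
    # The floor avoids log(0) when a spectral bin is exactly zero.
    log_mag = [math.log(max(abs(s), floor)) for s in spectrum]
    return [c.real for c in dft(log_mag, inverse=True)]

# Hypothetical 8-sample frame, for illustration only.
frame = [1.0, 0.6, 0.2, -0.1, 0.05, 0.0, 0.0, 0.0]
print(real_cepstrum(frame))
```

One property worth noting: multiplying the frame by a constant gain only shifts the zeroth cepstral coefficient, which is one reason that coefficient is often treated separately.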
2000/05/03 11
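Model Description
Gaussian Mixture Density
$$p(\vec{x} \mid \lambda) = \sum_{i=1}^{M} p_i \, b_i(\vec{x})$$
where $\vec{x}$ is a D-dimensional random vector.
Component Density
$$b_i(\vec{x}) = \frac{1}{(2\pi)^{D/2} \, |\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (\vec{x} - \vec{\mu}_i)' \, \Sigma_i^{-1} \, (\vec{x} - \vec{\mu}_i) \right\}$$
with mean $\vec{\mu}_i$ and covariance matrix $\Sigma_i$.
Speaker Model
$$\lambda = \{ p_i, \vec{\mu}_i, \Sigma_i \}, \quad i = 1, \ldots, M$$
Covariance matrix options: nodal, grand, or global. Nodal, diagonal covariances are used here.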
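As a rough sketch of how the mixture density might be evaluated for the nodal, diagonal case, the code below works in the log domain, where the determinant reduces to a sum of log variances. The function names and the log-sum-exp step are choices made for this example, not something stated on the slides.

```python
import math

def log_component_density(x, mean, var):
    """log b_i(x) for a D-dimensional Gaussian with diagonal covariance."""
    d = len(x)
    log_det = sum(math.log(v) for v in var)
    mahal = sum((xj - mj) ** 2 / vj for xj, mj, vj in zip(x, mean, var))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + mahal)

def log_mixture_density(x, weights, means, variances):
    """log p(x | lambda) = log sum_i p_i b_i(x), using log-sum-exp for stability."""
    terms = [math.log(w) + log_component_density(x, m, v)
             for w, m, v in zip(weights, means, variances)]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

# Hypothetical 2-component model in D = 2 dimensions.
weights = [0.4, 0.6]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [0.5, 2.0]]
print(log_mixture_density([1.0, 1.0], weights, means, variances))
```

Working with logarithms avoids numerical underflow when D is large or when a vector lies far from every component mean.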
Choice of Covariance Matrix
• Nodal Covariance
One covariance matrix per Gaussian component
• Grand Covariance
One covariance matrix for all Gaussian components in a speaker model
• Global Covariance
A single covariance matrix shared by all speaker models
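To make the trade-off between the three choices concrete, the sketch below counts the covariance parameters each one needs. It assumes the reading in which a grand covariance is shared within one speaker model and a global covariance is shared across all speaker models; the function name and the example sizes (10 speakers, 32 components, 12 dimensions) are hypothetical.

```python
def covariance_parameter_count(kind, n_speakers, n_components, dim, diagonal=True):
    """Number of free covariance parameters in the whole speaker set."""
    # A diagonal matrix has dim free entries, a full symmetric one dim*(dim+1)/2.
    per_matrix = dim if diagonal else dim * (dim + 1) // 2
    if kind == "nodal":    # one matrix per Gaussian component
        return n_speakers * n_components * per_matrix
    if kind == "grand":    # one matrix per speaker model
        return n_speakers * per_matrix
    if kind == "global":   # one matrix for all speaker models
        return per_matrix
    raise ValueError("kind must be 'nodal', 'grand' or 'global'")

for kind in ("nodal", "grand", "global"):
    print(kind, covariance_parameter_count(kind, n_speakers=10, n_components=32, dim=12))
```

Fewer parameters are easier to estimate from limited training speech, at the cost of a less flexible model.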
Model Interpretation
• Intuitive notion
Acoustic classes (vowels, nasals, fricatives) reflect some general speaker-dependent vocal tract configurations that are useful for characterizing speaker identity
• A GMM can form smooth approximations to arbitrarily shaped densities
• Besides being smooth, the approximation can capture the multimodal nature of the densities
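ML Parameter Estimation
Steps:
1. Begin with an initial model $\lambda$.
2. Estimate a new model $\bar{\lambda}$ such that
$$p(X \mid \bar{\lambda}) \ge p(X \mid \lambda)$$
where $p(X \mid \cdot)$ is the mixture density evaluated on the training data X.
3. Repeat step 2 until a convergence threshold is reached.
Because the likelihood never decreases from one iteration to the next, the procedure converges toward a (local) maximum likelihood estimate.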
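Mixture Weights
$$\bar{p}_i = \frac{1}{T} \sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda)$$
Means
$$\bar{\vec{\mu}}_i = \frac{\sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda) \, \vec{x}_t}{\sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda)}$$
Variances
$$\bar{\sigma}_i^2 = \frac{\sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda) \, x_t^2}{\sum_{t=1}^{T} p(i \mid \vec{x}_t, \lambda)} - \bar{\mu}_i^2$$
where $\sigma_i^2$, $x_t$ and $\mu_i$ refer to arbitrary elements of the vectors $\vec{\sigma}_i^2$, $\vec{x}_t$ and $\vec{\mu}_i$.
A posteriori probability of component i
$$p(i \mid \vec{x}_t, \lambda) = \frac{p_i \, b_i(\vec{x}_t)}{\sum_{k=1}^{M} p_k \, b_k(\vec{x}_t)}$$
The numerator contains the component density $b_i$; the denominator is the mixture density.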
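The re-estimation formulas can be sketched as one EM iteration for a diagonal-covariance GMM. This is a minimal illustration, not the authors' implementation: the function names, the variance floor, the toy one-dimensional data and the initial model are all assumptions made for the example, and the code does not guard against a component receiving zero total posterior.

```python
import math

def component_density(x, mean, var):
    """b_i(x): Gaussian density with diagonal covariance."""
    log_b = -0.5 * sum(math.log(2.0 * math.pi * v) + (xj - mj) ** 2 / v
                       for xj, mj, v in zip(x, mean, var))
    return math.exp(log_b)

def em_step(data, weights, means, variances, var_floor=1e-6):
    """One EM iteration. Returns the new parameters and log p(X | old model)."""
    T, M, D = len(data), len(weights), len(data[0])
    # E-step: posteriors p(i | x_t, lambda) and the log-likelihood of the data.
    post, log_lik = [], 0.0
    for x in data:
        joint = [w * component_density(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)  # mixture density p(x_t | lambda)
        log_lik += math.log(total)
        post.append([j / total for j in joint])
    # M-step: re-estimate mixture weights, means and variances.
    new_w, new_m, new_v = [], [], []
    for i in range(M):
        n_i = sum(post[t][i] for t in range(T))
        mean = [sum(post[t][i] * data[t][d] for t in range(T)) / n_i
                for d in range(D)]
        var = [max(sum(post[t][i] * data[t][d] ** 2 for t in range(T)) / n_i
                   - mean[d] ** 2, var_floor)
               for d in range(D)]
        new_w.append(n_i / T)
        new_m.append(mean)
        new_v.append(var)
    return new_w, new_m, new_v, log_lik

# Toy one-dimensional data with two clusters (made up for illustration).
data = [[0.1], [-0.2], [0.3], [3.9], [4.2], [4.0]]
w, m, v = [0.5, 0.5], [[0.0], [1.0]], [[1.0], [1.0]]
history = []
for _ in range(10):
    w, m, v, ll = em_step(data, w, m, v)
    history.append(ll)
print(w, m, v)
```

The values collected in `history` should be non-decreasing, which matches the condition required of each new model in the estimation steps.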
Fig: "ANEMIA PATIENTS AND CONTROLS": scatter plot of Red Blood Cell Hemoglobin Concentration (3.7 to 4.4) against Red Blood Cell Volume (3.3 to 4).
Fig: The same axes at EM ITERATION 1, 3, 5, 10, 15 and 25.
Fig: "LOG-LIKELIHOOD AS A FUNCTION OF EM ITERATIONS": Log-Likelihood (400 to 490) against EM Iteration (0 to 25).
Fig: "ANEMIA DATA WITH LABELS": the same scatter plot with the Anemia Group and the Control Group marked.
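Speaker Identification
A group of speakers S = {1, 2, …, S} is represented by GMMs λ1, λ2, …, λS. The objective is to find the speaker model which has the maximum a posteriori probability for a given observation sequence X:
$$\hat{S} = \arg\max_{1 \le k \le S} \Pr(\lambda_k \mid X) = \arg\max_{1 \le k \le S} \frac{p(X \mid \lambda_k) \, \Pr(\lambda_k)}{p(X)}$$
Assuming equally likely speakers, and since p(X) is the same for all speaker models, this reduces to
$$\hat{S} = \arg\max_{1 \le k \le S} p(X \mid \lambda_k)$$
Taking the logarithm and assuming independent observations gives
$$\hat{S} = \arg\max_{1 \le k \le S} \sum_{t=1}^{T} \log p(\vec{x}_t \mid \lambda_k)$$
in which
$$p(\vec{x}_t \mid \lambda_k) = \sum_{i=1}^{M} p_i \, b_i(\vec{x}_t)$$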
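The decision rule can be sketched as follows: score the observation sequence against every speaker model by summing per-frame log-likelihoods, then pick the best-scoring speaker. The model layout (a tuple of weights, means and diagonal variances), the speaker names and all numbers are hypothetical and chosen only to make the example run.

```python
import math

def log_gmm_density(x, model):
    """log p(x | lambda) for a diagonal-covariance GMM (weights, means, variances)."""
    weights, means, variances = model
    terms = []
    for w, mean, var in zip(weights, means, variances):
        log_b = -0.5 * sum(math.log(2.0 * math.pi * v) + (xj - mj) ** 2 / v
                           for xj, mj, v in zip(x, mean, var))
        terms.append(math.log(w) + log_b)
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def identify_speaker(frames, speaker_models):
    """Return the speaker k maximising sum_t log p(x_t | lambda_k), plus all scores."""
    scores = {k: sum(log_gmm_density(x, model) for x in frames)
              for k, model in speaker_models.items()}
    return max(scores, key=scores.get), scores

# Two made-up speaker models with 2 components each, in 2 dimensions.
speaker_models = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[4.0, 4.0], [5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
frames = [[4.2, 3.9], [4.8, 5.1], [4.5, 4.4]]
best, scores = identify_speaker(frames, speaker_models)
print(best)
```

Summing log-likelihoods rather than multiplying likelihoods keeps the score numerically stable for long observation sequences.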
References
• D. A. Reynolds and R. C. Rose, “Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models”, IEEE Trans. on Speech and Audio Processing, vol. 3, no. 1, pp. 72-83, January 1995.
• http://en.wikipedia.org/wiki/Probability_density_function
• http://crsouza.blogspot.com/2010/10/gaussian-mixture-models-and-expectation.html
• https://www.ll.mit.edu/mission/communications/ist/publications/0802_Reynolds_Biometrics-GMM.pdf
• http://statweb.stanford.edu/~tibs/stat315a/LECTURES/em.pdf
• http://eprints.pascal-network.org/archive/00008291/01/SoftAssignReconstr_ICIP2011.pdf
• http://home.deib.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html

Editor's notes

  1. Linear predictive coding (LPC) is a tool used mostly in audio signal processing and speech processing for representing the spectral envelope of a digital speech signal in compressed form, using the information of a linear predictive model. It is one of the most powerful speech analysis techniques and one of the most useful methods for encoding good-quality speech at a low bit rate, and it provides extremely accurate estimates of speech parameters.