Deep Within-Class Covariance
Analysis for Robust Deep Audio
Representation Learning
Hamid Eghbal-zadeh 1,2, Matthias Dorfer 1, Gerhard Widmer 1,2
1 2
Motivation Covariance Analysis WCCN DWCCA Results Summary
Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning
Motivation
● Convolutional Neural Networks (CNNs) learn useful features and build good representations
● CNNs are also known to generalise to unseen data
● Many benchmark datasets have similar train/test distributions
● What happens when the training and test distributions are mismatched?
Distribution mismatch:
The distribution of the data in the training and validation sets differs from that of the test set
● Speaker recognition: training on English, testing on Chinese
● Acoustic scene classification: training on scenes in one country, testing on scenes from another country, recorded in another period of time
Performance of end-to-end CNNs (no mismatch vs mismatched):
● We use the DCASE2016 (no mismatch) and DCASE2017 (mismatched) datasets (1)
● Same training and validation sets, different test set
● We look at several end-to-end CNNs
1) Detection and Classification of Acoustic Scenes and Events, http://dcase.community
Covariance Analysis of the Representation
Covariance Eigenvalue Analysis:
● We train a VGG network on the no-mismatch and mismatched datasets, using spectrograms as input
● We analyse the internal representation of the VGG
● We use covariance analysis
○ Eigenvalues of the covariance matrix
○ Visualisation of the representations projected via PCA
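The two analysis steps listed above can be sketched with NumPy. This is only an illustration under assumptions: the `embeddings` array is random stand-in data rather than real VGG activations, and the function name `covariance_eigen_analysis` is hypothetical, not taken from the talk.

```python
import numpy as np


def covariance_eigen_analysis(features, n_components=2):
    """Return covariance eigenvalues (descending) and a PCA projection.

    features: array of shape (n_samples, n_dims), one embedding per row.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    # Unbiased sample covariance of the embedding dimensions.
    cov = centered.T @ centered / (features.shape[0] - 1)
    # eigh is meant for symmetric matrices; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # PCA projection onto the leading eigenvectors, e.g. for a 2-D scatter plot.
    projected = centered @ eigvecs[:, :n_components]
    return eigvals, projected


# Hypothetical stand-in for network embeddings: 200 samples, 8 dimensions.
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
embeddings = rng.normal(size=(200, 8)) * scales

eigvals, projected = covariance_eigen_analysis(embeddings)
```

Running the same function separately on train, validation and test embeddings gives eigenvalue spectra that can be compared across the three sets.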
Covariance Eigenvalue Analysis:
[Figure: covariance eigenvalues for the train, validation and test sets; panels: no mismatch, mismatched.]
Visualisation of the VGG representations:
[Figure: VGG representations of the train, validation and test sets; panels: no mismatch, mismatched.]
Within-Class Covariance Normalisation (WCCN)
Within-Class Covariance Normalization (1, 2):
● Proposed for speaker recognition to reduce false positives and false negatives
● Used to reduce the within-class variability in features such as GMM supervectors or i-vectors
1) Hatch, Andrew O., et al. "Within-class covariance normalization for SVM-based speaker recognition." Ninth International Conference on Spoken Language Processing. 2006.
2) Hatch, Andrew O., et al. "Generalized linear kernels for one-versus-all classification: application to speaker recognition." 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Proceedings. Vol. 5. IEEE, 2006.
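As a rough illustration of how WCCN reduces within-class variability, the sketch below uses a common formulation: estimate the within-class covariance W, take the Cholesky factor B of its inverse (so that B Bᵀ = W⁻¹), and project the features with B. Several details are assumptions for this example and are not stated on the slides: the unweighted average over classes, the small ridge term `eps`, the toy data, and all function names.

```python
import numpy as np


def within_class_covariance(x, labels):
    """Average of the per-class covariance matrices (classes weighted equally)."""
    classes = np.unique(labels)
    dim = x.shape[1]
    w = np.zeros((dim, dim))
    for c in classes:
        xc = x[labels == c]
        centered = xc - xc.mean(axis=0, keepdims=True)
        w += centered.T @ centered / xc.shape[0]
    return w / classes.size


def wccn_projection(x, labels, eps=1e-6):
    """Return B with B @ B.T equal to the inverse (regularised) within-class covariance."""
    dim = x.shape[1]
    # eps is a hypothetical ridge term that keeps the matrix invertible.
    w = within_class_covariance(x, labels) + eps * np.eye(dim)
    return np.linalg.cholesky(np.linalg.inv(w))


# Toy data: 3 classes, 100 samples each, 4 dimensions with unequal spread.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 100)
class_means = rng.normal(scale=4.0, size=(3, 4))
noise = rng.normal(size=(300, 4)) * np.array([3.0, 1.0, 0.5, 2.0])
x = class_means[labels] + noise

b = wccn_projection(x, labels)
x_wccn = x @ b  # row-vector form of B^T x
w_after = within_class_covariance(x_wccn, labels)
```

After the projection, `w_after` is close to the identity matrix: the within-class scatter has been whitened, so directions with high within-class variance no longer dominate distance computations.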
Deep Within-Class Covariance Analysis
Deep Within-Class Covariance Analysis (DWCCA):
● A deep-learning-compatible version of WCCN
● A statistical deep learning layer, trained end-to-end using SGD with minibatches
● Can be placed anywhere in the network to reduce the within-class variability
● During training, B in the forward pass equals B_b
● Gradients w.r.t. B are computed and used in the backward pass
● A running average is computed for use at test time (similar to batch normalisation)
● Compatible with different supervised tasks (classification, detection, metric learning, ...) and data (raw audio, ...)
● Can be used with different supervised losses (CCE, BCE, L2, ...)
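The forward-pass behaviour described in the bullets above might look roughly like the NumPy sketch below. It is not the authors' implementation, and it makes several assumptions: B_b is read as the WCCN projection estimated on the current minibatch, the running average is kept over the within-class covariance (the slides do not say which quantity is averaged), and `momentum`, `eps` and the class name `DWCCAForwardSketch` are hypothetical. The backward pass is omitted; in an automatic-differentiation framework the gradients would flow through the computation of B.

```python
import numpy as np


def within_class_covariance(x, labels):
    """Average of the per-class covariance matrices (classes weighted equally)."""
    classes = np.unique(labels)
    dim = x.shape[1]
    w = np.zeros((dim, dim))
    for c in classes:
        xc = x[labels == c]
        centered = xc - xc.mean(axis=0, keepdims=True)
        w += centered.T @ centered / xc.shape[0]
    return w / classes.size


class DWCCAForwardSketch:
    """Forward pass only: minibatch statistics in training, running average at test time."""

    def __init__(self, dim, momentum=0.9, eps=1e-4):
        self.momentum = momentum  # hypothetical smoothing factor
        self.eps = eps            # hypothetical ridge term for invertibility
        self.running_w = np.eye(dim)

    def _projection(self, w):
        w_reg = w + self.eps * np.eye(w.shape[0])
        return np.linalg.cholesky(np.linalg.inv(w_reg))

    def forward(self, x, labels=None, training=True):
        if training:
            # Assumption: B_b is estimated from the labelled minibatch.
            w_b = within_class_covariance(x, labels)
            self.running_w = (self.momentum * self.running_w
                              + (1.0 - self.momentum) * w_b)
            b = self._projection(w_b)
        else:
            # No labels are needed at test time.
            b = self._projection(self.running_w)
        return x @ b


# Toy minibatch: 4 classes, 8 samples each, 4-dimensional features.
rng = np.random.default_rng(0)
batch_labels = np.repeat(np.arange(4), 8)
batch = rng.normal(size=(32, 4)) * np.array([2.0, 1.0, 0.5, 3.0])

layer = DWCCAForwardSketch(dim=4)
train_out = layer.forward(batch, batch_labels, training=True)
test_out = layer.forward(batch, training=False)
```

Keeping running statistics mirrors the batch normalisation analogy on the slide: class labels are only required while training, and inference reuses the accumulated estimate.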
Results
Within-Class Covariance Eigenvalue Analysis (Without DWCCA):
[Figure: within-class covariance eigenvalues for the train, validation and test sets; panels: no mismatch, mismatched.]
Within-Class Covariance Eigenvalue Analysis (With DWCCA):
[Figure: within-class covariance eigenvalues for the train, validation and test sets; panels: no mismatch, mismatched.]
Eigenvalue Analysis (With vs without DWCCA):
[Figure: eigenvalues for the train, validation and test sets; panels: no mismatch, mismatched.]
K-NN classification results on VGG representations:
[Figure: k-NN results on the validation and test sets; panels: no mismatch, mismatched.]
End-to-end classification:
[Table: end-to-end classification results; columns: no mismatch, mismatched. Legend: *: single model, single-channel features; further markers: multi-channel features, ensemble of various models.]
End-to-end class-wise F1:
[Figure: class-wise F1 scores; panels: mismatched, no mismatch.]
Summary
Summary:
● We analysed the covariance of the representations in a VGG network
● We showed that the greater the mismatch between training and test data, the greater the within-class variability in the representation
● We proposed Deep Within-Class Covariance Analysis, a deep-learning-compatible layer capable of significantly reducing the within-class variability of a network's representation
● We showed empirically that DWCCA improves generalisation when the training and test sets have mismatched distributions
Thank you for your attention!
Come to the poster for more
discussions.
hamid.eghbal-zadeh@jku.at
heghbalz

Contenu connexe

Similaire à Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning

Similaire à Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning (10)

Stable Diffusion path
Stable Diffusion pathStable Diffusion path
Stable Diffusion path
 
Idiom Token Classification using Sentential Distributed Semantics (Giancarlo ...
Idiom Token Classification using Sentential Distributed Semantics (Giancarlo ...Idiom Token Classification using Sentential Distributed Semantics (Giancarlo ...
Idiom Token Classification using Sentential Distributed Semantics (Giancarlo ...
 
Audio and Vision (D2L9 Insight@DCU Machine Learning Workshop 2017)
Audio and Vision (D2L9 Insight@DCU Machine Learning Workshop 2017)Audio and Vision (D2L9 Insight@DCU Machine Learning Workshop 2017)
Audio and Vision (D2L9 Insight@DCU Machine Learning Workshop 2017)
 
Introduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detectionIntroduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detection
 
Audio and Vision (D4L6 2017 UPC Deep Learning for Computer Vision)
Audio and Vision (D4L6 2017 UPC Deep Learning for Computer Vision)Audio and Vision (D4L6 2017 UPC Deep Learning for Computer Vision)
Audio and Vision (D4L6 2017 UPC Deep Learning for Computer Vision)
 
Fudan-NJUST at MediaEval 2014: Violent Scenes Detection Using Deep Neural Net...
Fudan-NJUST at MediaEval 2014: Violent Scenes Detection Using Deep Neural Net...Fudan-NJUST at MediaEval 2014: Violent Scenes Detection Using Deep Neural Net...
Fudan-NJUST at MediaEval 2014: Violent Scenes Detection Using Deep Neural Net...
 
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
Tiancheng Zhao - 2017 -  Learning Discourse-level Diversity for Neural Dialog...Tiancheng Zhao - 2017 -  Learning Discourse-level Diversity for Neural Dialog...
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
 
EMNLP 2014: Opinion Mining with Deep Recurrent Neural Network
EMNLP 2014: Opinion Mining with Deep Recurrent Neural NetworkEMNLP 2014: Opinion Mining with Deep Recurrent Neural Network
EMNLP 2014: Opinion Mining with Deep Recurrent Neural Network
 
From Semantics to Self-supervised Learning for Speech and Beyond
From Semantics to Self-supervised Learning for Speech and BeyondFrom Semantics to Self-supervised Learning for Speech and Beyond
From Semantics to Self-supervised Learning for Speech and Beyond
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 

Dernier

GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
Lokesh Kothari
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Sérgio Sacani
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
Sérgio Sacani
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Sérgio Sacani
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
RizalinePalanog2
 

Dernier (20)

Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 

Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning

  • 1.
  • 2. Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Hamid Eghbal-zadeh 1,2 , Matthias Dorfer 1 , Gerhard Widmer 1,2 1 2
  • 3. Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Hamid Eghbal-zadeh 1,2 , Matthias Dorfer 1 , Gerhard Widmer 1,2 1 2
  • 4. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Motivation
  • 5. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning ● Convolutional Neural Networks learn useful features and build good representations
  • 6. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning ● Convolutional Neural Networks learn useful features and build good representations ● CNNs are also known to generalize on the unseen data
  • 7. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning ● Convolutional Neural Networks learn useful features and build good representations ● CNNs are also known to generalize on the unseen data ● Many of the benchmark datasets have similar train/test distributions
  • 8. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning ● Convolutional Neural Networks learn useful features and build good representations ● CNNs are also known to generalize on the unseen data ● Many of the benchmark datasets have similar train/test distributions ● How about a distribution mismatch between training and test?
  • 9. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Distribution mismatch: When the distribution of the data in training and validation sets differ from the test set
  • 10. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Distribution mismatch: When the distribution of the data in training and validation sets differ from the test set ● Speaker Recognition: Training on English, testing on Chinese
  • 11. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Distribution mismatch: When the distribution of the data in training and validation sets differ from the test set ● Speaker Recognition: Training on English, testing on Chinese ● Acoustic Scene Classification: Training on Scenes in one country, testing on scenes of another country, in another period of time
  • 12. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Distribution mismatch: When the distribution of the data in training and validation sets differ from the test set ● Speaker Recognition: Training on English, testing on Chinese ● Acoustic Scene Classification: Training on Scenes in one country, testing on scenes of another country, in another period of time
  • 13. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Performance of end-to-end CNNs (no mismatch vs mismatched): ● We use DCASE2016 (no mismatch) and DCASE2017 (mismatched) datasets1 ● Same training and validation, different test set ● Look at several end-to-end CNNs 1) Detection and Classification of Acoustic Scenes and Events, http://dcase.community
  • 14. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Covariance Analysis of the representation
  • 15. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Covariance Eigenvalue Analysis: ● We train a VGG network on No mismatch and Mismatched using spectrograms
  • 16. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Covariance Eigenvalue Analysis: ● We train a VGG network on No mismatch and Mismatched using spectrograms ● We analyse the internal representation of the VGG
  • 17. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Covariance Eigenvalue Analysis: ● We train a VGG network on No mismatch and Mismatched using spectrograms ● We analyse the internal representation of the VGG ● We use covariance analysis ○ Eigen-values of the covariances matrix ○ Visualisation of the representations projected via PCA
  • 18. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Nomismatch Covariance Eigenvalue Analysis: Train Test Mismatched Validation
  • 19. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning NomismatchVisualisation of the VGG representations: Train Validation Test Mismatched
  • 20. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Within-Class Covariance Normalisation (WCCN)
  • 21. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Within-Class Covariance Normalization1,2 : ● Proposed for Speaker Recognition to reduce the false positive/negatives 1) Hatch, Andrew O., et al. "Within-class covariance normalization for SVM-based speaker recognition." Ninth international conference on spoken language processing. 2006. 2) Hatch, Andrew O., et al. "Generalized linear kernels for one-versus-all classification: application to speaker recognition." Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on. Vol. 5. IEEE, 2006.
  • 22. Within-Class Covariance Normalization [1,2]:
      ● Proposed for speaker recognition to reduce false positives and false negatives
      ● Used to reduce the within-class variability in features such as GMM supervectors or i-vectors
      [1] Hatch, Andrew O., et al. "Within-class covariance normalization for SVM-based speaker recognition." Ninth International Conference on Spoken Language Processing, 2006.
      [2] Hatch, Andrew O., et al. "Generalized linear kernels for one-versus-all classification: application to speaker recognition." 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006), Vol. 5. IEEE, 2006.
  • 23. Within-Class Covariance Normalization [1,2]: [Slide content not preserved in the transcript; references [1,2] as on slide 22]
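For orientation, WCCN as commonly formulated following Hatch et al. estimates the within-class covariance W = (1/C) Σ_c (1/n_c) Σ_i (x_i − μ_c)(x_i − μ_c)^T, finds B with B B^T = W^{-1} via a Cholesky decomposition, and projects each feature vector as B^T x. The NumPy sketch below illustrates this under stated assumptions: the function names are hypothetical, the small ridge term `eps` is added only for numerical stability, and the data are synthetic.

```python
import numpy as np


def within_class_covariance(X, y):
    """Average over classes of the per-class covariance of X (n_samples, n_dims)."""
    classes = np.unique(y)
    d = X.shape[1]
    W = np.zeros((d, d))
    for c in classes:
        Xc = X[y == c]
        centered = Xc - Xc.mean(axis=0)
        W += centered.T @ centered / len(Xc)
    return W / len(classes)


def wccn_projection(X, y, eps=1e-6):
    """Return B with B @ B.T == inv(W); eps is an assumed stabilising ridge."""
    W = within_class_covariance(X, y)
    W = W + eps * np.eye(W.shape[0])
    return np.linalg.cholesky(np.linalg.inv(W))  # lower-triangular factor


# Synthetic stand-in data (assumption): 3 classes, 4-dimensional features.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 100)
means = rng.normal(scale=5.0, size=(3, 4))
X = means[y] + rng.normal(size=(300, 4)) * np.array([4.0, 2.0, 1.0, 0.5])

B = wccn_projection(X, y)
X_wccn = X @ B                      # row-wise equivalent of B^T x
W_after = within_class_covariance(X_wccn, y)
```

After the projection, `W_after` is close to the identity matrix, which is the sense in which WCCN normalises within-class variability.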
  • 24. Deep Within-Class Covariance Analysis
  • 32. Deep Within-Class Covariance Analysis (DWCCA):
      ● A deep-learning-compatible version of WCCN
      ● A statistical DL layer, trained end-to-end using SGD with minibatches
      ● Can be placed anywhere in a network to reduce the within-class variability
      ● During training, the projection matrix B used in the forward pass is the minibatch estimate B_b
      ● Gradients with respect to B are computed and used in the backward pass
      ● A running average is computed for test time (similar to batch normalisation)
      ● Compatible with different supervised tasks (classification, detection, metric learning, ...) and data (raw audio, ...)
      ● Can be used with different supervised losses (CCE, BCE, L2, ...)
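The train/test behaviour listed above can be illustrated with a forward-only NumPy sketch. It is not the authors' implementation: the class name `DWCCASketch`, the `momentum` and `eps` parameters, and the choice to keep the running average over the projection matrix itself are all assumptions made for illustration. Gradients through B are omitted here; in practice they would come from an automatic-differentiation framework.

```python
import numpy as np


class DWCCASketch:
    """Hypothetical forward-only sketch of a DWCCA-style layer."""

    def __init__(self, dim, momentum=0.9, eps=1e-3):
        self.momentum = momentum          # assumed running-average factor
        self.eps = eps                    # assumed ridge for numerical stability
        self.running_B = np.eye(dim)      # used at test time

    def _batch_projection(self, H, y):
        """Minibatch estimate B_b with B_b @ B_b.T == inv(W_b)."""
        d = H.shape[1]
        classes = np.unique(y)
        W = np.zeros((d, d))
        for c in classes:
            Hc = H[y == c]
            centered = Hc - Hc.mean(axis=0)
            W += centered.T @ centered / len(Hc)
        W = W / len(classes) + self.eps * np.eye(d)
        return np.linalg.cholesky(np.linalg.inv(W))

    def forward(self, H, y=None, training=False):
        if training:
            # Training: project with the minibatch estimate and update the average.
            B = self._batch_projection(H, y)
            self.running_B = (self.momentum * self.running_B
                              + (1.0 - self.momentum) * B)
        else:
            # Test time: labels are unavailable, so use the running average.
            B = self.running_B
        return H @ B


# Synthetic usage (assumption): 3 classes, 4-dimensional hidden representation.
rng = np.random.default_rng(0)
layer = DWCCASketch(dim=4)
for _ in range(50):
    y = rng.integers(0, 3, size=64)
    H = rng.normal(size=(64, 4)) * np.array([4.0, 2.0, 1.0, 0.5]) + y[:, None]
    out_train = layer.forward(H, y, training=True)

H_test = rng.normal(size=(10, 4))
out_test = layer.forward(H_test, training=False)
```

As with batch normalisation, the layer needs labels and batch statistics only during training; at test time it applies a fixed linear map.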
  • 33. Results
  • 34. Within-Class Covariance Eigenvalue Analysis (Without DWCCA): [Figure. Labels: No mismatch, Mismatched; Train, Validation, Test]
  • 35. Within-Class Covariance Eigenvalue Analysis (With DWCCA): [Figure. Labels: No mismatch, Mismatched; Train, Validation, Test]
  • 36. Eigenvalue Analysis (With vs without DWCCA): [Figure. Labels: No mismatch, Mismatched; Train, Validation, Test]
  • 37. K-NN classification results on VGG representations: [Figure. Labels: No mismatch, Mismatched; Validation, Test]
  • 38. End-to-end classification: [Table. Columns: No mismatch, Mismatched. Legend: * = single model, single-channel features; (marker lost in extraction) = multi-channel features; (marker lost in extraction) = ensemble of various models]
  • 43. End-to-end class-wise F1: [Figure. Labels: No mismatch, Mismatched]
  • 45. Summary
  • 49. Summary:
      ● We analysed the covariance of the representations in a VGG network
      ● We showed that the greater the mismatch between training and test data, the greater the within-class variability in the representation
      ● We proposed Deep Within-Class Covariance Analysis, a deep-learning-compatible layer capable of significantly reducing the within-class variability of a network's representation
      ● We empirically showed that DWCCA improves generalisation when the training and test sets have mismatched distributions
  • 50. Thank you for your attention! Come to the poster for more discussion. hamid.eghbal-zadeh@jku.at heghbalz