"MACHINE LISTENING:
ARTIFICIAL INTELLIGENCE
FOR SOUND
AND MUSIC"
COLLOQUE IMT - L’INTELLIGENCE
ARTIFICIELLE AU COEUR DES
MUTATIONS INDUSTRIELLES.
APRIL 4TH, 2019
Gaël RICHARD
Professor, Head of the Image, Data, Signal department
09/04/2019
MACHINE LISTENING?
A well-established domain for speech …
Speech Recognition
« What a great workshop! »
Speech Emotion Recognition
« The person who’s speaking is happy but anxious »
Speaker Recognition
« Gaël Richard is speaking »
Speech Translation
« Quel excellent colloque ! » → « What a great workshop! »
MACHINE LISTENING FOR AUDIO AND MUSIC
Goal: to extract « information » from the audio signal
Music recognition (or Audio ID): identifying the music recording
Music information retrieval: music transcription, music similarity, music genre recognition, musical instrument recognition, music recommendation, autotagging, …
Audio scene recognition: sound recorded in streets, in subway stations, in offices, …
Audio source separation: demixing music, singing voice extraction, source localisation, …
Audio event recognition: speech, music, car noises, birds, …
Audio sound capture: localisation, dereverberation, denoising, …
Music emotion recognition: sad vs. happy vs. dance music, …
Bioacoustics: wildlife monitoring, biodiversity, species identification, …
SOME APPLICATIONS …
Music streaming, music recommendation
Music education
Music identification, audio fingerprinting
Karaoke, speech-to-rap conversion
Sound recognition, smart homes, smart hearables
bmat
Audeering
Music branding strategy, music supervision (advertising, films)
Bioacoustics
Vocal separation, music separation
SOURCE SEPARATION
SOURCE SEPARATION FOR REMIXING
SOURCE SEPARATION
(LEGLAIVE ET AL.)
Model for Audio sources
Model for reverberation
AN EXAMPLE OF A MODEL FOR AUDIO SOURCES
Non-negative matrix factorization (NMF)
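As an illustration of the model named on this slide, here is a minimal NumPy sketch of NMF applied to a magnitude spectrogram. It assumes the Euclidean cost with the classic multiplicative updates; the slides do not say which cost or algorithm was used. The function name `nmf`, the rank and the toy matrix are all hypothetical. The last lines show how the factors could give soft masks for separation.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (freq x time) as W @ H.

    W (freq x rank) holds spectral templates, H (rank x time) their
    activations. Multiplicative updates for the Euclidean cost keep
    both factors non-negative.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "spectrogram" (assumption): two spectral templates, active at different times.
templates = np.array([[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.0, 0.8]])
activations = np.array([[1.0, 1.0, 0.0, 0.0, 0.5],
                        [0.0, 0.0, 1.0, 1.0, 0.5]])
V = templates @ activations

W, H = nmf(V, rank=2)
V_hat = W @ H
rel_error = np.linalg.norm(V - V_hat) / np.linalg.norm(V)

# Soft (Wiener-like) masks: each component's share of the model, applied to V.
components = np.stack([np.outer(W[:, k], H[k]) for k in range(2)])
sources = components / (V_hat + 1e-9) * V
```

On a real recording, `V` would be the magnitude of a short-time Fourier transform, and each masked spectrogram in `sources` would be turned back into a waveform using the phase of the mixture.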
SOURCE SEPARATION - S. LEGLAIVE
(EXAMPLE REMIXED BY RADIO FRANCE)
S. Leglaive, R. Badeau, G. Richard, "Multichannel Audio Source Separation with Probabilistic Reverberation Priors", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, no. 12, December 2016.
S. Leglaive, R. Badeau, G. Richard, "Separating Time-Frequency Sources from Time-Domain Convolutive Mixtures Using Non-negative Matrix Factorization", WASPAA, October 2017, New Paltz, US.
AUDIO IDENTIFICATION OR AUDIO ID
Audio ID = retrieving high-level metadata for a music recording
Challenges:
Robustness in adverse conditions (distortion, noise, …)
Scaling to “big data” (databases of millions of titles)
Speed / real-time operation
Product example:
Audio identification → information about the recording (title, artist, etc.)
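To make the identification step concrete, here is a minimal sketch of a landmark-style fingerprint: pairs of spectrogram peaks are hashed, and a query is matched by voting on a consistent time offset. This is a generic illustration. It is not necessarily the method behind the product on the slide, nor the method of Fenet et al. The peaks are assumed to be already extracted as (frame, frequency bin) pairs, and every name and value below is hypothetical.

```python
from collections import Counter, defaultdict

def landmarks(peaks, fan_out=3, max_dt=20):
    """Turn spectrogram peaks [(frame, freq_bin), ...] into hashes.

    Each hash pairs an anchor peak with a later peak:
    (f_anchor, f_target, dt), stored with the anchor frame.
    """
    peaks = sorted(peaks)
    out = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt > 0:
                out.append(((f1, f2, dt), t1))
                paired += 1
                if paired == fan_out:
                    break
    return out

def build_index(database):
    """database: {title: peaks}. Returns hash -> [(title, frame), ...]."""
    index = defaultdict(list)
    for title, peaks in database.items():
        for h, t in landmarks(peaks):
            index[h].append((title, t))
    return index

def identify(index, query_peaks):
    """Vote for (title, time offset); a true match piles up on one offset."""
    votes = Counter()
    for h, t_query in landmarks(query_peaks):
        for title, t_ref in index.get(h, []):
            votes[(title, t_ref - t_query)] += 1
    if not votes:
        return None, None, 0
    (title, offset), score = votes.most_common(1)[0]
    return title, offset, score

database = {
    "song_a": [(0, 10), (3, 40), (5, 22), (9, 31), (12, 10), (15, 47)],
    "song_b": [(0, 15), (2, 33), (6, 8), (8, 29), (11, 41), (14, 19)],
}
index = build_index(database)

# Query: an excerpt of song_a starting at frame 3, plus one spurious noise peak.
query = [(0, 40), (2, 22), (6, 31), (7, 99), (9, 10), (12, 47)]
title, offset, score = identify(index, query)
print(title, offset, score)  # song_a 3 7
```

Because the votes are keyed by time offset, the spurious peak in the query adds no vote to the winning bin, which is what gives this family of methods some robustness to noise. A lookup costs one dictionary access per hash, which helps with scale and speed.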
REAL-TIME AUDIO IDENTIFICATION
(FENET ET AL.)
Recognition of audio recordings
• Identical
• Approximate (live vs. studio)
• Real-time demonstrator
• For music recommendation, second-screen applications, …
S. Fenet, Y. Grenier, G. Richard, "An Extended Audio Fingerprint Method with Capabilities for Similar Music Detection", ISMIR 2013, pp. 569-574.
SOUND EVENTS AND ACOUSTIC SCENE RECOGNITION
SOME APPROACHES
AN EXAMPLE OF A RECENT APPROACH
V. Bisot, R. Serizel, S. Essid, G. Richard, "Feature Learning with Matrix Factorization Applied to Acoustic Scene Classification", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, Special Issue on Sound Scene and Event Analysis.
EXAMPLE FOR SCENE CLASSIFICATION
UNSUPERVISED NMF FOR ACOUSTIC SCENE RECOGNITION
EXAMPLE WITH DNN: ACOUSTIC SCENE RECOGNITION
V. Bisot et al., "Feature Learning with Matrix Factorization Applied to Acoustic Scene Classification", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017.
V. Bisot et al., "Leveraging deep neural networks with nonnegative representations for improved environmental sound classification", IEEE International Workshop on Machine Learning for Signal Processing (MLSP), September 2017, Tokyo.
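The idea of unsupervised NMF features for scene recognition can be sketched as follows. A dictionary is learned without labels from training spectrograms. Each clip is then described by its time-averaged activations on that dictionary, and a classifier is trained on those features. This is a toy illustration under stated assumptions: plain Euclidean NMF, synthetic six-bin "spectrograms", and a nearest-centroid classifier chosen for brevity. The slides do not describe the NMF variants, features or classifiers actually used in the cited papers, and all names below are hypothetical.

```python
import numpy as np

def learn_dictionary(V, rank, n_iter=300, eps=1e-9, seed=0):
    """Unsupervised NMF, V (freq x frames) ~ W @ H; returns the dictionary W."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W / (np.linalg.norm(W, axis=0, keepdims=True) + eps)

def clip_features(V, W, n_iter=100, eps=1e-9):
    """Activations of a clip on the fixed dictionary W, averaged over time."""
    H = np.ones((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H.mean(axis=1)

def make_clip(template, rng, n_frames=20):
    """Synthetic clip: one spectral template with random gains plus noise."""
    gains = rng.uniform(0.5, 1.5, size=n_frames)
    noise = 0.01 * rng.random((template.size, n_frames))
    return np.outer(template, gains) + noise

rng = np.random.default_rng(1)
street = np.array([1.0, 0.8, 0.3, 0.0, 0.0, 0.0])  # toy class 0: low-frequency energy
office = np.array([0.0, 0.0, 0.0, 0.4, 0.9, 1.0])  # toy class 1: high-frequency energy
train = [(make_clip(street, rng), 0) for _ in range(3)]
train += [(make_clip(office, rng), 1) for _ in range(3)]
test = [(make_clip(street, rng), 0), (make_clip(office, rng), 1)]

# 1) Learn the dictionary on all training frames, ignoring the labels.
W = learn_dictionary(np.hstack([clip for clip, _ in train]), rank=2)

# 2) Describe each training clip by its averaged activations.
X_train = np.array([clip_features(clip, W) for clip, _ in train])
y_train = np.array([label for _, label in train])

# 3) Nearest-centroid classifier on the learned features.
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def predict(clip):
    distances = np.linalg.norm(centroids - clip_features(clip, W), axis=1)
    return int(np.argmin(distances))

predictions = [predict(clip) for clip, _ in test]
```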
TYPICAL PERFORMANCE OF ACOUSTIC SCENE RECOGNITION (DCASE 2016 CHALLENGE)
A. Mesaros et al., "Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 26, no. 2, pp. 379-393.
MUSIC INFORMATION RETRIEVAL
Major topics:
• Music transcription (multiple F0 estimation, beat/downbeat detection, instrument classification, …)
• Music recommendation
• Source separation
• Multimodal music processing
MIR: AN EXAMPLE WITH DOWNBEAT ESTIMATION
(DURAND ET AL., 2017)
S. Durand et al., "Robust Downbeat Tracking Using an Ensemble of Convolutional Networks", IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 25, no. 1, 2017.
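One simple way to picture an ensemble for downbeat estimation is sketched below. Each network outputs a downbeat likelihood per tatum, the likelihoods are averaged, and a bar phase is chosen from the fused curve. The fixed number of tatums per bar is a deliberate simplification for illustration. The slides do not describe the fusion or decoding actually used, and the function names and numbers are hypothetical.

```python
def fuse(network_outputs):
    """Average per-tatum downbeat likelihoods across networks."""
    n = len(network_outputs)
    return [sum(values) / n for values in zip(*network_outputs)]

def decode_downbeats(likelihood, tatums_per_bar):
    """Pick the bar phase with the highest mean likelihood (fixed meter)."""
    def phase_score(phase):
        picked = likelihood[phase::tatums_per_bar]
        return sum(picked) / len(picked) if picked else 0.0
    best = max(range(tatums_per_bar), key=phase_score)
    return list(range(best, len(likelihood), tatums_per_bar))

# Hypothetical outputs of three networks over 12 tatums.
outputs = [
    [0.1, 0.2, 0.9, 0.1, 0.3, 0.1, 0.8, 0.2, 0.1, 0.1, 0.7, 0.3],  # network 1
    [0.2, 0.1, 0.6, 0.2, 0.1, 0.5, 0.7, 0.1, 0.2, 0.1, 0.4, 0.1],  # network 2
    [0.1, 0.6, 0.7, 0.1, 0.2, 0.1, 0.9, 0.1, 0.1, 0.2, 0.8, 0.2],  # network 3
]
fused = fuse(outputs)
downbeats = decode_downbeats(fused, tatums_per_bar=4)
print(downbeats)  # [2, 6, 10]
```

Averaging lets a network that is confused on one passage be outvoted by the others. In the toy data, network 3 has a spurious peak at tatum 1 and network 2 at tatum 5, and neither survives the fusion.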
DOWNBEAT ESTIMATION: DEMO
Examples at the output of each network
https://simondurand.github.io/dnn_audio.html
Other audio examples
JBB (Downbeat)
JBB (Tatum)
Example (Downbeat)
Example (Tatum)
MUSIC INFORMATION RETRIEVAL
Some Current trends:
• Context-aware music recommendation, with [partner logo]
• Text-informed lead vocal extraction, with [partner logo]
• EEG-driven music processing, with [partner logo] (ex Technicolor)
• Context-driven style transfer, with [partner logo] (ex Technicolor)
• Conditional audio generation, with [partner logo]
5 new PhD grants within ITN-MIP-Frontiers projects


  • 1. "MACHINE LISTENING: ARTIFICIAL INTELLIGENCE FOR SOUNDS AND MUSIC" IMT COLLOQUIUM - ARTIFICIAL INTELLIGENCE AT THE HEART OF INDUSTRIAL TRANSFORMATIONS. APRIL 4TH, 2019 Gaël RICHARD Professor, Head of the Image, Data, Signal department
  • 2. MACHINE LISTENING? A well-established domain for speech … Speech Recognition: « What a great workshop! » Speech Emotion Recognition: « The person who's speaking is happy but anxious » Speaker Recognition: « Gaël Richard is speaking » Speech Translation: « Quel excellent colloque ! » « What a great workshop! »
  • 3. MACHINE LISTENING FOR AUDIO AND MUSIC Goal: to extract « information » from the audio signal. Music recognition (or Audio ID): identifying the music recording. Music information retrieval: music transcription, music similarity, music genre recognition, musical instrument recognition, music recommendation, autotagging, … Audio scene recognition: sound recorded in streets, in a subway station, in an office, … Audio source separation: demixing music, singing voice extraction, source localisation, … Audio event recognition: speech, music, car noises, birds, … Audio sound capture: localisation, dereverberation, denoising, … Music emotion recognition: sad vs. happy vs. dance music, … Bioacoustics: wildlife monitoring, biodiversity, species ID, …
  • 4. SOME APPLICATIONS … Music streaming, music recommendation; Music education; Music identification, audio fingerprint; Karaoke, speech-to-rap conversion; Sound recognition, smart homes, smart hearables; bmat; Audeering; Music branding strategy, music supervision (advertising, films); Bioacoustics; Vocal separation, music separation
  • 6. SOURCE SEPARATION FOR REMIXING
  • 7. SOURCE SEPARATION (LEGLAIVE & AL.) Model for audio sources; model for reverberation
  • 8. AN EXAMPLE OF MODEL FOR AUDIO SOURCES Non-negative matrix factorization
  • 9. SOURCE SEPARATION - S. LEGLAIVE (EXAMPLE REMIXED BY RADIO FRANCE) S. Leglaive, R. Badeau, G. Richard, "Multichannel Audio Source Separation with Probabilistic Reverberation Priors", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 24, No. 12, December 2016. S. Leglaive, R. Badeau, G. Richard, "Separating Time-Frequency Sources from Time-Domain Convolutive Mixtures Using Non-negative Matrix Factorization", WASPAA, Oct. 2017, New Paltz, US.
  • 10. AUDIO IDENTIFICATION OR AUDIOID Audio ID = find high-level metadata from a music recording. Challenges: efficiency in adverse conditions (distortion, noise, …); scaling to "big data" (databases > millions of titles); speed / real time. Product example: Audio identification; information about the recording (title, artist, etc.)
  • 11. REAL-TIME AUDIO IDENTIFICATION (FENET & AL.) Audio recordings recognition • Identical • Approximate (live vs. studio) • Real-time demonstrator • For music recommendation, second-screen applications, … S. Fenet, Y. Grenier, G. Richard, "An Extended Audio Fingerprint Method with Capabilities for Similar Music Detection", ISMIR 2013, pp. 569-574.
  • 12. SOUND EVENTS AND ACOUSTIC SCENE RECOGNITION
  • 14. AN EXAMPLE OF A RECENT APPROACH V. Bisot, R. Serizel, S. Essid, G. Richard, "Feature Learning with Matrix Factorization Applied to Acoustic Scene Classification", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, Special Issue on Sound Scene and Event Analysis.
  • 15. EXAMPLE FOR SCENE CLASSIFICATION
  • 16. UNSUPERVISED NMF FOR ACOUSTIC SCENE RECOGNITION
  • 17. UNSUPERVISED NMF FOR ACOUSTIC SCENE RECOGNITION
  • 18. EXAMPLE WITH DNN: ACOUSTIC SCENE RECOGNITION V. Bisot & al., "Feature Learning with Matrix Factorization Applied to Acoustic Scene Classification", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017. V. Bisot & al., "Leveraging deep neural networks with nonnegative representations for improved environmental sound classification", IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Sep. 2017, Tokyo.
  • 19. TYPICAL PERFORMANCES OF ACOUSTIC SCENE RECOGNITION (CHALLENGE DCASE 2016) A. Mesaros & al., "Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 challenge", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26 (2), 379-393.
  • 20. MUSIC INFORMATION RETRIEVAL Major topics: • Music transcription (multiple F0 estimation, beat/downbeat detection, instrument classification, …) • Music recommendation • Source separation • Multimodal music processing
  • 21. MIR: AN EXAMPLE WITH DOWNBEAT ESTIMATION (DURAND & AL. 2017)
  • 22. MIR: AN EXAMPLE WITH DOWNBEAT ESTIMATION (DURAND & AL. 2017) S. Durand & al., "Robust Downbeat Tracking Using an Ensemble of Convolutional Networks", IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 25, No. 1, 2017.
  • 23. DOWNBEAT ESTIMATION: DEMO Examples at the output of each network: https://simondurand.github.io/dnn_audio.html Other audio examples: JBB (Downbeat), JBB (Tatum), Example (Downbeat), Example (Tatum)
  • 24. MUSIC INFORMATION RETRIEVAL Some current trends: Context-aware music recommendation with Text-Informed Lead Vocal Extraction with EEG-driven music processing With (ex Technicolor) Context-driven Style transfer With (ex Technicolor) Conditional audio generation With 5 new PhD grants within ITN-MIP-Frontiers projects
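
Slide 8 names non-negative matrix factorization (NMF) as an example model for audio sources. As a rough illustration only, and not the model used in the cited papers, the sketch below factors a small nonnegative "spectrogram" V into spectral templates W and activations H with the standard multiplicative updates for the Euclidean cost. The toy matrix, the rank, the iteration count, and all function names are assumptions chosen for the example.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate nonnegative V (F x N) by W (F x rank) times H (rank x N)
    using multiplicative updates for the squared Euclidean cost."""
    rng = random.Random(seed)
    n_freq, n_frames = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n_freq)]
    H = [[rng.random() + eps for _ in range(n_frames)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

def frobenius_error(V, W, H):
    """Frobenius norm of the residual V - W H."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for vr, xr in zip(V, WH) for v, x in zip(vr, xr)) ** 0.5

# Toy data (hypothetical): two sources, each with its own spectral template.
W_true = [[1, 0], [1, 0], [0, 1], [0, 1]]   # 4 frequency bins x 2 sources
H_true = [[1, 0, 1, 0, 1], [0, 1, 1, 0, 2]]  # 2 sources x 5 time frames
V = matmul(W_true, H_true)

W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

In a separation setting, each column of W would play the role of a spectral pattern and each row of H its activation over time; real systems work on magnitude or power spectrograms and often use other cost functions.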

Editor's notes

  1. Audio identification consists of retrieving the metadata linked to an audio excerpt. For music, for example, you might want to retrieve the name of the song you're listening to, the artist, and so on.

     Audio identification must work despite the distortions introduced by the transmission channel. Suppose you are acquiring a signal from a radio broadcast:
     First, the radio station has probably modified the audio signal of the song: they may have changed the equalization (to add more bass, for example), changed the stereo image (to better suit listening in a car), or even time-stretched the song so that it lasts longer or shorter (to fit their schedule).
     Second, the signal is transmitted: it is encoded, sent through the radio transmitters, decoded, then converted into an analog signal that you can finally listen to.
     Third, you acquire the signal through a microphone, so it is re-encoded in digital form with the characteristics of your microphone (which can be quite poor if you are capturing with a mobile phone, for example). You may also have surrounding noise.
     All these steps make the captured signal quite different from the original one, so the audio ID system has to take them into account.

     Another point: you generally want to identify any song you're listening to. This implies that the audio ID system must handle a large number of references. For example, in Quaero, we target a reference database containing 100,000 songs.

     The last point, which pulls against the previous one, is that the algorithm must be fast, even with a large number of references. The reason is simple: most applications require real-time responses. If you're delivering a music recognition service to individuals, you don't want users to get the name of the song they are listening to one day later. If you're doing broadcast monitoring (that is, checking every song broadcast on a radio or TV channel), you also need real-time processing.

     So, we have seen audio identification; let's now look at audio fingerprinting.
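
The two requirements in the note above, tolerance to distortion and fast lookup over many references, can be illustrated with a generic landmark-style fingerprint: pairs of spectral peaks are hashed, stored in an inverted index, and a query is identified by voting on a consistent time offset. This is a minimal sketch under stated assumptions, not the method of Fenet & al. or of any product mentioned in the slides. Peak extraction is assumed to have been done already, and the peak lists, song names, and parameters (`fan_out`, `max_dt`) are hypothetical.

```python
from collections import Counter, defaultdict

def landmark_hashes(peaks, fan_out=3, max_dt=10):
    """Turn (frame, freq_bin) spectral peaks into (hash, anchor_frame) pairs.
    Each hash pairs an anchor peak with up to fan_out later peaks: (f1, f2, dt)."""
    peaks = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt > 0:
                hashes.append(((f1, f2, dt), t1))
                paired += 1
                if paired == fan_out:
                    break
    return hashes

def build_index(references):
    """Inverted index: hash -> list of (song_id, anchor_frame)."""
    index = defaultdict(list)
    for song_id, peaks in references.items():
        for h, t in landmark_hashes(peaks):
            index[h].append((song_id, t))
    return index

def identify(index, query_peaks):
    """Vote for (song_id, time offset); a true match piles votes on one offset."""
    votes = Counter()
    for h, t_query in landmark_hashes(query_peaks):
        for song_id, t_ref in index.get(h, ()):
            votes[(song_id, t_ref - t_query)] += 1
    if not votes:
        return None
    (song_id, offset), count = votes.most_common(1)[0]
    return song_id, offset, count

# Hypothetical reference peaks for two songs.
references = {
    "song_a": [(0, 10), (2, 25), (3, 40), (5, 12), (7, 33), (8, 18), (10, 27), (12, 45)],
    "song_b": [(0, 11), (1, 30), (4, 22), (6, 41), (7, 15), (9, 36), (11, 20), (13, 29)],
}
index = build_index(references)

# Excerpt of song_a starting at frame 3, with one spurious peak (3, 50) added as "noise".
query = [(0, 40), (2, 12), (3, 50), (4, 33), (5, 18), (7, 27)]
print(identify(index, query))
```

Because each hash is looked up directly in the index, query cost depends mainly on the number of query hashes and the length of the matching buckets rather than on the total number of reference songs, which is what makes this family of methods attractive for large databases.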