BUT2012
             Brno University of Technology
               Faculty of Information Technology
                       Speech@FIT


Igor Szöke, Michal Fapšo, Karel Veselý




MediaEval 2012 workshop – SWS task, October 4.-5. 2012, Pisa
Outline
 
     Systems overview & Underlying technologies
 
     PhnRec, R-AKWS, AKWS – primary system
 
     DTW
 
     (GMM/HMM) – not submitted
 
     Calibration
 
     Results and discussion




MediaEval SWS 2012            BUT2012             2
workshop - 4.-5.10. Pisa
System overview
  
           Our internal task was
               −       to build a simple, minimalistic, language-dependent
                       Query-by-Example (QbE) system.



  
           Ingredients
               −       Development data, Neural net classifier,
                       Phoneme recognizer, Acoustic keyword
                       spotting, DTW, Calibration

System overview




     Sentence mean normalization

     Neural network based features
               −       bottle-necks
               −       three-state phone posteriors

     Query detector
               −       AKWS
               −       DTW
               −       (GMM/HMM) – not submitted to the evals

     Features used by each query detector:

                           Bottle-Neck   Posteriors
               AKWS             -            X
               DTW              X            X
               (GMM/HMM)        X            -
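A minimal sketch of sentence mean normalization, assuming it means subtracting the per-dimension mean computed over one utterance. The slides do not say which feature stream is normalized or whether variance is normalized as well, so the function name `sentence_mean_normalize` and the toy data are illustrative only.

```python
from typing import List


def sentence_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-dimension mean computed over one sentence (utterance).

    Assumption: mean-only normalization; the slides do not mention variance.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


# Toy utterance: 3 frames, 2 feature dimensions (made-up values).
utterance = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = sentence_mean_normalize(utterance)
```

After normalization every feature dimension has zero mean within the utterance, which removes a constant per-recording offset before the features reach the neural network.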
Underlying technologies
 
         Universal context, bottle-neck neural network based classifier
 
         devC state re-alignment, Reduced phone set (50 phonemes)
 
         Trained by Tnet – our tool, publicly available




PhnRec, R-AKWS, AKWS
 
         Phoneme recognizer - free phone loop, devC 66.02% PAC




 
         R-AKWS - Queries extracted from phone alignment
 
         AKWS - Queries extracted from phone recognizer
devQ - devC
             MTWV    MTWVcalib   UBTWV
R-AKWS       0.739     0.786     0.859
AKWS         0.452     0.493     0.600

devQ - evalC
             MTWV    MTWVcalib   UBTWV
R-AKWS       0.653     0.703     0.789
AKWS         0.377     0.429     0.552
DTW
    
        Used as a baseline. Bottle-necks work better than posteriors.




devQ - devC
             MTWV    MTWVcalib   UBTWV
R-AKWS       0.739     0.786     0.859
AKWS         0.452     0.493     0.600
DTW          0.400     0.468     0.552

evalQ - evalC
             MTWV    MTWVcalib   UBTWV
R-AKWS         -         -         -
AKWS         0.470     0.530     0.672
DTW          0.426     0.488     0.599
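For readers unfamiliar with DTW-based query-by-example, the idea can be sketched as subsequence DTW: the query may start and end anywhere in the utterance, and the lowest accumulated distance gives the detection score. This is a generic sketch, not the BUT implementation. The Euclidean local distance, the three allowed steps, and normalization by query length are all assumptions; the slides give none of these details.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Local frame distance (an assumption; the slides do not name one)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subsequence_dtw(
    query: List[Sequence[float]], utterance: List[Sequence[float]]
) -> Tuple[float, int]:
    """Return (accumulated cost / query length, end frame index in utterance)."""
    if not query or not utterance:
        raise ValueError("query and utterance must be non-empty")
    n, m = len(query), len(utterance)
    # Row 0: the query may start at any utterance frame at no extra cost.
    prev = [euclidean(query[0], utterance[j]) for j in range(m)]
    for i in range(1, n):
        cur = [math.inf] * m
        for j in range(m):
            best = prev[j]  # query advances, utterance frame repeats
            if j > 0:
                # diagonal step, or utterance advances while query frame repeats
                best = min(best, prev[j - 1], cur[j - 1])
            cur[j] = euclidean(query[i], utterance[j]) + best
        prev = cur
    # The query may also end at any utterance frame.
    end = min(range(m), key=lambda j: prev[j])
    return prev[end] / n, end


# Toy 1-dimensional features (made-up values): the query matches frames 1..2.
query = [[0.0], [1.0]]
utterance = [[5.0], [0.0], [1.0], [5.0]]
cost, end_frame = subsequence_dtw(query, utterance)
```

Lower cost means a better match. A real detector would also backtrack to recover the start frame and would typically normalize by path length rather than query length.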

GMM/HMM
   
            Inspired by AKWS, not submitted due to bad results.




                                     MTWV    MTWVcalib    UBTWV
                           R-AKWS    0.739        0.786   0.859
                           AKWS      0.452        0.493   0.600
                           DTW       0.400        0.468   0.552
                           GMM/HMM   0.011          -     0.336
Calibration
   
            TWV is pooled over all terms; UBTWV is the non-pooled TWV (each term
            uses its own best threshold).
   
            Score calibration: a linear combination of 12 parameters (6 features,
            each in linear and quadratic form), trained on the UBTWV thresholds.
                −          Query length (w/o outer sil), Length of inner sil,
                −          Score average global, Score average by phonemes
                −          Phonemes count, Detections count
   
            After the evals, we found that Detections count and Length of inner
            sil work best for AKWS.
                            Parameter                       Training error (AKWS)   Training error (DTW)
                            Detections count                0.1272                  0.002115
                            Length of inner sil             0.1577                  0.002687
                            Query length (w/o outer sil)    0.1626                  0.002773
                            Score average global            0.1635                  0.002530
                            Phonemes count                  0.1656                  0.002779
                            Score average by phonemes       0.1660                  0.002746
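The 12-parameter form described above can be sketched as follows: each of the 6 features contributes a linear and a quadratic term, and the weighted sum predicts a per-query threshold that is subtracted from the raw detection score. This is a sketch under assumptions. The absence of a bias term is inferred from the count of 12 parameters, subtracting the predicted threshold is one plausible way to apply it, and the feature keys and weight values are hypothetical, not the trained values.

```python
from typing import Dict, List, Sequence

# Hypothetical keys for the 6 calibration features listed on the slide.
FEATURES = [
    "query_length",           # query length without outer silence
    "inner_sil_length",       # length of inner silence
    "score_avg_global",
    "score_avg_by_phonemes",
    "phonemes_count",
    "detections_count",
]


def expand(features: Dict[str, float]) -> List[float]:
    """Map 6 features to 12 terms: each feature followed by its square."""
    terms: List[float] = []
    for name in FEATURES:
        x = features[name]
        terms.extend([x, x * x])
    return terms


def predict_threshold(features: Dict[str, float], weights: Sequence[float]) -> float:
    """Linear combination of the 12 terms (no bias term, by assumption)."""
    terms = expand(features)
    if len(weights) != len(terms):
        raise ValueError("expected one weight per term (12)")
    return sum(w * t for w, t in zip(weights, terms))


def calibrate(score: float, features: Dict[str, float], weights: Sequence[float]) -> float:
    """Shift a raw score by the predicted per-query threshold."""
    return score - predict_threshold(features, weights)


# Made-up weights that use only Detections count (terms 10 and 11).
weights = [0.0] * 12
weights[10] = 0.5   # linear term
weights[11] = 0.25  # quadratic term

example = {
    "query_length": 0.8,
    "inner_sil_length": 0.0,
    "score_avg_global": 1.5,
    "score_avg_by_phonemes": 1.2,
    "phonemes_count": 6.0,
    "detections_count": 2.0,
}
calibrated = calibrate(3.0, example, weights)
```

After this shift a single global threshold can be applied to all queries, which is what a pooled metric such as TWV rewards.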
Calibration AKWS




Conclusion
                          devQ-devC                      evalQ-evalC
                  ATWV      MTWV     UBTWV       ATWV      MTWV     UBTWV
     AKWS         0.488     0.502    0.600       0.522     0.553    0.672
      (submitted) (0.488)   (0.452)              (0.492)   (0.530)
     DTW          0.443     0.468    0.552       0.448     0.488    0.599


• AKWS results use the new calibration (submitted values in brackets)
• Good and consistent data, enough to train a good PhnRec
• GMM/HMM does not perform well in the in-language condition
  with 1 example per query (it was our best system last year)
• The number of detections is an important calibration feature
  (due to TWV)
• Future work: detections calibration, system fusion
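The gap between MTWV and UBTWV in the table comes from pooling: MTWV uses one threshold for all terms, while UBTWV lets each term use its own best threshold. A minimal sketch, assuming the usual form TWV = 1 - (P_miss + beta * P_FA) averaged over terms. The value of `beta`, the trial counts, and the detections below are made up, since the slides do not give the parameters of the adapted TWV.

```python
import math
from typing import Dict, List, Tuple

Detection = Tuple[float, bool]              # (score, is_correct)
Term = Tuple[List[Detection], int, int]     # (detections, n_true, n_nontarget)


def term_twv(term: Term, threshold: float, beta: float) -> float:
    """TWV of one term at one threshold."""
    dets, n_true, n_nontarget = term
    hits = sum(1 for s, ok in dets if s >= threshold and ok)
    false_alarms = sum(1 for s, ok in dets if s >= threshold and not ok)
    p_miss = 1.0 - hits / n_true
    p_fa = false_alarms / n_nontarget
    return 1.0 - (p_miss + beta * p_fa)


def candidate_thresholds(scores: List[float]) -> List[float]:
    """Every distinct score, plus +inf (accept nothing)."""
    return sorted(set(scores)) + [math.inf]


def mtwv(terms: Dict[str, Term], beta: float) -> float:
    """Pooled: the best single threshold shared by all terms."""
    all_scores = [s for dets, _, _ in terms.values() for s, _ in dets]
    return max(
        sum(term_twv(t, th, beta) for t in terms.values()) / len(terms)
        for th in candidate_thresholds(all_scores)
    )


def ubtwv(terms: Dict[str, Term], beta: float) -> float:
    """Non-pooled upper bound: each term uses its own best threshold."""
    return sum(
        max(term_twv(t, th, beta) for th in candidate_thresholds([s for s, _ in t[0]]))
        for t in terms.values()
    ) / len(terms)


# Two toy terms whose scores live in different ranges (made-up values).
terms = {
    "term_a": ([(0.9, True), (0.8, False)], 1, 8),
    "term_b": ([(0.5, True), (0.4, False)], 1, 8),
}
beta = 8.0  # toy value chosen for easy arithmetic
pooled = mtwv(terms, beta)
upper_bound = ubtwv(terms, beta)
```

In this toy case no single threshold separates hits from false alarms for both terms, so the pooled value stays below the upper bound. Calibration aims to close that gap by shifting each query's scores onto a common scale.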
Like / Dislike / Next evals?
 
     Like:
               −       Adapted TWV, real KWS scoring
               −       Phone alignment provided
               −       Good data, great work of organizers
 
     "Dislike":
               −       No test data alignment
               −       No speaker information
 
     Next evals:
               −       More examples per query?
               −       Provide query and the query sentence (adaptation issue)?
               −       Non-pooled scoring metric?
               −       We would like to share our features – more at the
                       poster session
Thank you for your attention.




 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 

BUT2012 APPROACHES FOR SPOKEN WEB SEARCH - MEDIAEVAL 2012

  • 1. BUT2012. Brno University of Technology, Faculty of Information Technology, Speech@FIT. Igor Szöke, Michal Fapšo, Karel Veselý. MediaEval 2012 workshop, SWS task, October 4-5, 2012, Pisa
  • 2. Outline
      - Systems overview & Underlying technologies
      - PhnRec, R-AKWS, AKWS (primary system)
      - DTW
      - (GMM/HMM), not submitted
      - Calibration
      - Results and discussion
      MediaEval SWS 2012 BUT2012 2 workshop - 4.-5.10. Pisa
  • 3. System overview
      - Our internal task was to build a simple, minimalistic, language-dependent Query-by-Example (QbE) system.
      - Ingredients: development data, neural net classifier, phoneme recognizer, acoustic keyword spotting, DTW, calibration
  • 4. System overview
      - Sentence mean normalization
      - Neural network based features
          - bottle-necks
          - three-state phone posteriors
      - Query detector
          - AKWS
          - DTW
          - (GMM/HMM), not submitted to the evals

      Features per detector:
                      Bottle-Neck   Posteriors
        AKWS               -            X
        DTW                X            X
        (GMM/HMM)          X            -
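
A minimal sketch of the sentence mean normalization step listed on slide 4, assuming it means subtracting the per-utterance mean of each feature dimension. The slides do not say at which stage it is applied, and the function name and toy array are illustrative only.

```python
import numpy as np


def sentence_mean_normalize(features: np.ndarray) -> np.ndarray:
    """Subtract the per-utterance mean from every feature dimension.

    features has shape (num_frames, feature_dim) and holds one utterance.
    """
    return features - features.mean(axis=0, keepdims=True)


# Toy utterance: 3 frames, 2 feature dimensions (made-up values).
utt = np.array([[1.0, 10.0], [3.0, 14.0], [5.0, 12.0]])
normalized = sentence_mean_normalize(utt)
```

After normalization each feature dimension has zero mean over the utterance, which removes constant per-recording offsets before the features reach the classifier.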
  • 5. Underlying technologies
      - Universal context, bottle-neck neural network based classifier
      - devC state re-alignment, reduced phone set (50 phonemes)
      - Trained by Tnet, our publicly available tool
  • 6. PhnRec, R-AKWS, AKWS
      - Phoneme recognizer: free phone loop, 66.02% PAC on devC
      - R-AKWS: queries extracted from phone alignment
      - AKWS: queries extracted from phone recognizer

      devQ - devC
                    MTWV    MTWVcalib   UBTWV
        R-AKWS      0.739   0.786       0.859
        AKWS        0.452   0.493       0.600

      devQ - evalC
                    MTWV    MTWVcalib   UBTWV
        R-AKWS      0.653   0.703       0.789
        AKWS        0.377   0.429       0.552
  • 7. DTW
      - Used as a baseline. Bottle-necks are better than posteriors.

      devQ - devC
                    MTWV    MTWVcalib   UBTWV
        R-AKWS      0.739   0.786       0.859
        AKWS        0.452   0.493       0.600
        DTW         0.400   0.468       0.552

      evalQ - evalC
                    MTWV    MTWVcalib   UBTWV
        R-AKWS      -       -           -
        AKWS        0.470   0.530       0.672
        DTW         0.426   0.488       0.599
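
The slides do not describe the DTW variant, distance measure, or path constraints used for the baseline. As a rough illustration of how a DTW query detector can work on frame-level features (bottle-necks or posteriors), here is a sketch of subsequence DTW with a cosine distance. Both choices, and all names, are assumptions made for this example.

```python
import numpy as np


def cosine_distance_matrix(query: np.ndarray, utterance: np.ndarray) -> np.ndarray:
    """Pairwise cosine distance between query frames and utterance frames."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    u = utterance / np.linalg.norm(utterance, axis=1, keepdims=True)
    return 1.0 - q @ u.T


def subsequence_dtw(query: np.ndarray, utterance: np.ndarray):
    """Find the best match of the whole query anywhere in the utterance.

    Returns (cost normalized by query length, index of the last matched frame).
    """
    dist = cosine_distance_matrix(query, utterance)
    n, m = dist.shape
    acc = np.full((n, m), np.inf)
    acc[0, :] = dist[0, :]  # the match may start at any utterance frame
    for i in range(1, n):
        acc[i, 0] = acc[i - 1, 0] + dist[i, 0]
        for j in range(1, m):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = dist[i, j] + best_prev
    end = int(np.argmin(acc[-1]))  # the match may end at any utterance frame
    return acc[-1, end] / n, end


# Synthetic example: the query is an exact copy of frames 3..5 of the utterance.
rng = np.random.default_rng(0)
utterance = rng.normal(size=(20, 8))
query = utterance[3:6].copy()
cost, end_frame = subsequence_dtw(query, utterance)
```

In a detector, the normalized cost would be turned into a detection score and compared against a threshold. The threshold choice is what the calibration slides address.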
  • 8. GMM/HMM
      - Inspired by AKWS, not submitted due to bad results.

                    MTWV    MTWVcalib   UBTWV
        R-AKWS      0.739   0.786       0.859
        AKWS        0.452   0.493       0.600
        DTW         0.400   0.468       0.552
        GMM/HMM     0.011   -           0.336
  • 9. Calibration
      - TWV: pooled. UBTWV: non-pooled TWV (each term has its own best threshold).
      - Calibration of scores: linear combination of 12 parameters (6 features, each in linear and quadratic form), trained on UBTWV thresholds. Features:
          - Query length (w/o outer sil), Length of inner sil
          - Score average global, Score average by phonemes
          - Phonemes count, Detections count
      - After the evals, we found that Detections count and Length of inner sil work best for AKWS.

        Parameter                       Training error AKWS   Training error DTW
        Detections count                0.1272                0.002115
        Length of inner sil             0.1577                0.002687
        Query length (w/o outer sil)    0.1626                0.002773
        Score average global            0.1635                0.002530
        Phonemes count                  0.1656                0.002779
        Score average by phonemes       0.1660                0.002746
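
To make the pooled versus non-pooled distinction concrete, the sketch below computes a term-weighted value of the usual form, 1 - mean over terms of (P_miss + beta * P_FA), once with a single global threshold (MTWV-style) and once with the best threshold per term (UBTWV-style). The values of `beta` and `n_nontarget` and the toy detections are assumptions for illustration. The slides do not give the parameters of the adapted TWV used in the evaluation.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    score: float
    is_hit: bool


def term_value(dets, n_true, n_nontarget, beta, thr):
    """1 - (P_miss + beta * P_FA) for one term at threshold thr."""
    hits = sum(1 for d in dets if d.score >= thr and d.is_hit)
    false_alarms = sum(1 for d in dets if d.score >= thr and not d.is_hit)
    p_miss = 1.0 - hits / n_true
    p_fa = false_alarms / n_nontarget
    return 1.0 - (p_miss + beta * p_fa)


def candidate_thresholds(dets):
    # Every observed score, plus +inf, which rejects all detections.
    return sorted({d.score for d in dets}) + [float("inf")]


def mtwv(dets_by_term, n_true, n_nontarget, beta):
    """Pooled: one global threshold shared by all terms."""
    all_dets = [d for dets in dets_by_term.values() for d in dets]
    return max(
        sum(term_value(dets, n_true[t], n_nontarget, beta, thr)
            for t, dets in dets_by_term.items()) / len(dets_by_term)
        for thr in candidate_thresholds(all_dets)
    )


def ubtwv(dets_by_term, n_true, n_nontarget, beta):
    """Non-pooled: each term is scored at its own best threshold."""
    return sum(
        max(term_value(dets, n_true[t], n_nontarget, beta, thr)
            for thr in candidate_thresholds(dets))
        for t, dets in dets_by_term.items()
    ) / len(dets_by_term)


# Toy data: the two terms produce scores on different scales.
dets_by_term = {
    "a": [Detection(0.9, True), Detection(0.5, False)],
    "b": [Detection(0.4, True), Detection(0.1, False)],
}
n_true = {"a": 1, "b": 1}
pooled = mtwv(dets_by_term, n_true, n_nontarget=100, beta=10.0)
non_pooled = ubtwv(dets_by_term, n_true, n_nontarget=100, beta=10.0)
```

In this toy case no single threshold accepts the hit for term "b" without also accepting the false alarm for term "a", so the pooled value stays below the non-pooled one. Score calibration aims to close that gap by bringing per-term scores onto a common scale.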
  • 10. Calibration AKWS
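
The calibration on slide 9 is described only as a linear combination of 12 parameters, built from 6 features in linear and quadratic form and trained on UBTWV thresholds. One possible reading is sketched below: fit the 12 weights by least squares so that they predict each query's best threshold, then subtract the prediction from the raw scores. The fitting method, the absence of a bias term, the subtraction step, and the synthetic data are all assumptions of this sketch, not details taken from the slides.

```python
import numpy as np

# Feature names follow slide 9; the identifiers themselves are illustrative.
FEATURES = [
    "query_length_wo_outer_sil",
    "inner_sil_length",
    "score_avg_global",
    "score_avg_by_phonemes",
    "phonemes_count",
    "detections_count",
]


def expand(features) -> np.ndarray:
    """Stack linear and quadratic forms: 6 features -> 12 columns."""
    features = np.asarray(features, dtype=float)
    return np.hstack([features, features ** 2])


def fit_calibration(query_features, best_thresholds) -> np.ndarray:
    """Least-squares fit of 12 weights to the per-query best thresholds."""
    weights, *_ = np.linalg.lstsq(expand(query_features), best_thresholds, rcond=None)
    return weights


def calibrate(scores, query_features, weights) -> np.ndarray:
    """Shift raw scores by the predicted per-query threshold."""
    return np.asarray(scores, dtype=float) - expand(query_features) @ weights


# Synthetic data only: 40 queries whose best thresholds follow the model exactly.
rng = np.random.default_rng(0)
query_features = rng.normal(size=(40, len(FEATURES)))
true_weights = rng.normal(size=2 * len(FEATURES))
best_thresholds = expand(query_features) @ true_weights

weights = fit_calibration(query_features, best_thresholds)
shifted = calibrate(best_thresholds, query_features, weights)
```

After such a shift, a detection scoring exactly at its query's predicted threshold maps to zero, so one global threshold can serve all queries.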
  • 11. Conclusion

                    devQ-devC                  evalQ-evalC
                    ATWV    MTWV    UBTWV      ATWV    MTWV    UBTWV
        AKWS        0.488   0.502   0.600      0.522   0.553   0.672
        DTW         0.443   0.468   0.552      0.448   0.488   0.599

        AKWS as submitted (bracketed values, in slide order): (0.488) (0.452) (0.492) (0.530)

      - AKWS is shown with the new calibration (submitted values in brackets)
      - Good and consistent data, enough to train a good PhnRec
      - GMM/HMM does not perform well in the in-language condition with 1 example per query (it was our best system last year)
      - Number of detections is an important calibration feature (due to TWV)
      - Future work: detections calibration, system fusion
  • 12. Like / Dislike / Next evals?
      - Like:
          - Adapted TWV, real KWS scoring
          - Phone alignment provided
          - Good data, great work by the organizers
      - "Dislike":
          - No test data alignment
          - No speaker information
      - Next evals:
          - More examples per query?
          - Provide the query and the query sentence (adaptation issue)?
          - Non-pooled scoring metric?
          - We would like to share our features; more at the poster session
  • 13. Thank you for your attention.