Technicolor / INRIA / Imperial College
London at the MediaEval 2012 Violent
       Scene Detection Task

         PENET Cédric – Technicolor, INRIA
        DEMARTY Claire-Hélène – Technicolor
   SOLEYMANI Mohammad – Imperial College London
          GRAVIER Guillaume – CNRS, IRISA
               GROS Patrick – INRIA
         MediaEval 2012 Pisa Workshop
              October 4th, 2012
Outline
   Introduction
   Systems description
   Results and conclusion




2      10/7/2012
Introduction
Joint effort between Technicolor / INRIA / Imperial College London
   5 runs → 5 different systems
       Re-use of last year’s systems with few differences
           Bayesian networks structure learning (Technicolor/INRIA)
           Naive Bayesian classifier (ICL)
       Two new systems from Technicolor/INRIA
           Exploiting similarity
           Bag-of-Audio words
       Fusion of three systems (Technicolor/INRIA – ICL)




Run 1: Exploiting Similarity
   Idea: can we get the same results as last year using only similarity
    measures?
   Video features for each frame
       Motion activity
       Three color harmonisation features: harmonisation template, angle and
        energy
   Decision: KNN using only closest neighbour
       10 movies used to populate the KNN
       Test frames labelled according to closest neighbour
       1 frame of a shot labelled violent → shot is labelled violent
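
The decision rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the squared Euclidean distance, and the toy 2-D features are assumptions (the slides do not state which distance was used).

```python
def nearest_label(frame, train_frames, train_labels):
    """Return the label of the training frame closest to `frame`.
    Squared Euclidean distance is an assumption of this sketch."""
    best = min(
        range(len(train_frames)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(frame, train_frames[i])),
    )
    return train_labels[best]


def label_shots(test_frames, shot_ids, train_frames, train_labels):
    """A shot is violent as soon as one of its frames is labelled violent."""
    shot_is_violent = {}
    for frame, shot in zip(test_frames, shot_ids):
        violent = nearest_label(frame, train_frames, train_labels)
        shot_is_violent[shot] = shot_is_violent.get(shot, False) or violent
    return shot_is_violent


# Toy data (hypothetical): three labelled training frames, four test frames in two shots.
train_frames = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
train_labels = [False, False, True]
test_frames = [[0.1, 0.2], [4.8, 5.1], [0.9, 1.2], [0.2, 0.1]]
shot_ids = [0, 0, 1, 1]

shots = label_shots(test_frames, shot_ids, train_frames, train_labels)
print(shots)
```

Because a single violent frame flips the whole shot, this rule favours recall at the shot level.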




Run 2: Bag-of-Audio words
   Audio features extraction
       Extraction of MFCC audio features (with Δ and ΔΔ) - 20 ms windows, 10 ms overlap
       Extraction of silence segments with SPro
       Extraction of coherent audio segments (André-Obrecht, 1988)


   K-Means on non-silent audio segments for vocabulary (of size 128)
       Each audio segment replaced by closest centroid


   Construction of TF-IDF histograms
       Each shot is a document


   Classification using SVM
       χ² and histogram intersection kernels
       Applied weight on SVM parameter
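
The pipeline above (quantization against a K-Means vocabulary, TF-IDF histograms per shot, then histogram kernels) can be sketched in plain Python. This is an illustration under assumptions: the vocabulary is a hypothetical 3-word toy instead of 128 words, the centroids are given rather than learned, the TF-IDF weighting is the textbook tf × log(N/df) form, and the χ² kernel is the common exponential form. The slides do not specify these variants, and the SVM training step is left out.

```python
import math
from collections import Counter


def quantize(segment, centroids):
    """Index of the centroid closest to an audio segment descriptor."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(segment, centroids[k])),
    )


def tfidf_histograms(shots, vocab_size):
    """shots: one list of audio-word indices per shot (each shot is a document)."""
    n_docs = len(shots)
    doc_freq = Counter(w for words in shots for w in set(words))
    hists = []
    for words in shots:
        counts = Counter(words)
        hists.append([
            (counts[w] / len(words)) * math.log(n_docs / doc_freq[w]) if counts[w] else 0.0
            for w in range(vocab_size)
        ])
    return hists


def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel (one common definition; assumed here)."""
    d = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * d)


def intersection_kernel(x, y):
    """Histogram intersection kernel."""
    return sum(min(a, b) for a, b in zip(x, y))


# Toy data (hypothetical): 1-D descriptors and a 3-word vocabulary.
centroids = [[0.0], [1.0], [2.0]]
shot_segments = [[[0.1], [0.9], [1.1]], [[1.9], [2.2]]]
words = [[quantize(seg, centroids) for seg in shot] for shot in shot_segments]
hists = tfidf_histograms(words, vocab_size=len(centroids))
print(words, hists)
```

The two kernel functions return the pairwise similarities that a kernel SVM would consume as a precomputed Gram matrix.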


Run 3: Bayesian Networks structure learning
   Re-use of Technicolor’s system from last year with additional features
       Audio features: energy, asymmetry, centroid, ZCR, flatness and roll-off at 90%
       Video features: shot length, flashes, blood, activity, color coherence, average
        luminance, fire and color harmonisation features
       Features are averaged over a video shot
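
Averaging features over a shot can be sketched as below. The function name and the toy values are hypothetical, and the sketch assumes every frame (or audio window) already carries the identifier of the shot it belongs to.

```python
def average_over_shots(frame_features, shot_ids):
    """Return {shot_id: mean feature vector} from per-frame feature vectors."""
    sums, counts = {}, {}
    for feats, shot in zip(frame_features, shot_ids):
        if shot not in sums:
            sums[shot] = [0.0] * len(feats)
            counts[shot] = 0
        sums[shot] = [s + f for s, f in zip(sums[shot], feats)]
        counts[shot] += 1
    return {shot: [s / counts[shot] for s in total] for shot, total in sums.items()}


# Toy data (hypothetical): two features per frame, three frames in two shots.
frame_features = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0]]
shot_ids = [0, 0, 1]
shot_features = average_over_shots(frame_features, shot_ids)
print(shot_features)
```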


   Graphical model for modeling conditional probability distributions along
    with contextual features and temporal smoothing
       Naive Bayesian network (NB)
       Graph structure learning
           Forest augmented naive Bayesian network (FAN)
           K2
   Late modalities fusion using simple rule

   [Figure: Bayesian network example. Source: https://controls.engin.umich.edu/wiki/index.php/Bayesian_network_theory]




Run 4: Naïve Bayesian classifier
Audio modality
   Classical low-level features extracted from non-silent segments
     RMS energy, pitch, MFCC, ZCR, spectral flux, spectral roll-off
   Averaged over shots
Video modality
   Shot duration, luminance, average activity, motion component
   Averaged over shots
Text features
   Simple features such as the number of spoken words and the average valence and arousal
    per shot (from the Dictionary of Affect in Language)
   The results were bad and we decided not to include them in the final submission



A Naïve Bayesian classifier on each modality
   Modality fusion using a weighted sum of posterior probabilities.
       0.95 × audio score + 0.05 × visual score
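
The weighted-sum fusion can be written in a few lines. The function name and the example posteriors are hypothetical; only the 0.95 / 0.05 weights come from the slide.

```python
def fuse_weighted(audio_posterior, visual_posterior, audio_weight=0.95):
    """Weighted sum of the per-modality posteriors P(violent | shot)."""
    return audio_weight * audio_posterior + (1.0 - audio_weight) * visual_posterior


# Hypothetical shot: the audio classifier is confident, the visual one is not.
score = fuse_weighted(audio_posterior=0.8, visual_posterior=0.2)
print(round(score, 2))
```

With these weights the fused score follows the audio classifier almost entirely; the visual posterior can shift it by at most 0.05.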



Run 5: Systems fusion


    Simple fusion of three systems
        Run 2: Bag-of-Audio words
        Run 3: Bayesian networks structure learning
        Run 4: Naive Bayesian classifier


    Fusion by multiplication of probabilities
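
Fusion by multiplication can be sketched as follows, assuming each system outputs a per-shot probability of violence. The function name and the three example values are hypothetical.

```python
import math


def fuse_product(probabilities):
    """Multiply the per-system probabilities for one shot."""
    return math.prod(probabilities)


# Hypothetical outputs of Run 2, Run 3 and Run 4 for the same shot.
fused = fuse_product([0.9, 0.8, 0.5])
print(round(fused, 2))
```

A product behaves like a soft AND: one low probability pulls the fused score down even when the other systems are confident.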




Results
     Run  Technique    MAP@100 (%)  AP-1 (%)  AP-2 (%)  AP-3 (%)  STD (%)  MediaEval Cost
     1    Similarity   13.89         0.00     12.91     28.77     14.41    2.29
     2    BoAW         40.54        10.85     52.98     57.77     25.82    2.50
     3    BN-SL        61.82        60.56     53.15     71.76      9.37    3.57
     4    NBN          46.27        40.03     22.97     75.82     26.97    3.64
     5    Fusion       57.47        64.52     37.21     70.69     17.82    4.60
    Average Precision (AP) for Dead Poets Society (AP-1), Fight Club (AP-2) and Independence Day (AP-3)
    STD: Standard deviation over the three test movies
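
As a worked check, the MAP@100 and STD columns can be recomputed from the three per-movie APs. The sketch assumes MAP is the plain mean of the three APs and STD is the sample standard deviation (n − 1 denominator); this is inferred from the numbers, not stated on the slides. The recomputed values agree with the table to within rounding of the published APs.

```python
from statistics import mean, stdev

# Per-movie AP (%) copied from the results table: AP-1, AP-2, AP-3.
ap = {
    "Similarity": [0.00, 12.91, 28.77],
    "BoAW": [10.85, 52.98, 57.77],
    "BN-SL": [60.56, 53.15, 71.76],
    "NBN": [40.03, 22.97, 75.82],
    "Fusion": [64.52, 37.21, 70.69],
}

for name, values in ap.items():
    # stdev() uses the n - 1 denominator (sample standard deviation).
    print(f"{name:<10} MAP = {mean(values):6.2f}   STD = {stdev(values):6.2f}")
```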


    High variation between movies
    Best results on Independence Day (similar to Armageddon)
    Needs more movies to compute MAP


Conclusion & perspectives
    Similarity search
        MAP is bad, but MediaEval Cost is one of the best (6th out of 35)
        Adding features and merging decisions from different KNNs might improve the
         results


    Fusion
        4th best run overall (out of 35)
        Results not as good as expected
        Improves precision at the cost of recall (false alarms reduced by a factor of
         two)
        Test smarter fusion techniques


    Bayesian Networks – Structure Learning
        3rd best run overall (out of 35)
        Very low standard deviation over three movies
        Bayesian networks for intermediate concepts

    Bag-of-Audio words
        MAP is not bad (11th out of 35)
        False alarms and missed detections are pretty low too
        Simple tests proved effective; more investigation needed


    Naive Bayesian classifier
        Simple classifier with audio features can achieve moderately good results
         (10th out of 35)
        Text features don’t work
        Use a classifier that can learn temporal dynamics




Thanks for your attention!




15   10/7/2012

Contenu connexe

Tendances

Text-Independent Speaker Verification
Text-Independent Speaker VerificationText-Independent Speaker Verification
Text-Independent Speaker VerificationCody Ray
 
Learning spatiotemporal features with 3 d convolutional networks
Learning spatiotemporal features with 3 d convolutional networksLearning spatiotemporal features with 3 d convolutional networks
Learning spatiotemporal features with 3 d convolutional networksSungminYou
 
Speaker identification using mel frequency
Speaker identification using mel frequency Speaker identification using mel frequency
Speaker identification using mel frequency Phan Duy
 
Review on cs231 part-2
Review on cs231 part-2Review on cs231 part-2
Review on cs231 part-2Jeong Choi
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approachijsrd.com
 
Teaching Computers to Listen to Music
Teaching Computers to Listen to MusicTeaching Computers to Listen to Music
Teaching Computers to Listen to MusicEric Battenberg
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Gaurav Mittal
 
Convolutional neural networks deepa
Convolutional neural networks deepaConvolutional neural networks deepa
Convolutional neural networks deepadeepa4466
 
Kernel analysis of deep networks
Kernel analysis of deep networksKernel analysis of deep networks
Kernel analysis of deep networksBehrang Mehrparvar
 
Automatic Speaker Recognition system using MFCC and VQ approach
Automatic Speaker Recognition system using MFCC and VQ approachAutomatic Speaker Recognition system using MFCC and VQ approach
Automatic Speaker Recognition system using MFCC and VQ approachAbdullah al Mamun
 
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016Keunwoo Choi
 
Text independent speaker recognition system
Text independent speaker recognition systemText independent speaker recognition system
Text independent speaker recognition systemDeepesh Lekhak
 
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...gt_ebuddy
 
A Discrete-Time Polynomial Model of Single Channel Long-Haul Fiber-Optic Comm...
A Discrete-Time Polynomial Model of Single Channel Long-Haul Fiber-Optic Comm...A Discrete-Time Polynomial Model of Single Channel Long-Haul Fiber-Optic Comm...
A Discrete-Time Polynomial Model of Single Channel Long-Haul Fiber-Optic Comm...West Virginia University
 
Text-Independent Speaker Verification Report
Text-Independent Speaker Verification ReportText-Independent Speaker Verification Report
Text-Independent Speaker Verification ReportCody Ray
 
Introducing Deep Learning - Mélanie Ducoffe (UNS-CNRS-I3S)
Introducing Deep Learning - Mélanie Ducoffe (UNS-CNRS-I3S)Introducing Deep Learning - Mélanie Ducoffe (UNS-CNRS-I3S)
Introducing Deep Learning - Mélanie Ducoffe (UNS-CNRS-I3S)TelecomValley
 
Scene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani WithanawasamScene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani WithanawasamWithTheBest
 
Environmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC techniqueEnvironmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC techniquePankaj Kumar
 

Tendances (20)

SPEAKER VERIFICATION
SPEAKER VERIFICATIONSPEAKER VERIFICATION
SPEAKER VERIFICATION
 
Text-Independent Speaker Verification
Text-Independent Speaker VerificationText-Independent Speaker Verification
Text-Independent Speaker Verification
 
Learning spatiotemporal features with 3 d convolutional networks
Learning spatiotemporal features with 3 d convolutional networksLearning spatiotemporal features with 3 d convolutional networks
Learning spatiotemporal features with 3 d convolutional networks
 
Speaker identification using mel frequency
Speaker identification using mel frequency Speaker identification using mel frequency
Speaker identification using mel frequency
 
Review on cs231 part-2
Review on cs231 part-2Review on cs231 part-2
Review on cs231 part-2
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approach
 
Teaching Computers to Listen to Music
Teaching Computers to Listen to MusicTeaching Computers to Listen to Music
Teaching Computers to Listen to Music
 
Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)Convolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN)
 
Convolutional neural networks deepa
Convolutional neural networks deepaConvolutional neural networks deepa
Convolutional neural networks deepa
 
Kernel analysis of deep networks
Kernel analysis of deep networksKernel analysis of deep networks
Kernel analysis of deep networks
 
Automatic Speaker Recognition system using MFCC and VQ approach
Automatic Speaker Recognition system using MFCC and VQ approachAutomatic Speaker Recognition system using MFCC and VQ approach
Automatic Speaker Recognition system using MFCC and VQ approach
 
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
Automatic Tagging using Deep Convolutional Neural Networks - ISMIR 2016
 
Text independent speaker recognition system
Text independent speaker recognition systemText independent speaker recognition system
Text independent speaker recognition system
 
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
 
A Discrete-Time Polynomial Model of Single Channel Long-Haul Fiber-Optic Comm...
A Discrete-Time Polynomial Model of Single Channel Long-Haul Fiber-Optic Comm...A Discrete-Time Polynomial Model of Single Channel Long-Haul Fiber-Optic Comm...
A Discrete-Time Polynomial Model of Single Channel Long-Haul Fiber-Optic Comm...
 
Text-Independent Speaker Verification Report
Text-Independent Speaker Verification ReportText-Independent Speaker Verification Report
Text-Independent Speaker Verification Report
 
Introducing Deep Learning - Mélanie Ducoffe (UNS-CNRS-I3S)
Introducing Deep Learning - Mélanie Ducoffe (UNS-CNRS-I3S)Introducing Deep Learning - Mélanie Ducoffe (UNS-CNRS-I3S)
Introducing Deep Learning - Mélanie Ducoffe (UNS-CNRS-I3S)
 
wireless trends
wireless trendswireless trends
wireless trends
 
Scene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani WithanawasamScene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani Withanawasam
 
Environmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC techniqueEnvironmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC technique
 

En vedette

Tom weldin llc_retail_opportunities
Tom weldin llc_retail_opportunitiesTom weldin llc_retail_opportunities
Tom weldin llc_retail_opportunitiesTomWeldin
 
TOM WELDIN LLC OBJECTIVES
TOM WELDIN LLC OBJECTIVESTOM WELDIN LLC OBJECTIVES
TOM WELDIN LLC OBJECTIVESTomWeldin
 
Brave New Task: User Account Matching
Brave New Task: User Account MatchingBrave New Task: User Account Matching
Brave New Task: User Account MatchingMediaEval2012
 
Tom weldin llc_retail_opportunities
Tom weldin llc_retail_opportunitiesTom weldin llc_retail_opportunities
Tom weldin llc_retail_opportunitiesTomWeldin
 
Overview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskOverview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskMediaEval2012
 
Tom Weldin Llc Retail Opportunities
Tom Weldin Llc Retail OpportunitiesTom Weldin Llc Retail Opportunities
Tom Weldin Llc Retail OpportunitiesTomWeldin
 
Ghent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskGhent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskMediaEval2012
 
Tom weldin llc objectives
Tom weldin llc objectivesTom weldin llc objectives
Tom weldin llc objectivesTomWeldin
 
Overview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskOverview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskMediaEval2012
 

En vedette (9)

Tom weldin llc_retail_opportunities
Tom weldin llc_retail_opportunitiesTom weldin llc_retail_opportunities
Tom weldin llc_retail_opportunities
 
TOM WELDIN LLC OBJECTIVES
TOM WELDIN LLC OBJECTIVESTOM WELDIN LLC OBJECTIVES
TOM WELDIN LLC OBJECTIVES
 
Brave New Task: User Account Matching
Brave New Task: User Account MatchingBrave New Task: User Account Matching
Brave New Task: User Account Matching
 
Tom weldin llc_retail_opportunities
Tom weldin llc_retail_opportunitiesTom weldin llc_retail_opportunities
Tom weldin llc_retail_opportunities
 
Overview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskOverview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy Task
 
Tom Weldin Llc Retail Opportunities
Tom Weldin Llc Retail OpportunitiesTom Weldin Llc Retail Opportunities
Tom Weldin Llc Retail Opportunities
 
Ghent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing TaskGhent and Cardiff University at the 2012 Placing Task
Ghent and Cardiff University at the 2012 Placing Task
 
Tom weldin llc objectives
Tom weldin llc objectivesTom weldin llc objectives
Tom weldin llc objectives
 
Overview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskOverview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging Task
 

Similaire à Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene Detection Task

Mphil Transfer
Mphil TransferMphil Transfer
Mphil Transferspachoud
 
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...CSCJournals
 
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...Academia Sinica
 
Deep learning-based switchable network for in-loop filtering in high efficie...
Deep learning-based switchable network for in-loop filtering in  high efficie...Deep learning-based switchable network for in-loop filtering in  high efficie...
Deep learning-based switchable network for in-loop filtering in high efficie...IJECEIAES
 
CNNs and Fisher Vectors for No-Audio Multimodal Speech Detection
 CNNs and Fisher Vectors for No-Audio Multimodal Speech Detection CNNs and Fisher Vectors for No-Audio Multimodal Speech Detection
CNNs and Fisher Vectors for No-Audio Multimodal Speech Detectionmultimediaeval
 
OPTE: Online Per-title Encoding for Live Video Streaming
OPTE: Online Per-title Encoding for Live Video StreamingOPTE: Online Per-title Encoding for Live Video Streaming
OPTE: Online Per-title Encoding for Live Video StreamingAlpen-Adria-Universität
 
OPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdfOPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdfVignesh V Menon
 
VCIP_MCBE_presentation.pdf
VCIP_MCBE_presentation.pdfVCIP_MCBE_presentation.pdf
VCIP_MCBE_presentation.pdfVignesh V Menon
 
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...Alpen-Adria-Universität
 
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...ijtsrd
 
EDGE-Net: Efficient Deep-learning Gradients Extraction Network
EDGE-Net: Efficient Deep-learning Gradients Extraction NetworkEDGE-Net: Efficient Deep-learning Gradients Extraction Network
EDGE-Net: Efficient Deep-learning Gradients Extraction Networkgerogepatton
 
EDGE-Net: Efficient Deep-learning Gradients Extraction Network
EDGE-Net: Efficient Deep-learning Gradients Extraction NetworkEDGE-Net: Efficient Deep-learning Gradients Extraction Network
EDGE-Net: Efficient Deep-learning Gradients Extraction Networkgerogepatton
 
Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesThomas Zimmermann
 
Common image compression formats
Common image compression formatsCommon image compression formats
Common image compression formatsClyde Lettsome
 
STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...
STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...
STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...kevig
 
Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...
Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...
Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...kevig
 
01_Introduction.pdf.pdf
01_Introduction.pdf.pdf01_Introduction.pdf.pdf
01_Introduction.pdf.pdfWidedMiled2
 

Similaire à Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene Detection Task (20)

Mphil Transfer
Mphil TransferMphil Transfer
Mphil Transfer
 
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...
 
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...
 
Deep learning-based switchable network for in-loop filtering in high efficie...
Deep learning-based switchable network for in-loop filtering in  high efficie...Deep learning-based switchable network for in-loop filtering in  high efficie...
Deep learning-based switchable network for in-loop filtering in high efficie...
 
CNNs and Fisher Vectors for No-Audio Multimodal Speech Detection
 CNNs and Fisher Vectors for No-Audio Multimodal Speech Detection CNNs and Fisher Vectors for No-Audio Multimodal Speech Detection
CNNs and Fisher Vectors for No-Audio Multimodal Speech Detection
 
OPTE: Online Per-title Encoding for Live Video Streaming
OPTE: Online Per-title Encoding for Live Video StreamingOPTE: Online Per-title Encoding for Live Video Streaming
OPTE: Online Per-title Encoding for Live Video Streaming
 
OPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdfOPTE: Online Per-title Encoding for Live Video Streaming.pdf
OPTE: Online Per-title Encoding for Live Video Streaming.pdf
 
VCIP_MCBE_presentation.pdf
VCIP_MCBE_presentation.pdfVCIP_MCBE_presentation.pdf
VCIP_MCBE_presentation.pdf
 
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
Energy-Efficient Multi-Codec Bitrate-Ladder Estimation for Adaptive Video Str...
 
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
 
annInstance28Nov6pm
annInstance28Nov6pmannInstance28Nov6pm
annInstance28Nov6pm
 
A04840107
A04840107A04840107
A04840107
 
EDGE-Net: Efficient Deep-learning Gradients Extraction Network
EDGE-Net: Efficient Deep-learning Gradients Extraction NetworkEDGE-Net: Efficient Deep-learning Gradients Extraction Network
EDGE-Net: Efficient Deep-learning Gradients Extraction Network
 
EDGE-Net: Efficient Deep-learning Gradients Extraction Network
EDGE-Net: Efficient Deep-learning Gradients Extraction NetworkEDGE-Net: Efficient Deep-learning Gradients Extraction Network
EDGE-Net: Efficient Deep-learning Gradients Extraction Network
 
PPT
PPTPPT
PPT
 
Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development Activities
 
Common image compression formats
Common image compression formatsCommon image compression formats
Common image compression formats
 
STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...
STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...
STREAMING PUNCTUATION: A NOVEL PUNCTUATION TECHNIQUE LEVERAGING BIDIRECTIONAL...
 
Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...
Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...
Streaming Punctuation: A Novel Punctuation Technique Leveraging Bidirectional...
 
01_Introduction.pdf.pdf
01_Introduction.pdf.pdf01_Introduction.pdf.pdf
01_Introduction.pdf.pdf
 

Plus de MediaEval2012

MediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval2012
 
A Multimodal Approach for Video Geocoding
A Multimodal Approach for   Video Geocoding A Multimodal Approach for   Video Geocoding
A Multimodal Approach for Video Geocoding MediaEval2012
 
Brave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music TaggingBrave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music TaggingMediaEval2012
 
Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012MediaEval2012
 
CUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskCUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskMediaEval2012
 
DCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking TaskDCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking TaskMediaEval2012
 
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...MediaEval2012
 
The CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsThe CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsMediaEval2012
 
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval2012
 
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...
MediaEval 2012 Visual Privacy Task: Applying Transform-domain Scrambling to A...MediaEval2012
 
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...MediaEval2012
 
The MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioThe MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioMediaEval2012
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskMediaEval2012
 
LIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodLIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodMediaEval2012
 
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...MediaEval2012
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...MediaEval2012
 
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...
The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Tr...MediaEval2012
 
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskUNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskMediaEval2012
 


Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene Detection Task

  • 1. Technicolor / INRIA / Imperial College London at the MediaEval 2012 Violent Scene Detection Task
      PENET Cédric – Technicolor, INRIA
      DEMARTY Claire-Hélène – Technicolor
      SOLEYMANI Mohammad – Imperial College London
      GRAVIER Guillaume – CNRS, IRISA
      GROS Patrick – INRIA
      MediaEval 2012 Pisa Workshop, October 4th, 2012
  • 2. Outline
      • Introduction
      • Systems description
      • Results and conclusion
  • 4. Introduction
      Joint effort between Technicolor / INRIA / Imperial College London
      • 5 runs, 5 different systems
          • Re-use of last year's systems with few differences
              • Bayesian networks structure learning (Technicolor/INRIA)
              • Naive Bayesian classifier (ICL)
          • Two new systems from Technicolor/INRIA
              • Exploiting similarity
              • Bag-of-Audio words
          • Fusion of three systems (Technicolor/INRIA – ICL)
  • 5. Outline
      • Introduction
      • Systems description
      • Results and conclusion
  • 6. Run 1: Exploiting Similarity
      • Idea: can we get the same results as last year using only similarity measures?
      • Video features for each frame
          • Motion activity
          • Three color harmonisation features: harmonisation template, angle and energy
      • Decision: KNN using only the closest neighbour
          • 10 movies used to populate the KNN
          • Test frames labelled according to their closest neighbour
          • If at least 1 frame of a shot is labelled violent, the shot is labelled violent
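The decision rule of Run 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Euclidean distance, the four-dimensional feature layout, the function names `nearest_label` and `label_shots`, and all feature values are assumptions made for the example.

```python
import math

def nearest_label(frame, train):
    """Return the label of the training frame closest to `frame`.

    Euclidean distance is an assumption; the slides do not name the metric.
    """
    closest = min(train, key=lambda item: math.dist(frame, item[0]))
    return closest[1]

def label_shots(test_frames, train):
    """Label each shot violent as soon as one of its frames is labelled violent.

    test_frames: list of (shot_id, feature_vector) pairs.
    train: list of (feature_vector, is_violent) pairs.
    """
    shots = {}
    for shot_id, feats in test_frames:
        violent = nearest_label(feats, train)
        shots[shot_id] = shots.get(shot_id, False) or violent
    return shots

# Hypothetical 4-D frame features (motion activity + three harmonisation
# features); the numbers are invented for illustration only.
train = [
    ((0.9, 0.2, 0.1, 0.8), True),
    ((0.1, 0.7, 0.5, 0.2), False),
]
test = [
    ("shot_a", (0.2, 0.6, 0.5, 0.1)),
    ("shot_a", (0.8, 0.3, 0.2, 0.7)),
    ("shot_b", (0.0, 0.8, 0.4, 0.3)),
]
shot_labels = label_shots(test, train)
print(shot_labels)
```

Because a single violent frame flags the whole shot, this rule favours recall over precision at the shot level.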
  • 7. Run 2: Bag-of-Audio words
      • Audio features extraction
          • Extraction of MFCC audio features (with & ) - 20 ms windows, 10 ms overlap
          • Extraction of silence segments with SPro
          • Extraction of coherent audio segments – Andre-Obrecht 1988
      • K-Means on non-silent audio segments for vocabulary (of size 128)
          • Each audio segment replaced by closest centroid
      • Construction of TF-IDF histograms
          • Each shot is a document
      • Classification using SVM
          • χ² and histogram intersection kernels
          • Applied weight on SVM parameter
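The quantisation, TF-IDF and kernel steps of Run 2 can be sketched in plain Python. This is an illustration under stated assumptions: the IDF variant `log(N / df)`, the exponential form of the χ² kernel, the `gamma` parameter, the function names, and the toy centroids and segments are all hypothetical (the slides use a vocabulary of 128 words learned with K-Means, which is not reproduced here).

```python
import math
from collections import Counter

def closest_centroid(vec, centroids):
    """Index of the vocabulary word (centroid) nearest to an audio segment."""
    return min(range(len(centroids)), key=lambda k: math.dist(vec, centroids[k]))

def tfidf_histograms(shots, vocab_size):
    """One TF-IDF histogram per shot; each shot is treated as a document.

    shots: list of lists of word indices.
    """
    n_docs = len(shots)
    # Document frequency: number of shots in which each word occurs.
    df = Counter(w for words in shots for w in set(words))
    hists = []
    for words in shots:
        tf = Counter(words)
        hists.append([
            (tf[w] / len(words)) * math.log(n_docs / df[w]) if tf[w] else 0.0
            for w in range(vocab_size)
        ])
    return hists

def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel (one common variant, assumed here)."""
    d = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * d)

def intersection_kernel(x, y):
    """Histogram intersection kernel."""
    return sum(min(a, b) for a, b in zip(x, y))

# Toy vocabulary of 3 centroids and two shots of 2-D segment features.
centroids = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
segments = [
    [(0.1, 0.0), (0.9, 1.1), (0.2, -0.1)],
    [(5.2, 4.9), (1.0, 0.8)],
]
words = [[closest_centroid(s, centroids) for s in shot] for shot in segments]
hists = tfidf_histograms(words, vocab_size=len(centroids))
gram_chi2 = [[chi2_kernel(a, b) for b in hists] for a in hists]
gram_inter = [[intersection_kernel(a, b) for b in hists] for a in hists]
```

Gram matrices such as `gram_chi2` are the form in which a custom kernel is usually handed to an SVM that accepts precomputed kernels; the SVM training itself is left out of this sketch.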
  • 8. Run 3: Bayesian Networks structure learning
      • Re-use of Technicolor's system from last year with additional features
          • Audio features: energy, asymmetry, centroid, ZCR, flatness and roll-off at 90%
          • Video features: shot length, flashes, blood, activity, color coherence, average luminance, fire and color harmonisation features
          • Features are averaged over a video shot
      • Graphical model for modeling conditional probability distributions, along with contextual features and temporal smoothing
          • Naive Bayesian network (NB)
          • Graph structure learning
              • Forest augmented naive Bayesian network (FAN)
              • K2
      • Late modalities fusion using a simple rule
      • Figure: Bayesian network example. Source: https://controls.engin.umich.edu/wiki/index.php/Bayesian_network_theory
  • 9. Run 4: Naïve Bayesian classifier
      • Audio modality
          • Classical low-level features extracted from non-silent segments
          • RMS energy, pitch, MFCC, ZCR, spectrum flux, spectral roll-off
          • Averaged over shots
      • Video modality
          • Shot duration, luminance, average activity, motion component
          • Averaged over shots
      • Text features
          • Simple features such as the number of spoken words and the average valence and arousal per shot (from the Dictionary of Affect in Language)
          • The results were bad, so we decided not to include them in the final submission
      • A Naïve Bayesian classifier on each modality
          • Modality fusion using a weighted sum of posterior probabilities
          • 0.95 × audio score + 0.05 × visual score
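The modality fusion of Run 4 is a weighted sum of the two per-shot posteriors. A minimal sketch, in which the function name `fuse_weighted` and the example posteriors are hypothetical and only the 0.95 / 0.05 weights come from the slide:

```python
def fuse_weighted(p_audio, p_visual, w_audio=0.95):
    """Weighted sum of per-modality posteriors P(violent | shot).

    The visual weight is taken as 1 - w_audio, which gives 0.05 for the
    default audio weight of 0.95 quoted on the slide.
    """
    return w_audio * p_audio + (1.0 - w_audio) * p_visual

# Invented posteriors for three shots: (audio, visual).
posteriors = [(0.80, 0.20), (0.10, 0.90), (0.50, 0.50)]
fused = [fuse_weighted(a, v) for a, v in posteriors]
print(fused)
```

With these weights the audio classifier dominates: a shot that only the visual classifier considers violent (second example) still receives a low fused score.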
  • 10. Run 5: Systems fusion
      • Simple fusion of three systems
          • Run 2: Bag-of-Audio words
          • Run 3: Bayesian networks structure learning
          • Run 4: Naive Bayesian classifier
      • Fusion by multiplication of probabilities
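Fusion by multiplication can be sketched in a few lines. The function name `fuse_product` and the example probabilities are hypothetical, and the sketch assumes that each system outputs one probability of violence per shot and that the raw product is used without renormalisation, which the slide does not specify.

```python
import math

def fuse_product(probs):
    """Multiply the per-system probabilities of violence for one shot."""
    return math.prod(probs)

# Invented outputs of three systems for two shots.
agreeing = fuse_product([0.9, 0.8, 0.9])      # all three systems say "violent"
one_dissent = fuse_product([0.9, 0.8, 0.1])   # one system disagrees
print(agreeing, one_dissent)
```

A product stays high only when every system agrees, so a single low score vetoes the shot; this behaviour would favour precision over recall.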
  • 11. Outline
      • Introduction
      • Systems description
      • Results and conclusions
  • 12. Results

      Run N°  Technique   MAP@100 (%)  AP-1 (%)  AP-2 (%)  AP-3 (%)  STD (%)  MediaEval Cost
      1       Similarity  13.89        0.00      12.91     28.77     14.41    2.29
      2       BoAW        40.54        10.85     52.98     57.77     25.82    2.50
      3       BN-SL       61.82        60.56     53.15     71.76     9.37     3.57
      4       NBN         46.27        40.03     22.97     75.82     26.97    3.64
      5       Fusion      57.47        64.52     37.21     70.69     17.82    4.60

      • Average Precision (AP) for Dead Poets Society (AP-1), Fight Club (AP-2) and Independence Day (AP-3)
      • STD: standard deviation of AP over the three test movies
      • High variation between movies
      • Best results on Independence Day (similar to Armageddon)
      • Needs more movies to compute MAP
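The MAP@100 and STD columns can be recomputed from the three per-movie AP values. The sketch below assumes that MAP is the plain mean of the three APs and that STD is the sample standard deviation (n - 1 denominator); under those assumptions the recomputed values agree with the reported ones to within rounding.

```python
from statistics import mean, stdev

# Per-movie AP (%) and the reported (MAP@100, STD) from the results slide.
ap = {
    "Similarity": (0.00, 12.91, 28.77),
    "BoAW": (10.85, 52.98, 57.77),
    "BN-SL": (60.56, 53.15, 71.76),
    "NBN": (40.03, 22.97, 75.82),
    "Fusion": (64.52, 37.21, 70.69),
}
reported = {
    "Similarity": (13.89, 14.41),
    "BoAW": (40.54, 25.82),
    "BN-SL": (61.82, 9.37),
    "NBN": (46.27, 26.97),
    "Fusion": (57.47, 17.82),
}

# Mean and sample standard deviation of AP over the three test movies.
recomputed = {name: (mean(v), stdev(v)) for name, v in ap.items()}

for name, (m, s) in recomputed.items():
    rep_map, rep_std = reported[name]
    print(f"{name:10s} MAP {m:6.2f} (reported {rep_map})  STD {s:6.2f} (reported {rep_std})")
```

With only three test movies, a single film moves the mean by a third of its AP, which is why the slide asks for more movies before drawing conclusions from MAP.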
  • 13. Conclusion & perspectives
      • Similarity search
          • MAP is low (13.89%), but MediaEval Cost is one of the best (6th out of 35)
          • Adding features and merging decisions from different KNNs might improve the results
      • Fusion
          • 4th best run overall (out of 35)
          • Results not as good as expected
          • Improves precision at the cost of recall (false alarms reduced by a factor of two)
          • Test smarter fusion techniques
      • Bayesian Networks – Structure Learning
          • 3rd best run overall (out of 35)
          • Very low standard deviation over the three movies
          • Bayesian networks for intermediate concepts
  • 14. Conclusion & perspectives
      • Bag-of-Audio words
          • MAP is reasonable (11th out of 35)
          • False alarms and missed detections are fairly low too
          • Simple tests proved efficient; more investigation needed
      • Naive Bayesian classifier
          • A simple classifier with audio features can achieve moderately good results (10th out of 35)
          • Text features don't work
          • Use a classifier that can learn temporal dynamics
  • 15. Thanks for your attention!