From Unsupervised to
Semi-Supervised Event Detection
Wen-Sheng Chu
Robotics Institute, Carnegie Mellon University
July 9, 2013
1
Jeffrey Cohn, Fernando De la Torre
Outline
1. Unsupervised Temporal Commonality
Discovery
(Chu et al, ECCV’12)
2. Personalized Facial Action Unit Detection
(Chu et al, CVPR’13)
2
Unsupervised Commonality Discovery
in Images
Where are the repeated patterns?
3
(Chu’10, Mukherjee’11, Collins’12)
Unsupervised Commonality Discovery
in Videos?
• We name it Temporal Commonality Discovery (TCD).
• Goal: Given two videos, discover common events in
an unsupervised fashion.
4
TCD is hard!
1) No prior knowledge on commonalities
– We do not know what, where and how many
commonalities exist in the video
2) Exhaustive search is computationally prohibitive
– E.g., two videos with 300 frames have >8,000,000,000
possible matches:
(possible locations × possible lengths) possibilities/sequence,
times another (possible locations × possible lengths) possibilities/sequence
5
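The size of the search space can be checked with a few lines of arithmetic. This is a sketch under an assumption: the slide's ">8,000,000,000" is read here as 300 locations × 300 lengths per sequence, squared for the two sequences. The function names are hypothetical. The second function counts only the subsequences that actually fit inside the video, which gives a smaller but still prohibitive number.

```python
def loose_pair_count(n_frames):
    """n start locations x n lengths per sequence, for two sequences."""
    per_sequence = n_frames * n_frames
    return per_sequence ** 2


def exact_pair_count(n_frames):
    """Contiguous subsequences that fit in n frames: n(n+1)/2 per sequence."""
    per_sequence = n_frames * (n_frames + 1) // 2
    return per_sequence ** 2


loose = loose_pair_count(300)   # 8100000000
exact = exact_pair_count(300)   # 2038522500
```

Either way, evaluating a distance function for every pair is impractical, which motivates the branch-and-bound search.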
Formulation
6
Integer programming!
Optimization: Interpretation
7
Optimization: Naive Search
Complexity
8
Optimization: Branch-and-Bound
• Similar to the idea of ESS (Lampert’08), we search the
space by splitting intervals.
9
Optimization: Branch-and-Bound
• Bounding histogram bins
10
1. Bounding L1 distance:
2. Intersection similarity:
3. χ² distance:
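The formulas for these bounds did not survive in this transcript. As a hedged illustration only, the sketch below shows one valid way to bound the three measures once every histogram bin is known to lie in an interval [lo, hi]. It assumes non-negative bins; the function names are hypothetical and the expressions are not necessarily the ones used in the talk.

```python
def l1_lower(lo_a, hi_a, lo_b, hi_b):
    """Lower bound on sum_k |a_k - b_k| when lo <= a, b <= hi bin-wise."""
    return sum(max(0.0, la - hb, lb - ha)
               for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b))


def intersection_upper(hi_a, hi_b):
    """Upper bound on the intersection similarity sum_k min(a_k, b_k)."""
    return sum(min(ha, hb) for ha, hb in zip(hi_a, hi_b))


def chi2_lower(lo_a, hi_a, lo_b, hi_b):
    """Lower bound on sum_k (a_k - b_k)^2 / (a_k + b_k)."""
    total = 0.0
    for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b):
        if la > hb:      # bin of a is always above bin of b
            total += (la - hb) ** 2 / (la + hb)
        elif lb > ha:    # bin of b is always above bin of a
            total += (lb - ha) ** 2 / (lb + ha)
        # overlapping intervals contribute 0: the bins may coincide
    return total
```

When the lower and upper bounds of every bin coincide, each function returns the exact value of its measure, which is the property a branch-and-bound search needs at its leaves.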
Optimization: Branch-and-Bound
11
Unlikely
search
regions
(B1,E1,B2,E2; -10)
Searching Structure
(B1,E1,B2,E2; 32)
Priority queue
(sorted by bound scores)
…
(B1,E1,B2,E2; -50)
(B1,E1,B2,E2; -105)
State S = (Rectangle set; score)
12
(B1,E1,B2,E2; -105)
Algorithm
(B1,E1,B2,E2; 32)
Priority queue
(sorted by bound scores)
…
(B1,E1,B2,E2; -50)
(B1,E1,B2,E2; -105)
Top state
1. Pop out
the top state
2. Split
13
(B1,E1,B2,E2; -105)
Algorithm
(B1,E1,B2,E2; 32)
Priority queue
(sorted by bound scores)
…
(B1,E1,B2,E2; -50)
Top state
(B1,E’1,B2,E2; -76)
(B1,E’’1,B2,E2; -61)
3. Compute
bounding scores
4. Push back the
split states
14
Algorithm
(B1,E1,B2,E2; 32)
Priority queue
(sorted by bound scores)
…
(B1,E1,B2,E2; -50)
Top state
(B1,E’1,B2,E2; -76)
(B1,E’’1,B2,E2; -61)
• The algorithm stops when
the top state contains a
unique rectangle.
Omit most of the
search space with
large distances
15
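The pop / split / bound / push loop above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes each frame carries a discrete label (for example a visual-word index), that subsequences are compared by the L1 distance between normalized label histograms, and that a minimum length `min_len` rules out trivial matches. All function names and `min_len` are hypothetical.

```python
import heapq


def prefix_counts(labels, n_bins):
    """prefix[t][k] = number of frames with label k among the first t frames."""
    prefix = [[0] * n_bins]
    for label in labels:
        row = prefix[-1][:]
        row[label] += 1
        prefix.append(row)
    return prefix


def counts(prefix, b, e):
    """Bin counts of frames b..e (inclusive); all zeros if the span is empty."""
    if b > e:
        return [0] * len(prefix[0])
    return [hi - lo for hi, lo in zip(prefix[e + 1], prefix[b])]


def bin_bounds(prefix, b_rng, e_rng, min_len):
    """Per-bin (lower, upper) bounds on the normalized histogram of every
    interval [b, e] with b in b_rng, e in e_rng and length >= min_len."""
    (b_lo, b_hi), (e_lo, e_hi) = b_rng, e_rng
    longest = e_hi - b_lo + 1
    if longest < min_len:
        return None  # no feasible interval in this set
    shortest = max(min_len, e_lo - b_hi + 1)
    lower = [c / longest for c in counts(prefix, b_hi, e_lo)]
    upper = [min(1.0, c / shortest) for c in counts(prefix, b_lo, e_hi)]
    return lower, upper


def tcd_branch_and_bound(seq_a, seq_b, n_bins, min_len):
    """Return (distance, (b1, e1, b2, e2)) minimizing the L1 histogram distance."""
    pa, pb = prefix_counts(seq_a, n_bins), prefix_counts(seq_b, n_bins)

    def lower_bound(state):
        b1, e1, b2, e2 = state
        bounds_a = bin_bounds(pa, b1, e1, min_len)
        bounds_b = bin_bounds(pb, b2, e2, min_len)
        if bounds_a is None or bounds_b is None:
            return None
        (lo_a, hi_a), (lo_b, hi_b) = bounds_a, bounds_b
        return sum(max(0.0, la - hb, lb - ha)
                   for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b))

    full_a, full_b = (0, len(seq_a) - 1), (0, len(seq_b) - 1)
    root = (full_a, full_a, full_b, full_b)  # ranges for B1, E1, B2, E2
    root_bound = lower_bound(root)
    heap = [] if root_bound is None else [(root_bound, root)]
    while heap:
        bound, state = heapq.heappop(heap)            # 1. pop the top state
        widths = [hi - lo for lo, hi in state]
        widest = max(range(4), key=widths.__getitem__)
        if widths[widest] == 0:                       # unique rectangle: done
            return bound, tuple(lo for lo, _ in state)
        lo, hi = state[widest]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):       # 2. split
            child = state[:widest] + (part,) + state[widest + 1:]
            child_bound = lower_bound(child)          # 3. bound
            if child_bound is not None:
                heapq.heappush(heap, (child_bound, child))  # 4. push back
    return None
```

A call such as `tcd_branch_and_bound([0, 0, 1, 1], [1, 1, 0, 0], n_bins=2, min_len=2)` returns the best distance and one rectangle achieving it. Because the bound is exact for a single rectangle and never exceeds the true distance of any rectangle in a set, the first unique rectangle popped is a global minimizer.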
Compare with Relevant Work
1. Difference between TCD and ESS [1]/STBB [2]
– Different learning framework:
• Unsupervised vs. Supervised
– New bounding functions for TCD
2. Difference between TCD and [3]
– Different objective:
• Commonality Discovery vs. Temporal Clustering
[1] “Efficient subwindow search: A branch and bound framework for object
localization”, PAMI 2009.
[2] “Discriminative video pattern search for efficient action detection”, PAMI 2011.
16
Experiment (1): Synthesized Sequence
Histograms of the discovered
pair of subsequences
17
Experiment (2):
Discover Common Facial Actions
• RU-FACS dataset*
– Interview videos with 29 subjects
– 5000~8000 frames/video
– Collect 100 segments containing smiling mouths (AU-12)
– Evaluate in terms of average precision
18
* “Automatic recognition of facial actions in spontaneous expressions”, Journal of
Multimedia 2006.
Experiment (2):
Discover Common Facial Actions
19
• Parametric settings for Sliding Windows (SW)
• Log of #evaluations: $\log \frac{n_{TCD}}{n_{SW_i}}$
• Quality of discovered patterns: $d(r_{SW_i}) - d(r_{TCD})$
Experiment (2): Speed Evaluation
Speed: #evaluations of the distance function
20
Experiment (2):
Discover Common Facial Actions
• Compare with LCCS* on -distance
21
* “Frame-level temporal calibration of unsynchronized cameras by using Longest
Consecutive Common Subsequence”, ICASSP 2009.
Experiment (3): Discover
Multiple Common Human Motions
• CMU-Mocap dataset:
– http://mocap.cs.cmu.edu/
• 15 sequences from Subject 86
• 1200~2600 frames and up to 10 actions/seq
• Exclude the comparison with SW because it
needs >10^12 evaluations
22
Experiment (3): Discover
Multiple Common Human Motions
23
Experiment (3): Discover
Multiple Common Human Motions
• Compare with LCCS* on -distance
24
Extension: Video Indexing
• Goal: Given a query , find the best common
subsequence in the target video
• A straightforward extension:
Temporal
Search
Space
25
A Prototype for Video Indexing
26
Summary
27
Questions?
[1] “Common Visual Pattern Discovery via Spatially Coherent
Correspondences,” In CVPR 2010.
[2] “MOMI-cosegmentation: simultaneous segmentation of multiple objects
among multiple images,” In ACCV 2010.
[3] “Scale invariant cosegmentation for image groups,” In CVPR 2011.
[4] “Random walks based multi-image segmentation: Quasiconvexity results
and GPU-based solutions,” In CVPR 2012.
[5] “Frame-level temporal calibration of unsynchronized cameras by using
Longest Consecutive Common Subsequence,” In ICASSP 2009.
[6] “Efficient ESS with submodular score functions,” In CVPR 2011.
28
http://humansensing.cs.cmu.edu/wschu/
Outline
1. Unsupervised Temporal Commonality
Discovery
(Chu et al, ECCV’12)
2. Selective Transfer Machine for Personalized
Facial Action Unit Detection
(Chu et al, CVPR’13)
29
AU 6+12
Facial Action Units (AU)
30
Main Idea
31
Related Work: Features
32
Related Work: Classifiers
33
Feature Bias
Person specific!
34
Occurrence Bias
35
Selective Transfer Machine (STM)
Formulation
Maximize margin of penalized SVM
Minimize distribution mismatch
36
Goal (1): Maximize penalized SVM margin
margin
penalized loss
37
Goal (2): Minimize Distribution Mismatch
• Kernel Mean Matching (KMM)*
38
* “Covariate shift by kernel mean matching”, Dataset shift in machine learning, 2009.
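Kernel Mean Matching reweights training samples so that their weighted mean in a kernel feature space approaches the mean of the test samples. The sketch below is a simplified illustration, not the formulation used in the talk: it assumes an RBF kernel, keeps only the box constraint 0 ≤ s_i ≤ `upper`, drops the usual constraint on the sum of the weights, and solves the resulting quadratic program by projected gradient descent. The names `kmm_weights`, `kmm_objective`, `gamma`, and `upper` are hypothetical.

```python
import math


def rbf(x, y, gamma):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kmm_objective(s, train, test, gamma=1.0):
    """Squared kernel-space distance between the s-weighted training mean
    and the test mean."""
    n_tr, n_te = len(train), len(test)
    tr_tr = sum(s[i] * s[j] * rbf(train[i], train[j], gamma)
                for i in range(n_tr) for j in range(n_tr))
    tr_te = sum(s[i] * rbf(train[i], z, gamma)
                for i in range(n_tr) for z in test)
    te_te = sum(rbf(z, v, gamma) for z in test for v in test)
    return tr_tr / n_tr ** 2 - 2.0 * tr_te / (n_tr * n_te) + te_te / n_te ** 2


def kmm_weights(train, test, gamma=1.0, upper=5.0, iters=500):
    """Minimize 0.5 * s'Ks - kappa's subject to 0 <= s_i <= upper."""
    n_tr, n_te = len(train), len(test)
    K = [[rbf(xi, xj, gamma) for xj in train] for xi in train]
    kappa = [n_tr / n_te * sum(rbf(xi, z, gamma) for z in test)
             for xi in train]
    # trace(K) is at least the largest eigenvalue, so this step size is safe
    step = 1.0 / sum(K[i][i] for i in range(n_tr))
    s = [1.0] * n_tr
    for _ in range(iters):
        grad = [sum(K[i][j] * s[j] for j in range(n_tr)) - kappa[i]
                for i in range(n_tr)]
        s = [min(upper, max(0.0, si - step * g)) for si, g in zip(s, grad)]
    return s
```

Training points that lie close to the test data should receive larger weights than points far from it, which is the "selection by reweighting" idea.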
Goal (2): Minimize Distribution Mismatch
Groundtruth
Bad estimator
for testing data!
39
Better fitting!
Groundtruth
Selection by reweighting
training data
40
Goal (2): Minimize Distribution Mismatch
41
42
Optimization: Alternate Convex Search
43
Optimization: Alternate Convex Search
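Alternate convex search handles a biconvex objective by fixing one block of variables, solving the convex problem in the other block, and then swapping roles. The toy problem below is not the STM objective. It is a hypothetical stand-in with the same flavor (a model parameter `w` and per-sample weights `s`): minimize Σ s_i (y_i − w x_i)² + λ Σ (s_i − 1)² with 0 ≤ s_i ≤ `upper`. Both sub-problems have closed-form solutions here.

```python
def toy_objective(w, s, xs, ys, lam):
    """sum_i s_i (y_i - w x_i)^2 + lam * sum_i (s_i - 1)^2."""
    fit = sum(si * (y - w * x) ** 2 for si, x, y in zip(s, xs, ys))
    reg = lam * sum((si - 1.0) ** 2 for si in s)
    return fit + reg


def alternate_convex_search(xs, ys, lam=50.0, upper=2.0, iters=20):
    """Alternate exact minimization over w (s fixed) and s (w fixed)."""
    s = [1.0] * len(xs)
    w = 0.0
    history = []
    for _ in range(iters):
        # Fix s, solve for w: weighted least squares through the origin.
        denom = sum(si * x * x for si, x in zip(s, xs))
        if denom > 0:
            w = sum(si * x * y for si, x, y in zip(s, xs, ys)) / denom
        # Fix w, solve for s: separable 1-D quadratics, clipped to the box.
        s = [min(upper, max(0.0, 1.0 - (y - w * x) ** 2 / (2.0 * lam)))
             for x, y in zip(xs, ys)]
        history.append(toy_objective(w, s, xs, ys, lam))
    return w, s, history
```

Each half-step minimizes the objective exactly over its block, so the recorded objective values never increase. On data with one outlier, the outlier's weight is driven toward zero and `w` is fitted mostly from the remaining samples.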
Compare with Relevant Work
44
[1] "Covariate shift by kernel mean matching," Dataset shift in machine
learning, 2009.
[2] "Transductive inference for text classification using support vector
machines," In ICML 1999.
[3] "Domain adaptation problems: A DASVM classification technique and a
circular validation strategy," PAMI 2010.
Experiments
• Features
– SIFT descriptors on 49 facial landmarks
– Preserve 98% energy using PCA
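One common reading of "preserve 98% energy" is to keep the smallest number of principal components whose eigenvalues account for at least 98% of the total variance. The helper below illustrates that reading only. It assumes the covariance eigenvalues have already been computed, and the function name is hypothetical.

```python
def components_for_energy(eigenvalues, energy=0.98):
    """Smallest k such that the top-k eigenvalues hold >= `energy` of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running >= energy * total:
            return k
    return len(ordered)


k = components_for_energy([50.0, 30.0, 15.0, 4.0, 1.0])  # cumulative: 50, 80, 95, 99
```

With these example eigenvalues, the first three components hold 95% of the total and the first four hold 99%, so four components are kept.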
45
Dataset      #Subjects  #Videos  #Frames/video  Content
CK+          123        593      ~20            Neutral to peak
GEMEP-FERA   7          87       20~60          Acting
RU-FACS      29         29       5000~7500      Interview
Experiment (1): Synthetic Data
46
• Two protocols
– PS1: train/test are separate data of the same subject
– PS2: training subjects include the test subject (same protocol as in [2])
• GEMEP-FERA
Experiment (2): Comparison with Person-specific (PS) Classifiers
47
Experiment (2): Selection Ability of STM
48
• 123 subjects, 593 videos, ~20 frames/video
Experiment (3): CK+
49
Experiment (4): GEMEP-FERA
50
• 7 subjects, 87 videos, 20~60 frames/video
• 29 subjects, 29 videos, 5000~7000 frames/vid
Experiment (5): RU-FACS
51
Summary
• Person-specific biases exist among face-
related problems, esp. facial expression
• We propose to alleviate the biases by
personalizing classifiers using STM
• Next
– Joint optimization in terms of
– Reduce the memory cost using SMO
– Explore more potential biases in face problems,
e.g., occurrence bias
52
Questions?
[1] "Covariate shift by kernel mean matching," Dataset shift in machine
learning, 2009.
[2] "Transductive inference for text classification using support vector
machines," In ICML 1999.
[3] "Domain adaptation problems: A DASVM classification technique and a
circular validation strategy," PAMI 2010.
[4] “Integrating structured biological data by kernel maximum mean
discrepancy”, Bioinformatics 2006.
[5] “Meta-analysis of the first facial expression recognition challenge,” IEEE
Trans. on Systems, Man, and Cybernetics, Part B, 2012.
53
http://humansensing.cs.cmu.edu/wschu/

 

Dernier (20)

Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

From Unsupervised to Semi-Supervised Event Detection

  • 1. From Unsupervised to Semi-Supervised Event Detection. Wen-Sheng Chu, Robotics Institute, Carnegie Mellon University. July 9, 2013. Jeffrey Cohn, Fernando De la Torre.
  • 2. Outline: 1. Unsupervised Temporal Commonality Discovery (Chu et al., ECCV’12); 2. Personalized Facial Action Unit Detection (Chu et al., CVPR’13).
  • 3. Unsupervised Commonality Discovery in Images. Where are the repeated patterns? (Chu’10, Mukherjee’11, Collins’12)
  • 4. Unsupervised Commonality Discovery in Videos? • We name it Temporal Commonality Discovery (TCD). • Goal: Given two videos, discover common events in an unsupervised fashion.
  • 5. TCD is hard! 1) No prior knowledge of commonalities – We do not know what, where, and how many commonalities exist in the videos. 2) Exhaustive search is computationally prohibitive – E.g., two videos with 300 frames each have >8,000,000,000 possible matches: (possible locations × possible lengths) possibilities per sequence, multiplied by the possibilities of the other sequence.
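
The size of the exhaustive search space on slide 5 can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the slide counts every (start location, length) combination per sequence; the function names are hypothetical and not from the talk. A tighter count that keeps only subsequences fitting inside the video is shown for comparison.

```python
def loose_pair_count(n_frames_a, n_frames_b):
    """Count (start, length) combinations per sequence, then pair them.

    Assumption: every start location is combined with every length,
    including combinations that would run past the end of the video.
    """
    return (n_frames_a * n_frames_a) * (n_frames_b * n_frames_b)


def exact_pair_count(n_frames_a, n_frames_b):
    """Count only subsequences that fit inside each video: n(n+1)/2 each."""
    subseq_a = n_frames_a * (n_frames_a + 1) // 2
    subseq_b = n_frames_b * (n_frames_b + 1) // 2
    return subseq_a * subseq_b


print(loose_pair_count(300, 300))  # 8100000000
print(exact_pair_count(300, 300))  # 2038522500
```

Either way the count grows with the fourth power of the video length, which is what makes a naive search impractical.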
  • 9. Optimization: Branch-and-Bound • Similar to the idea of ESS (Lampert’08), we search the space by splitting intervals.
  • 11. Optimization: Branch-and-Bound. Bounds for three histogram measures: 1. L1 distance; 2. Intersection similarity; 3. χ² distance.
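
For reference, the three histogram measures named on slide 11 can be written as below. This is a sketch using common textbook definitions, not necessarily the exact forms or the bounds used in the talk. In particular, χ² conventions differ by a constant factor, and the version here omits the 1/2 factor some authors include.

```python
def l1_distance(h1, h2):
    """Sum of absolute bin differences."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def intersection_similarity(h1, h2):
    """Sum of bin-wise minima (larger means more similar)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def chi2_distance(h1, h2):
    """Chi-square distance; bins that are empty in both histograms are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


h1, h2 = [2, 0, 1], [1, 1, 1]
print(l1_distance(h1, h2), intersection_similarity(h1, h2), chi2_distance(h1, h2))
```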
  • 12. Searching Structure. State S = (Rectangle set; score). Priority queue (sorted by bound scores): (B1,E1,B2,E2; -105), (B1,E1,B2,E2; -50), …, (B1,E1,B2,E2; -10), (B1,E1,B2,E2; 32). States with poor bound scores mark unlikely search regions.
  • 13. Algorithm. Priority queue (sorted by bound scores): (B1,E1,B2,E2; -105), (B1,E1,B2,E2; -50), …, (B1,E1,B2,E2; 32). 1. Pop out the top state (B1,E1,B2,E2; -105). 2. Split it.
  • 14. Algorithm. The popped state (B1,E1,B2,E2; -105) is split into (B1,E’1,B2,E2; -76) and (B1,E’’1,B2,E2; -61); the queue still holds (B1,E1,B2,E2; -50), …, (B1,E1,B2,E2; 32). 3. Compute bounding scores. 4. Push the split states back.
  • 15. Algorithm. Priority queue (sorted by bound scores): (B1,E’1,B2,E2; -76), (B1,E’’1,B2,E2; -61), (B1,E1,B2,E2; -50), …, (B1,E1,B2,E2; 32). • The algorithm stops when the top state contains a unique rectangle. • Most of the search space with large distances is omitted.
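
The pop, split, bound, and push loop of slides 12 to 15 can be sketched as a best-first search over interval states. This is an illustrative sketch under stated assumptions, not the authors' implementation: frames are assumed to be pre-quantized into integer labels, the score is the L1 distance between unnormalized label histograms, and a hypothetical `min_len` constraint rules out trivially short matches. The bound uses the smallest and largest subsequence a state can contain to bracket each histogram bin.

```python
import heapq
from itertools import count


def prefix_counts(labels, n_bins):
    """cum[t][k] = number of frames with label k among labels[:t]."""
    cum = [[0] * n_bins]
    for lab in labels:
        row = cum[-1][:]
        row[lab] += 1
        cum.append(row)
    return cum


def hist(cum, b, e):
    """Histogram of frames b..e inclusive (all zeros if b > e)."""
    if b > e:
        return [0] * len(cum[0])
    return [hi - lo for hi, lo in zip(cum[e + 1], cum[b])]


def l1_lower_bound(cum1, cum2, state):
    """Lower bound on the L1 distance over every rectangle in the state."""
    (b1lo, b1hi), (e1lo, e1hi), (b2lo, b2hi), (e2lo, e2hi) = state
    lo1, up1 = hist(cum1, b1hi, e1lo), hist(cum1, b1lo, e1hi)
    lo2, up2 = hist(cum2, b2hi, e2lo), hist(cum2, b2lo, e2hi)
    return sum(max(0, l1 - u2, l2 - u1)
               for l1, u1, l2, u2 in zip(lo1, up1, lo2, up2))


def feasible(state, min_len):
    """True if the state holds at least one pair of long-enough subsequences."""
    (b1lo, _), (_, e1hi), (b2lo, _), (_, e2hi) = state
    return e1hi - b1lo + 1 >= min_len and e2hi - b2lo + 1 >= min_len


def tcd_branch_and_bound(seq1, seq2, n_bins, min_len):
    cum1, cum2 = prefix_counts(seq1, n_bins), prefix_counts(seq2, n_bins)
    root = ((0, len(seq1) - 1),) * 2 + ((0, len(seq2) - 1),) * 2
    if not feasible(root, min_len):
        return None
    tie = count()  # tie-breaker so the heap never compares states
    heap = [(l1_lower_bound(cum1, cum2, root), next(tie), root)]
    while heap:
        bound, _, state = heapq.heappop(heap)      # 1. pop the top state
        widths = [hi - lo for lo, hi in state]
        if max(widths) == 0:                       # unique rectangle: done
            (b1, _), (e1, _), (b2, _), (e2, _) = state
            return bound, (b1, e1), (b2, e2)
        i = widths.index(max(widths))              # 2. split the widest interval
        lo, hi = state[i]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):
            child = state[:i] + (part,) + state[i + 1:]
            if feasible(child, min_len):           # 3. bound, 4. push back
                heapq.heappush(
                    heap, (l1_lower_bound(cum1, cum2, child), next(tie), child))
    return None


seq_a = [0, 0, 1, 2, 1, 0, 0]
seq_b = [3, 3, 3, 1, 2, 1, 3]
print(tcd_branch_and_bound(seq_a, seq_b, n_bins=4, min_len=3))
```

Because the bound never exceeds the true distance of any rectangle in a state and becomes exact for a unique rectangle, the first unique rectangle popped is a global minimizer under these assumptions. Python's `heapq` is a min-heap, which matches the slides' convention of keeping the lowest score at the top.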
  • 16. Comparison with Related Work 1. Difference between TCD and ESS [1]/STBB [2] – Different learning framework: • Unsupervised vs. supervised – New bounding functions for TCD 2. Difference between TCD and [3] – Different objective: • Commonality discovery vs. temporal clustering [1] “Efficient subwindow search: A branch and bound framework for object localization”, PAMI 2009. [2] “Discriminative video pattern search for efficient action detection”, PAMI 2011.
  • 17. Experiment (1): Synthesized Sequence. Histograms of the discovered pair of subsequences.
  • 18. Experiment (2): Discover Common Facial Actions • RU-FACS dataset* – Interview videos with 29 subjects – 5000~8000 frames/video – Collected 100 segments containing smiling mouths (AU-12) – Evaluated in terms of average precision * “Automatic recognition of facial actions in spontaneous expressions”, Journal of Multimedia 2006.
  • 19. Experiment (2): Discover Common Facial Actions
  • 20. Experiment (2): Speed Evaluation • Parametric settings for Sliding Windows (SW) • Speed: #evaluations of the distance function • Log of #evaluations: $\log \frac{n_{TCD}}{n_{SW_i}}$ • Quality of discovered patterns: $d(r_{SW_i}) - d(r_{TCD})$
  • 21. Experiment (2): Discover Common Facial Actions • Compare with LCCS* on -distance * “Frame-level temporal calibration of unsynchronized cameras by using Longest Consecutive Common Subsequence”, ICASSP 2009.
  • 22. Experiment (3): Discover Multiple Common Human Motions • CMU-Mocap dataset: – http://mocap.cs.cmu.edu/ • 15 sequences from Subject 86 • 1200~2600 frames and up to 10 actions/sequence • The comparison with SW is excluded because it needs >10^12 evaluations
  • 23. Experiment (3): Discover Multiple Common Human Motions
  • 24. Experiment (3): Discover Multiple Common Human Motions • Compare with LCCS* on -distance
  • 25. Extension: Video Indexing • Goal: Given a query, find the best common subsequence in the target video • A straightforward extension: Temporal Search Space
  • 26. A Prototype for Video Indexing
  • 28. Questions? [1] “Common Visual Pattern Discovery via Spatially Coherent Correspondences,” In CVPR 2010. [2] “MOMI-cosegmentation: simultaneous segmentation of multiple objects among multiple images,” In ACCV 2010. [3] “Scale invariant cosegmentation for image groups,” In CVPR 2011. [4] “Random walks based multi-image segmentation: Quasiconvexity results and GPU-based solutions,” In CVPR 2012. [5] “Frame-level temporal calibration of unsynchronized cameras by using Longest Consecutive Common Subsequence,” In ICASSP 2009. [6] “Efficient ESS with submodular score functions,” In CVPR 2011. http://humansensing.cs.cmu.edu/wschu/
  • 29. Outline: 1. Unsupervised Temporal Commonality Discovery (Chu et al., ECCV’12); 2. Selective Transfer Machine for Personalized Facial Action Unit Detection (Chu et al., CVPR’13).
  • 30. Facial Action Units (AU). Example: AU 6+12.
  • 36. Selective Transfer Machine (STM) Formulation: (1) maximize the margin of a penalized SVM; (2) minimize distribution mismatch.
  • 37. Goal (1): Maximize penalized SVM margin. Terms: margin; penalized loss.
  • 38. Goal (2): Minimize Distribution Mismatch • Kernel Mean Matching (KMM)* * “Covariate shift by kernel mean matching”, Dataset shift in machine learning, 2009.
  • 39. Goal (2): Minimize Distribution Mismatch. Groundtruth. Bad estimator for testing data!
  • 40. Goal (2): Minimize Distribution Mismatch. Groundtruth. Selection by reweighting training data. Better fitting!
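
The idea behind Goal (2) can be illustrated by measuring how far the weighted mean of the training samples is from the mean of the test samples in a kernel feature space. This is a minimal sketch and not the STM or KMM solver: it only evaluates the mismatch for given weights, whereas KMM solves a quadratic program to find them. The scalar features, the Gaussian kernel, and all names below are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel on scalar features."""
    return math.exp(-((x - z) ** 2) / (2.0 * sigma ** 2))


def kernel_mean_mismatch(train, test, weights, sigma=1.0):
    """Squared distance between the weighted training mean and the test mean
    in the kernel feature space, expanded with the kernel trick."""
    n_tr, n_te = len(train), len(test)
    tr_tr = sum(si * sj * gaussian_kernel(xi, xj, sigma)
                for xi, si in zip(train, weights)
                for xj, sj in zip(train, weights)) / n_tr ** 2
    tr_te = sum(si * gaussian_kernel(xi, z, sigma)
                for xi, si in zip(train, weights)
                for z in test) / (n_tr * n_te)
    te_te = sum(gaussian_kernel(z1, z2, sigma)
                for z1 in test for z2 in test) / n_te ** 2
    return tr_tr - 2.0 * tr_te + te_te


train = [0.0, 0.1, 5.0, 5.1]   # two training samples resemble the test data
test = [0.0, 0.05, 0.1]
uniform = kernel_mean_mismatch(train, test, [1, 1, 1, 1])
selective = kernel_mean_mismatch(train, test, [2, 2, 0, 0])
print(uniform, selective)
```

Putting the weight on the training samples that resemble the test data gives a much smaller mismatch than uniform weights, which is the selection-by-reweighting effect described on slide 40.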
  • 44. Comparison with Related Work [1] "Covariate shift by kernel mean matching," Dataset shift in machine learning, 2009. [2] "Transductive inference for text classification using support vector machines," In ICML 1999. [3] "Domain adaptation problems: A DASVM classification technique and a circular validation strategy," PAMI 2010.
  • 45. Experiments • Features – SIFT descriptors on 49 facial landmarks – Preserve 98% energy using PCA • Datasets:
        Dataset      #Subjects  #Videos  #Frames/video  Content
        CK+          123        593      ~20            Neutral to peak
        GEMEP-FERA   7          87       20~60          Acting
        RU-FACS      29         29       5000~7500      Interview
  • 47. Experiment (2): Comparison with Person-specific (PS) Classifiers • Dataset: GEMEP-FERA • Two protocols – PS1: training and test data are separate data of the same subject – PS2: training subjects include the test subject (same protocol as in [2])
  • 48. Experiment (2): Selection Ability of STM
  • 49. Experiment (3): CK+ • 123 subjects, 593 videos, ~20 frames/video
  • 50. Experiment (4): GEMEP-FERA • 7 subjects, 87 videos, 20~60 frames/video
  • 51. Experiment (5): RU-FACS • 29 subjects, 29 videos, 5000~7000 frames/video
  • 52. Summary • Person-specific biases exist among face-related problems, esp. facial expression • We propose to alleviate the biases by personalizing classifiers using STM • Next – Joint optimization in terms of – Reduce the memory cost using SMO – Explore more potential biases in face problems, e.g., occurrence bias
  • 53. Questions? [1] "Covariate shift by kernel mean matching," Dataset shift in machine learning, 2009. [2] "Transductive inference for text classification using support vector machines," In ICML 1999. [3] "Domain adaptation problems: A DASVM classification technique and a circular validation strategy," PAMI 2010. [4] “Integrating structured biological data by kernel maximum mean discrepancy”, Bioinformatics 2006. [5] “Meta-analysis of the first facial expression recognition challenge,” IEEE Trans. on Systems, Man, and Cybernetics, Part B, 2012. http://humansensing.cs.cmu.edu/wschu/