Learning Spatiotemporal Graphs of Human Activities

William Brendel     Sinisa Todorovic

ICCV 2011

Our Goal
[Figure: example videos labeled "Long Jump" and "Triple Jump"]

•  Recognize all occurrences of activities
•  Identify the start and end frames
•  Parse the video and find all subactivities
•  Localize actors and objects involved
Weakly Supervised Setting
[Figure: example videos labeled "Weight Lifting" and "Large-Box Lifting"]

In training:
     > ONLY class labels

Domain knowledge of temporal structure:
     > NOT AVAILABLE
Learning What and How

Weak supervision in training

Need to learn from training videos:

     - What activity parts are relevant

     - How relevant they are for recognition
Prior Work vs. Our Approach
Typically, focus only on HOW

[Diagram, top to bottom: semantic level / model / (gap) / features / raw video]
Prior Work vs. Our Approach

[Diagram, each column read top to bottom]
Typically...:    semantic level / model / (gap) / features / raw video
Our approach:    semantic level / model / mid-level features / raw video
Prior Work – Video Representation

•  Space-time points
     – Laptev & Schmid 08, Niebles & Fei-Fei 08, …

•  Still human postures
     – Soatto 07, Ning & Huang 08, …

•  Action templates
     – Yao & Zhu 09, …

•  Point tracks
     – Sukthankar & Hebert 10, …
Our Features: 2D+t Tubes

•  Allow simpler:
     - Modeling
     - Learning (few examples)
     - Inference

Sukthankar & Hebert 07, Gorelick & Irani 08, Pritch & Peleg 08, ...

•  We are the first to use 2D+t tubes for building a statistical model of activities
Our Features: 2D+t Tubes

•  Allow simpler:
     - Modeling
     - Learning (few examples)
     - Inference

Sukthankar & Hebert 07, Gorelick & Irani 08, Pritch & Peleg 08, ...

•  We use 2D+t tubes for building a statistical generative model of activities
Prior Work – Activity Representation

•  Graphical models, Grammars
     - Ivanov & Bobick 00
     - Xiang & Gong 06
     - Ryoo & Aggarwal 09
     - Gupta & Davis 09
     - Liu & Zhu 09
     - Niebles & Fei-Fei 10
     - Lan et al. 11

•  Probabilistic first-order logic
     - Tran & Davis 08
     - Albanese et al. 10
     - Morariu & Davis 11
     - Brendel et al. 11...
Approach

Input Video → Spatiotemporal Graph → Activity Model → Recognition, Localization
Blocky Video Segmentation
Activity as a Spatiotemporal Graph
            Descriptors of nodes and edges:

            •  Node descriptors:       F
                 - Motion
                 - Object shape

            •  Adjacency Matrices:     {Ai}
                 - Allen temporal relations
                 - Spatial relations
                 - Compositional relations
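
As an illustration of the temporal part of the adjacency matrices, here is a minimal Python sketch that classifies the Allen relation between two tubes from their frame intervals and builds one binary adjacency matrix per relation. The slides do not say how the relations are encoded, so the `(start_frame, end_frame)` representation, the one-matrix-per-relation layout, and the names `allen_relation` and `temporal_adjacency` are assumptions made for this example.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b.

    Each interval is a (start_frame, end_frame) pair with start < end.
    """
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    # From here on the two intervals share more than a single frame.
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    return "overlaps" if s1 < s2 else "overlapped-by"


def temporal_adjacency(intervals, relation):
    """Binary n x n matrix: entry (i, j) is 1 if tube i stands in `relation` to tube j."""
    n = len(intervals)
    return [
        [int(i != j and allen_relation(intervals[i], intervals[j]) == relation)
         for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    tubes = [(0, 10), (10, 25), (5, 30)]  # hypothetical frame intervals
    for name in ("meets", "overlaps", "during"):
        print(name, temporal_adjacency(tubes, name))
```

Spatial and compositional relations could be encoded the same way, with one matrix per relation type.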
Activity as Segmentation Graph

          G = (V, E, "descriptors")
            = (F, {A1, ..., An})

          F: node descriptors
          {A1, ..., An}: adjacency matrices of distinct relations between the tubes
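
One way to hold G = (F, {A1, ..., An}) in code is a small container that stores the node descriptor matrix together with one adjacency matrix per relation and checks that their sizes agree. This is a sketch only; the class name `ActivityGraph`, the use of nested lists, and the relation names used as dictionary keys are hypothetical choices, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List

Matrix = List[List[float]]


@dataclass
class ActivityGraph:
    F: Matrix             # node descriptors, one row per tube
    A: Dict[str, Matrix]  # relation name -> n x n adjacency matrix

    def __post_init__(self):
        # Every adjacency matrix must be square and match the number of nodes.
        n = len(self.F)
        for name, mat in self.A.items():
            if len(mat) != n or any(len(row) != n for row in mat):
                raise ValueError(f"adjacency matrix {name!r} must be {n} x {n}")

    @property
    def num_nodes(self) -> int:
        return len(self.F)


if __name__ == "__main__":
    g = ActivityGraph(
        F=[[0.2, 0.7], [0.9, 0.1]],  # made-up 2-D descriptors
        A={"temporal": [[0, 1], [0, 0]], "spatial": [[0, 1], [1, 0]]},
    )
    print(g.num_nodes, sorted(g.A))
```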
Activity Graph Model
             Probabilistic Graph Mixture



[Equation not recovered. Annotations: "model node descriptors", "mixture weights", "model adjacency matrices"]

Model adjacency matrices, one term per relation type:

     compositional  +  spatial  +  temporal
Activity Model
An activity instance: G = (F, {A1, ..., An})

[Equations not recovered. Annotations: "Model adjacency matrices", "Model matrix of node descriptors"]

Edge type: i = 1, 2, ..., n
Inference

Input Video → Spatiotemporal Graph → Activity Model → Recognition, Localization
Inference = Robust Least Squares
Goal:
     • For every activity model
     • Estimate the permutation matrix

[Equation not recovered: robust least-squares objective]

subject to [constraints not recovered]
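
The objective and constraints on this slide did not survive extraction, so the following is only a rough sketch of the general idea of matching a video graph to an activity model by least squares over node permutations. It assumes plain (not robust) squared error, equal numbers of nodes in graph and model, and an exhaustive search that is only feasible for a handful of nodes. The function names are hypothetical.

```python
from itertools import permutations


def frobenius_sq(a, b):
    """Squared Frobenius distance between two equally sized matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def permute_graph(F, adjacencies, perm):
    """Reorder nodes so that position r holds node perm[r] of the input graph."""
    F_p = [F[j] for j in perm]
    A_p = [[[A[j][k] for k in perm] for j in perm] for A in adjacencies]
    return F_p, A_p


def match_to_model(F, adjacencies, F_model, adjacencies_model):
    """Try every node ordering; return (best_perm, best_cost)."""
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(F_model))):
        F_p, A_p = permute_graph(F, adjacencies, perm)
        cost = frobenius_sq(F_p, F_model)
        cost += sum(frobenius_sq(a, am) for a, am in zip(A_p, adjacencies_model))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


def permutation_matrix(perm):
    """Binary matrix P with P[r][perm[r]] = 1; each row and column sums to 1."""
    n = len(perm)
    return [[1 if perm[r] == c else 0 for c in range(n)] for r in range(n)]


if __name__ == "__main__":
    F_model = [[0.0], [1.0], [2.0]]                  # made-up descriptors
    A_model = [[[0, 1, 0], [0, 0, 1], [0, 0, 0]]]    # one relation: a chain
    F_video = [[2.0], [0.0], [1.0]]                  # same graph, nodes shuffled
    A_video = [[[0, 0, 0], [0, 0, 1], [1, 0, 0]]]
    perm, cost = match_to_model(F_video, A_video, F_model, A_model)
    print(perm, cost, permutation_matrix(perm))
```

With one such cost per activity model, recognition could pick the model with the lowest cost, and the matched nodes indicate which tubes belong to the activity. A practical version would need a relaxed or approximate solver rather than enumeration.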
Learning the Activity Graph Model




Training videos → Training graphs → Graph model
	
  
Learning

Given K training graphs, ESTIMATE:

     - Model parameters
     - Permutation matrix

[Equations not recovered. Annotations: "Adjacency matrix", "Node descriptor"]

Edge type: i = 1, 2, ..., n
Learning = Robust Least Squares
Given K training graphs: [symbols not recovered]

Estimate: [symbol not recovered] and [symbol not recovered]
Learning = Structural EM

•  E-step → expected model structure
     - Estimation of model parameters

•  M-step → matching of the training graphs and model
     - Estimation of permutation matrices
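
The alternation between estimating permutation matrices and estimating model parameters can be illustrated with a toy loop. This is a simplified sketch, not the structural EM of the talk: it uses node descriptors only, hard assignments, a fixed number of model nodes (so it does not learn the model structure), and an exhaustive matching step. All names are hypothetical.

```python
from itertools import permutations


def align(F, F_model):
    """Matching step: node ordering of F closest to F_model in squared error."""
    def cost(perm):
        return sum((x - y) ** 2
                   for j, row_m in zip(perm, F_model)
                   for x, y in zip(F[j], row_m))
    return min(permutations(range(len(F))), key=cost)


def mean_descriptors(aligned):
    """Parameter step: entry-wise mean of the aligned descriptor matrices."""
    k = len(aligned)
    return [[sum(vals) / k for vals in zip(*rows)] for rows in zip(*aligned)]


def learn_model(training_F, n_iters=10):
    """Alternate matching and averaging until the matchings stop changing."""
    F_model = [row[:] for row in training_F[0]]  # initialise from the first graph
    perms = None
    for _ in range(n_iters):
        new_perms = [align(F, F_model) for F in training_F]
        if new_perms == perms:
            break
        perms = new_perms
        aligned = [[F[j] for j in perm] for F, perm in zip(training_F, perms)]
        F_model = mean_descriptors(aligned)
    return F_model, perms


if __name__ == "__main__":
    # Three made-up training graphs with two nodes and 1-D descriptors.
    graphs = [[[0.0], [10.0]], [[11.0], [1.0]], [[0.5], [9.0]]]
    model, perms = learn_model(graphs)
    print(model, perms)
```

Each pass re-matches every training graph to the current model and then re-estimates the model from the aligned graphs, which mirrors the two estimation boxes on the slide.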
Learning Results




Correctly learned activity-characteristic tubes
Recognition and Segmentation




Activity “handshaking”

Detected and segmented characteristic tube
Recognition and Segmentation




Activity “kicking”

Detected and segmented characteristic tube
Classification on UTexas Dataset




Human interaction activities

[18] Ryoo et al. ’10
Conclusion

•  Fast spatiotemporal segmentation

•  New activity representation = graph model

•  Unified learning and inference = Least squares

•  Learning under weak supervision:

     - WHAT activity parts are relevant and

     - HOW relevant they are for recognition

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 

ICCV 2011: Learning Spatiotemporal Graphs of Human Activities

  • 1. Learning Spatiotemporal Graphs of Human Activities. William Brendel, Sinisa Todorovic
  • 2. Our Goal (examples: Long Jump, Triple Jump) • Recognize all occurrences of activities • Identify the start and end frames • Parse the video and find all subactivities • Localize actors and objects involved
  • 3. Weakly Supervised Setting (examples: Weight Lifting, Large-Box Lifting) In training: > ONLY class labels. Domain knowledge of temporal structure: > NOT AVAILABLE
  • 4. Learning What and How. Weak supervision in training. Need to learn from training videos: • What activity parts are relevant • How relevant they are for recognition
  • 5. Prior Work vs. Our Approach. Typically, focus only on HOW. [Diagram, top to bottom: semantic level, model, gap, features, raw video]
  • 6. Prior Work vs. Our Approach. Typically... [Diagram, top to bottom: semantic level, model, gap, features, raw video] Our approach: [Diagram, top to bottom: semantic level, model, mid-level features, raw video]
  • 7. Prior Work – Video Representation • Space-time points – Laptev & Schmid 08, Niebles & Fei-Fei 08, … • Still human postures – Soatto 07, Ning & Huang 08, … • Action templates – Yao & Zhu 09, … • Point tracks – Sukthankar & Hebert 10, …
  • 8. Our Features: 2D+t Tubes (cf. Sukthankar & Hebert 07, Gorelick & Irani 08, Pritch & Peleg 08, ...) • Allow simpler: - Modeling - Learning (few examples) - Inference • We are the first to use 2D+t tubes for building a statistical model of activities
  • 9. Our Features: 2D+t Tubes (cf. Sukthankar & Hebert 07, Gorelick & Irani 08, Pritch & Peleg 08, ...) • Allow simpler: - Modeling - Learning (few examples) - Inference • We use 2D+t tubes for building a statistical generative model of activities
  • 10. Prior Work – Activity Representation • Graphical models, Grammars - Ivanov & Bobick 00 - Xiang & Gong 06 - Ryoo & Aggarwal 09 - Gupta & Davis 09 - Liu & Zhu 09 - Niebles & Fei-Fei 10 - Lan et al. 11 • Probabilistic first-order logic - Tran & Davis 08 - Albanese et al. 10 - Morariu & Davis 11 - Brendel et al. 11...
  • 11. Approach: Input Video → Spatiotemporal Graph → Activity Model → Recognition, Localization
  • 13. Activity as a Spatiotemporal Graph Descriptors of nodes and edges: •  Node descriptors: F - Motion - Object shape •  Adjacency Matrices: {Ai} - Allen temporal relations - Spatial relations - Compositional relations
  • 14. Activity as Segmentation Graph: G = (V, E, "descriptors") = (F, {A1, ..., An}), where F holds the node descriptors and {A1, ..., An} are the adjacency matrices of distinct relations between the tubes
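The representation G = (F, {A1, ..., An}) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the `Tube` fields, the two-number node descriptor, and the three toy relations (two Allen-style temporal relations and one spatial relation) are hypothetical stand-ins, not the descriptors used in the talk, and the compositional relation is left out.

```python
from dataclasses import dataclass

@dataclass
class Tube:
    """Hypothetical 2D+t tube: a frame interval plus a coarse bounding box."""
    start: int      # first frame
    end: int        # last frame
    box: tuple      # (x_min, y_min, x_max, y_max)
    motion: float   # toy motion descriptor
    shape: float    # toy object-shape descriptor

def before(a, b):
    """Allen relation 'before': a ends strictly before b starts."""
    return a.end < b.start

def overlaps(a, b):
    """Allen relation 'overlaps': a starts first and ends inside b."""
    return a.start < b.start < a.end < b.end

def left_of(a, b):
    """Toy spatial relation: a's box lies entirely to the left of b's box."""
    return a.box[2] < b.box[0]

def adjacency(tubes, relation):
    """n x n 0/1 matrix with entry (i, j) set when relation(tube_i, tube_j) holds."""
    n = len(tubes)
    return [[1 if i != j and relation(tubes[i], tubes[j]) else 0
             for j in range(n)] for i in range(n)]

def build_graph(tubes):
    """Return (F, A): node-descriptor matrix and one adjacency matrix per relation."""
    F = [[t.motion, t.shape] for t in tubes]
    A = {
        "before": adjacency(tubes, before),
        "overlaps": adjacency(tubes, overlaps),
        "left_of": adjacency(tubes, left_of),
    }
    return F, A

tubes = [
    Tube(0, 10, (0, 0, 5, 5), 0.9, 0.2),
    Tube(5, 20, (6, 0, 9, 5), 0.4, 0.7),
    Tube(25, 30, (0, 0, 5, 5), 0.1, 0.3),
]
F, A = build_graph(tubes)
print(A["before"])  # tubes 0 and 1 both end before tube 2 starts
```

Keeping one matrix per relation type mirrors the {A1, ..., An} notation: each edge type i gets its own adjacency matrix over the same set of nodes.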
  • 15. Activity Graph Model: Probabilistic Graph Mixture. [Figure labels: model node descriptors; mixture weights; model adjacency matrices (compositional, spatial, temporal), each multiplied by a mixture weight and summed]
  • 16. Activity Model. An activity instance: G = (F, {A1, ..., An}). Model adjacency matrices. Edge type: i = 1, 2, ..., n
  • 17. Activity Model. An activity instance: G = (F, {A1, ..., An}). Model adjacency matrices. Model matrix of node descriptors. Edge type: i = 1, 2, ..., n
  • 18. Inference: Input Video → Spatiotemporal Graph → Activity Model → Recognition, Localization
  • 19. Inference = Robust Least Squares. Goal: • For every activity model • Estimate the permutation matrix subject to
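The idea of matching a video graph to each activity model by least squares can be sketched as follows. This is not the robust solver from the talk: it is an exhaustive search over node orderings (O(n!), toy sizes only) with a plain squared-error cost, and the model values, the activity names used as dictionary keys, and all function names are hypothetical. A permutation is stored as a tuple `perm`, where instance node `perm[k]` is matched to model node `k`.

```python
from itertools import permutations

def matching_cost(F_model, A_model, F, A, perm):
    """Squared error when instance node perm[k] is matched to model node k."""
    n = len(perm)
    cost = 0.0
    for k in range(n):  # node-descriptor term
        cost += sum((m - x) ** 2 for m, x in zip(F_model[k], F[perm[k]]))
    for A_m, A_i in zip(A_model, A):  # one term per edge type i
        for k in range(n):
            for l in range(n):
                cost += (A_m[k][l] - A_i[perm[k]][perm[l]]) ** 2
    return cost

def best_permutation(F_model, A_model, F, A):
    """Exhaustive search for the node ordering with the lowest cost."""
    return min(permutations(range(len(F_model))),
               key=lambda p: matching_cost(F_model, A_model, F, A, p))

def classify(models, F, A):
    """Match the instance to every model and return the lowest-cost label."""
    scores = {}
    for name, (F_m, A_m) in models.items():
        p = best_permutation(F_m, A_m, F, A)
        scores[name] = matching_cost(F_m, A_m, F, A, p)
    return min(scores, key=scores.get), scores

# Toy models: (node descriptors, list of adjacency matrices, one per edge type).
models = {
    "handshaking": ([[0.9, 0.2], [0.4, 0.7], [0.1, 0.3]],
                    [[[0, 0, 1], [0, 0, 1], [0, 0, 0]]]),
    "kicking": ([[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]],
                [[[0, 1, 0], [0, 0, 0], [0, 0, 0]]]),
}
# Instance graph: the "handshaking" model with its nodes reordered.
F = [[0.1, 0.3], [0.9, 0.2], [0.4, 0.7]]
A = [[[0, 0, 0], [1, 0, 0], [1, 0, 0]]]

perm = best_permutation(*models["handshaking"], F, A)
label, scores = classify(models, F, A)
print(perm, label)
```

Applying the same ordering to both rows and columns of each adjacency matrix corresponds to evaluating P A P^T for a permutation matrix P, which is why the descriptor and adjacency terms share one `perm`.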
  • 20. Learning the Activity Graph Model: Training videos → Training graphs → Graph model
  • 21. Learning. Given K training graphs, each with: Adjacency matrix, Node descriptor. Edge type: i = 1, 2, ..., n
  • 22. Learning. Given K training graphs (Adjacency matrix, Node descriptor), ESTIMATE: Model parameters
  • 23. Learning. Given K training graphs (Adjacency matrix, Node descriptor), ESTIMATE: Permutation matrix
  • 24. Learning = Robust Least Squares. Given K training graphs, estimate the model parameters and the permutation matrices
  • 25. Learning = Structural EM. E-step → expected matching of the training graphs and model. M-step → model structure. The two steps alternate between estimation of permutation matrices and estimation of model parameters
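The alternation between matching and model estimation can be sketched under strong simplifying assumptions: a single edge type, training graphs that all have the same number of nodes, hard (best-permutation) matching in place of an expected matching, and plain averaging as the model update. None of these choices is taken from the talk; the function names and toy data are hypothetical.

```python
from itertools import permutations

def align(F, A, perm):
    """Reorder a graph so that its node perm[k] becomes node k."""
    n = len(perm)
    F_al = [list(F[perm[k]]) for k in range(n)]
    A_al = [[A[perm[k]][perm[l]] for l in range(n)] for k in range(n)]
    return F_al, A_al

def cost(F_m, A_m, F, A, perm):
    """Squared error between the model and the graph aligned by perm."""
    F_al, A_al = align(F, A, perm)
    c = sum((a - b) ** 2 for rm, r in zip(F_m, F_al) for a, b in zip(rm, r))
    c += sum((a - b) ** 2 for rm, r in zip(A_m, A_al) for a, b in zip(rm, r))
    return c

def mean_matrix(mats):
    """Entry-wise mean of equally sized matrices."""
    K = len(mats)
    return [[sum(m[r][c] for m in mats) / K for c in range(len(mats[0][0]))]
            for r in range(len(mats[0]))]

def learn_model(graphs, iters=10):
    """Alternate hard matching and averaging, starting from the first graph."""
    F_m, A_m = graphs[0]
    perms = None
    for _ in range(iters):
        # Matching step: best node ordering of each training graph.
        new_perms = [min(permutations(range(len(F))),
                         key=lambda p: cost(F_m, A_m, F, A, p))
                     for F, A in graphs]
        if new_perms == perms:  # matchings stopped changing
            break
        perms = new_perms
        # Model step: average the aligned descriptors and adjacency matrices.
        aligned = [align(F, A, p) for (F, A), p in zip(graphs, perms)]
        F_m = mean_matrix([a[0] for a in aligned])
        A_m = mean_matrix([a[1] for a in aligned])
    return F_m, A_m, perms

# Two toy training graphs: the second is a noisy, reordered copy of the first.
g1 = ([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
      [[0, 1, 0], [0, 0, 1], [0, 0, 0]])
g2 = ([[0.5, 0.6], [1.0, 0.1], [0.0, 1.0]],
      [[0, 0, 0], [0, 0, 1], [1, 0, 0]])
F_m, A_m, perms = learn_model([g1, g2])
print(perms)
```

Averaging the aligned adjacency matrices yields real-valued edge entries, so an edge present in only some training graphs ends up with a fractional weight in the model.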
  • 26. Learning Results: Correctly learned activity-characteristic tubes
  • 27. Recognition and Segmentation. Activity “handshaking”: Detected and segmented characteristic tube
  • 28. Recognition and Segmentation. Activity “kicking”: Detected and segmented characteristic tube
  • 29. Classification on UTexas Dataset. Human interaction activities. [18] Ryoo et al. ’10
  • 30. Conclusion •  Fast spatiotemporal segmentation •  New activity representation = graph model •  Unified learning and inference = Least squares •  Learning under weak supervision: - WHAT activity parts are relevant and - HOW relevant they are for recognition