SlideShare une entreprise Scribd logo
1  sur  58
Télécharger pour lire hors ligne
Structural	
  Methods	
  
G. Johansson, Moving Light Displays, 1973




•  Pose	
  seems	
  sufficient	
  for	
  certain	
  ac5on	
  categories	
  
•  Remove	
  effects	
  of	
  clothing,	
  ligh5ng	
  varia5on	
  from	
  representa5on	
  
•  Two	
  broad	
  classes	
  of	
  approaches	
  
   –  Match	
  templates	
  (exemplar-­‐based)	
  
   –  Fit	
  a	
  human	
  body	
  model	
  
Mori & Malik PAMI 2005
Shakhnarovich, Viola & Darrell ICCV 2003
Bourdev & Malik ICCV 2009

                                           Database of Exemplars
Pictorial Structures
model




 Felzenszwalb & Huttenlocher CVPR 2000
 Ramanan NIPS 2006
 Ferrari, Marin & Zisserman CVPR 2008
•  Detect	
  corners	
  in	
  images/video	
  
   •  Assess	
  likelihood	
  under	
  ac5on-­‐specific	
  pose	
  model	
  
   •  Discriminate	
  between	
  walking	
  direc5ons,	
  bicycle	
  riding	
  


Song, Goncalves & Perona NIPS 2001, PAMI 2003
test sequence
key frame


•  Key	
  frame	
  matching	
  to	
  test	
  sequence	
  to	
  find	
  
   similar	
  poses	
  
    –  Shape	
  matching	
  on	
  edge	
  maps	
  using	
  order	
  
       structure	
  
                                                  Sullivan & Carlsson ECCV 2002
•  M	
  is	
  quan5zed	
  3d	
  pose	
  
                                                                 •  T	
  is	
  root	
  orienta5on	
  	
  




•  Automa5c	
  person	
  detec5on-­‐tracking	
  
•  Compare	
  quan5zed	
  pose	
  to	
  labeled	
  training	
  poses	
  
     –  Smooth	
  over	
  5me	
  

                                                              Ramanan & Forsyth NIPS 2003
•  Video	
  shot	
  retrieval	
  from	
  pose	
  
        –  Either	
  query-­‐by-­‐example	
  or	
  classifica5on	
  
        –  Focus	
  on	
  upper	
  body	
  pose	
  
            •  Pictorial	
  structures	
  model	
  


                                                                      Ac5on	
  
                                                                      Label	
  


Ferrari, Marin & Zisserman CVPR 2009
query
                                          CODE AVAILABLE ONLINE
•  SVM	
  on	
  descriptors	
  of	
  absolute	
  &	
  rela5ve	
  part	
  loca5ons,	
  
   segmenta5ons	
  
     –  Include	
  short	
  tracks	
  for	
  robustness	
  
Golfing?	
  



                                                                       Walking?	
  



   •  Focus	
  on	
  discrimina5ve	
  elements	
  of	
  pose	
  for	
  ac5on	
  classifica5on	
  
   •  Use	
  exemplar-­‐based	
  “poselet”	
  representa5on	
  


Yang, Wang & Mori CVPR 2010
Successful
classification
  examples




Unsuccessful
classification
  examples



         17
A:
                                                      Activity
                                                                 A

      Tennis Croquet Volleyball                                           Human pose
     forehand shot    smash
                                                                              H
O:                                                Object
                                                       O
      Tennis Croquet Volleyball                                                    Body parts
      racket mallet
                                                                 P1      P2        PN
H:
                                                      fO
                                                                 f1      f2        fN
         Intra-class variations
          •  More than one H for each A;              Image evidence
          •  Unobserved during training.

P:    lP: location; θP: orientation; sP: scale.

f:    Shape context. [Belongie et al, 2002]
                                                                      Yao & Fei-Fei CVPR 2010
Cricket
defensive
  shot




Cricket
bowling




Croquet
 shot




            19
•  Build	
  ac5on	
  models	
  from	
  web	
  search	
  results	
  


Ikizler-Cinbis, Cinbis, Sclaroff ICCV 2009
SLAG
            •  Find	
  repeated	
  poses	
  in	
  a	
  dataset	
  
Wang, Jiang, Drew, Li, Mori CVPR 2006
•  Person	
  loca5on	
  
Riding horse   Reading book      Taking photo
                                                   given	
  
                                                •  Classify	
  into	
  one	
  of	
  9	
  
                                                   categories	
  


Riding bike    Play instrument     Running




  Phoning      Use computer        Walking
•  Pose	
  as	
  representa5on	
  for	
  ac5on	
  recogni5on	
  
    –  Captures	
  much	
  informa5on	
  about	
  ac5on	
  
    –  Invariance	
  to	
  clothing	
  /	
  ligh5ng	
  effects	
  
    –  Model	
  and	
  exemplar	
  based	
  representa5ons	
  
•  New	
  direc5on:	
  Ac5on	
  recogni5on	
  from	
  s5ll	
  
   images	
  
    –  Image	
  retrieval	
  and	
  analysis	
  
    –  An	
  important	
  cue	
  for	
  video-­‐based	
  ac5on	
  recogni5on	
  
    –  Pose	
  seems	
  essen5al	
  
•  Describe	
  low-­‐level	
  components	
  
    –  Ac5ons	
  of	
  individual	
  people	
  
    –  Movement	
  of	
  pixels	
  
•  Iden5fy	
  key	
  objects	
  or	
  loca5ons	
  in	
  scene	
  
    –  Buildings,	
  roads,	
  etc.	
  
•  Model	
  interac5ons	
  between	
  people,	
  objects,	
  
   and	
  loca5ons	
  
•  Detect	
  and	
  track	
  moving	
  
     objects	
  
  •  Manually	
  iden5fy	
  key	
  
     regions	
  in	
  scene	
  
      –  E.g.	
  road,	
  checkpoint	
  
  •  Scenarios	
  describe	
  
     rela5ve	
  arrangements	
  of	
  
     objects	
  in	
  scene	
  
      –  E.g.	
  proximity	
  of	
  car	
  to	
  
         checkpoint	
  
      –  No5ons	
  of	
  scene	
  context	
  


Medioni, Cohen, Bremond, Hongeng, Nevatia PAMI 2001
•  Detect	
  and	
  track	
  players,	
  
     ball	
  
  •  Low-­‐level	
  ac5on	
  
     detectors	
  for	
  individual	
  
     players	
  
  •  Hand-­‐constructed	
  Bayes	
  
     net	
  for	
  each	
  ac5vity	
  
       –  Spa5al	
  and	
  temporal	
  
          rela5ons	
  between	
  low-­‐
          level	
  ac5ons	
  


Intille & Bobick CVPR 1999
•  Global,	
  frame-­‐level	
  feature	
  
    –  Bag-­‐of-­‐words	
  representa5on	
  
•  Detect	
  unusual	
  events	
  by	
  clustering	
  
    –  Isolated,	
  varied	
  clusters	
  are	
  unusual	
  
                                                               Zhong, Shi & Visontai CVPR 2004
•  Chea5ng	
  detec5on	
  in	
     •  Real-­‐world	
  highway	
  dataset	
  
   simulated	
  card	
  game	
         –  Cars	
  pulling	
  off	
  road,	
  backing	
  
                                          up,	
  U-­‐turns	
  
•  Describe	
  moving	
  pixels	
  
   by	
  loca5on	
  and	
  mo5on	
  
   direc5on	
  
    –  No	
  object	
  detec5on	
  
•  Use	
  as	
  visual	
  words	
  in	
  
   Latent	
  Dirichlet	
  
   Alloca5on	
  (LDA)	
  type	
  
   model	
  
    –  Infer	
  low-­‐level	
  ac5ons	
  
       from	
  words	
  


Wang, Ma, Grimson PAMI 2009                 Blei, Ng, Jordan JMLR 2003
•  Higher-­‐level	
  ac5vity	
  analysis	
  
     –  Distribu5on	
  of	
  low-­‐level	
  ac5ons	
  over	
  en5re	
  scene	
  
•  Applica5ons	
  
     –  Temporal	
  segmenta5on	
  by	
  ac5vity	
  
     –  Abnormality	
  detec5on	
  
0.8
               A	
              B	
  
              0.3              0.5
                       C	
  


•  Hierarchical	
  Dirichlet	
  
   Process	
  model	
  
    –  Learn	
  number	
  of	
  ac5vi5es	
  
       automa5cally	
  

                                        Kuettel, Breitenstein, van Gool & Ferrari CVPR 2010
traffic	
  light	
  controlled	
  scene	
  




                                  •  con5nuous	
  video	
  
                                  •  annotated	
  with	
  states	
  and	
  history	
  
                                  •  3x	
  speed	
  
Loy, Xiang & Gong CVPR, ICCV 2009
•  Consider	
  5me-­‐delayed	
  correla5ons	
  between	
  regions	
  
   –  Applica5ons	
  to	
  irregularity	
  detec5on	
  
y	
   activity class


         h1    h2            y	
  
                             h action class
                     …

         x1    x2            xn      image
  x0




Choi, Shahid, & Savarese VS 2009
Lan, Wang, Yang, & Mori SGA 2010, NIPS 2010
•  Cap5oned	
  baseball	
  videos	
  
     in	
  training	
  
  •  Build	
  AND-­‐OR	
  graph	
  
     representa5on	
  of	
  ac5vi5es	
  
       –  AND	
  specifies	
  elements	
  of	
  
          an	
  ac5vity	
  that	
  must	
  occur	
  
       –  OR	
  allows	
  varia5on	
  in	
  how	
  
          an	
  element	
  appears	
  
  •  Describe	
  low-­‐level	
  tracks	
  
     using	
  STIPs	
  
  •  Match	
  tracks	
  to	
  ac5ons	
  in	
  
     AND-­‐OR	
  graph	
  
Gupta, Srinivasan, Shi, Davis CVPR 2009
•  Scene	
  modeling	
  to	
  look	
  at	
  the	
  big	
  picture	
  
•  Feature	
  representa5ons	
  
    –  Holis5c:	
  describe	
  en5re	
  scene,	
  irrespec5ve	
  of	
  
       individuals	
  
    –  Local:	
  describe	
  ac5ons	
  of	
  individuals	
  
•  Structure	
  of	
  ac5vi5es	
  
    –  Model	
  free:	
  clustering-­‐type	
  approaches	
  
    –  Strong	
  models:	
  grammars,	
  probabilis5c	
  models	
  
Objects:                Actions:
    cars, glasses,      drinking, running,
    people, etc…        door exit, car enter

                constraints

Scene categories:
                        Geometry:
   indoors,
   outdoors, street        Street, wall,
   scene, etc…             field, stair, etc…
Riding horse   Reading book
                                 A 0.8 B

                                 0.3 C 0.5
 Riding bike   Play instrument
•  Standardiza5on	
  of	
  datasets	
  for	
  field	
  
   –  Allow	
  comparison	
  of	
  algorithms	
  
       •  E.g.	
  KTH	
  for	
  low-­‐level	
  features,	
  atomic	
  ac5ons	
  
   –  Fair	
  tuning	
  of	
  model	
  parameters	
  
•  New	
  algorithms	
  compare	
  to	
  baselines	
  
   –  Bag-­‐of-­‐words	
  on	
  densely	
  sampled	
  STIPs	
  
   –  Pose	
  es5ma5on	
  (Ferrari	
  et	
  al.	
  code)	
  
   –  HOG	
  SVM	
  (Dalal	
  &	
  Triggs	
  code,	
  Ramanan	
  code)	
  
•  Standardiza5on	
  of	
  datasets	
  for	
  field	
  
   –  Don’t	
  feel	
  constrained	
  by	
  the	
  exis5ng	
  problem	
  
      defini5ons	
  
   –  Do	
  make	
  your	
  new	
  dataset	
  available	
  
       •  Should	
  clearly	
  specify	
  separate	
  training	
  and	
  test	
  sets	
  
•  New	
  algorithms	
  compare	
  to	
  baselines	
  
   –  Do	
  use	
  reasonable	
  variant	
  of	
  standard	
  baselines	
  
      for	
  your	
  new	
  problem	
  
•  Even	
  atomic	
  low-­‐level	
  ac5ons	
  are	
  very	
  difficult	
  
   to	
  detect	
  reliably	
  
    –  Far	
  more	
  work	
  needed	
  on	
  representa5ons	
  for	
  the	
  
       ac5on	
  of	
  a	
  single	
  person	
  
    –  Features	
  
    –  Temporal	
  representa5on,	
  smoothing	
  
    –  Tracking	
  
    –  …	
  
1.  Cameras and bandwidth are cheap
2.  Lots of training data is potentially available



                                             Training
                            +              =
                                               data




         Potential for the huge progress
                           … if we can get the data
Aligned with video	
     Describes visual              Source	
  
                                                        content	
  


Subtitles	
                         Yes                     No	
               DVD, Internet

Scripts for                          No	
                  Yes	
              Internet, e.g.
TV series, movies and                                                       www.dailyscript.com	
  
sport games
Plot summaries and                   No	
             Yes, sparsely	
             Internet
synopses                                                                        (e.g. IMDB)	
  

Instruction videos                   No	
                  Yes	
              Internet, e.g.
                                                                            www.videojug.com	
  

Descriptive Video Service           Yes	
                  Yes	
                 DVD, rare	
  


Word tags                            No	
             Yes, sparsely	
              Internet
                                                                               (e.g. YouTube)	
  

Manual labelling,                    ??	
                   ??	
             Mechanical Turk,
Human Computation                                                               ESP Game,
                                                                          Grad undergrad students
Open questions:

     How to benefit from the structure of the human body in
      complex situations, e.g. heavy occlusions, uniformly colored
      clothing?
     Will action classification generalize over different video
      domains: Movies, TV, YouTube, Surveillance video?
     What is the useful action vocabulary? Are we trying to solve the
      right problem? How can we visualize/display the results?
Interesting novel directions:

     Use actions for recognizing functional and physical object
      properties, e.g. “sitable”, “eatable”, “heavy”, “solid” objects…
     Action prediction, i.e. what can happen in the given situation:
      e.g. is it dangerous to cross this road?
     Explore more sources of strong and weak supervision: Manual
      surveillance, Descriptive Video Service (DVS), YouTube tags;
      Transcripts of sports games; Instruction videos.
•    P.	
  Viola,	
  M.	
  Jones,	
  and	
  D.	
  Snow.	
  Detec5ng	
  pedestrians	
  using	
  paeerns	
  of	
  mo5on	
  and	
  appearance.	
  In	
  
     Proc.	
  9th	
  Int.	
  Conf.	
  Computer	
  Vision,	
  pages	
  734–741,	
  2003.	
  
•    N.	
  Dalal	
  and	
  B.	
  Triggs.	
  Histograms	
  of	
  oriented	
  gradients	
  for	
  human	
  detec5on.	
  In	
  Proc.	
  IEEE	
  Comput.	
  
     Soc.	
  Conf.	
  Comput.	
  Vision	
  and	
  Paeern	
  Recogn.,	
  2005.	
  
•    Bo	
  Wu	
  and	
  Ram	
  Neva5a.	
  Detec5on	
  of	
  mul5ple,	
  par5ally	
  occluded	
  humans	
  in	
  a	
  single	
  image	
  by	
  
     bayesian	
  combina5on	
  of	
  edgelet	
  part	
  detectors.	
  In	
  Proc.	
  10th	
  Int.	
  Conf.	
  Computer	
  Vision,	
  2005.	
  
•    Pedro	
  Felzenszwalb,	
  David	
  McAllester,	
  and	
  Deva	
  Ramanan.	
  A	
  discrimina5vely	
  trained,	
  mul5scale,	
  
     deformable	
  part	
  model.	
  In	
  IEEE	
  Computer	
  Society	
  Conference	
  on	
  Computer	
  Vision	
  and	
  Paeern	
  
     Recogni5on,	
  2008.	
  
•    Chris	
  Stauffer	
  and	
  W.E.L.	
  Grimson.	
  Adap5ve	
  background	
  mixture	
  models	
  for	
  real-­‐	
  5me	
  tracking.	
  In	
  
     Proc.	
  IEEE	
  Comput.	
  Soc.	
  Conf.	
  Comput.	
  Vision	
  and	
  Paeern	
  Recogn.,	
  1999.	
  
•    Kentaro	
  Toyama,	
  John	
  Krumm,	
  Barry	
  Brumie,	
  and	
  Brian	
  Meyers.	
  Wallflower:	
  Principles	
  and	
  prac5ce	
  
     of	
  background	
  maintenance.	
  In	
  Proc.	
  7th	
  Int.	
  Conf.	
  Computer	
  Vision,	
  1999.	
  
•    J.L.	
  Barron,	
  D.J.	
  Fleet,	
  and	
  S.S.	
  Beauchemin.	
  Performance	
  of	
  op5cal	
  flow	
  techniques.	
  Int.	
  Journal	
  of	
  
     Computer	
  Vision,	
  12(1):43–77,	
  1994.	
  
•    T.	
  Brox,	
  C.	
  Bregler,	
  and	
  J.	
  Malik.	
  Large	
  displacement	
  op5cal	
  flow.	
  In	
  Proc.	
  IEEE	
  Comput.	
  Soc.	
  Conf.	
  
     Comput.	
  Vision	
  and	
  Paeern	
  Recogn.,	
  2009.	
  
•    M.	
  Isard	
  and	
  A.	
  Blake.	
  Condensa5on	
  –	
  condi5onal	
  density	
  propaga5on	
  for	
  visual	
  tracking.	
  Int.	
  
     Journal	
  of	
  Computer	
  Vision,	
  29(1):5–28,	
  1998.	
  
•    Yuan	
  Li,	
  Chang	
  Huang,	
  and	
  Ram	
  Neva5a.	
  Learning	
  to	
  associate:	
  Hybridboosted	
  mul5-­‐target	
  tracker	
  
     for	
  crowded	
  scene.	
  In	
  Proc.	
  IEEE	
  Comput.	
  Soc.	
  Conf.	
  Comput.	
  Vision	
  and	
  Paeern	
  Recogn.,	
  2009.	
  
•    W.	
  T.	
  Freeman,	
  K.	
  Tanaka,	
  J.Ohta,	
  and	
  K.	
  Kyuma.	
  Computer	
  vision	
  for	
  computer	
  games.	
  In	
  IEEE	
  2nd	
  
     Intl.	
  Conf.	
  on	
  Automa5c	
  Face	
  and	
  Gesture	
  Recogni5on,	
  1996.	
  
•    J.	
  Sullivan	
  and	
  S.	
  Carlsson.	
  Recognizing	
  and	
  tracking	
  human	
  ac5on.	
  In	
  ECCV	
  2002	
  
•    A.A.	
  Efros,	
  A.C.	
  Berg,	
  G.	
  Mori,	
  and	
  J.	
  Malik.	
  Recognizing	
  ac5on	
  at	
  a	
  distance.	
  In	
  ICCV	
  2003	
  
•    A.	
  Bobick	
  and	
  J.	
  Davis.	
  The	
  recogni5on	
  of	
  human	
  movement	
  using	
  temporal	
  templates.	
  IEEE	
  Trans.	
  
     PAMI,	
  23(3):257–267,	
  2001.	
  
•    L.	
  Zelnik-­‐Manor	
  and	
  M.	
  Irani.	
  Event-­‐based	
  video	
  analysis.	
  In	
  CVPR	
  2001	
  
•    E.	
  Shechtman	
  and	
  M.	
  Irani.	
  Space-­‐5me	
  behavior	
  based	
  correla5on.	
  In	
  CVPR	
  2005	
  
•    O.	
  Boiman	
  and	
  M.	
  Irani.	
  Detec5ng	
  irregulari5es	
  in	
  images	
  and	
  in	
  video.	
  In	
  Proc.	
  ICCV,	
  2005.	
  
•    M.	
  Blank,	
  L.	
  Gorelick,	
  E.	
  Shechtman,	
  M.	
  Irani,	
  and	
  R.	
  Basri.	
  Ac5ons	
  as	
  space-­‐5me	
  shapes.	
  In	
  Proc.	
  
     ICCV,	
  2005.	
  
•    Y.	
  Ke,	
  R.	
  Sukthankar,	
  and	
  M.	
  Hebert.	
  Efficient	
  Visual	
  Event	
  Detec5on	
  using	
  Volumetric	
  Features	
  .	
  In	
  
     Proc.	
  ICCV	
  2005.	
  
•    Y.	
  Ke,	
  R.	
  Sukthankar,	
  and	
  M.	
  Hebert.	
  Event	
  detec5on	
  in	
  crowded	
  videos.	
  In	
  Proc.	
  ICCV,	
  2007.	
  
•    I.	
  Laptev	
  and	
  P.	
  Pérez.	
  Retrieving	
  ac5ons	
  in	
  movies.	
  In	
  Proc.	
  ICCV	
  2007	
  
•    D.	
  Weinland	
  and	
  E.	
  Boyer.	
  Ac5on	
  recogni5on	
  using	
  exemplar-­‐based	
  embedding.	
  In	
  Proc.	
  CVPR,	
  
     2008.	
  
•    Z.	
  Lin,	
  Z.	
  Jiang,	
  and	
  L.	
  S.	
  Davis.	
  Recognizing	
  ac5ons	
  by	
  shape-­‐mo5on	
  prototype	
  trees.	
  In	
  Proc.	
  ICCV,	
  
     2009.	
  
•    I.	
  Laptev	
  and	
  T.	
  Lindeberg.	
  Space-­‐5me	
  interest	
  points.	
  In	
  Proc.	
  ICCV	
  2003.	
  
•    C.	
  Schuldt,	
  I.	
  Laptev,	
  and	
  B.	
  Caputo.	
  Recognizing	
  human	
  ac5ons:	
  A	
  local	
  svm	
  approach.	
  In	
  Proc.	
  ICPR,	
  2004.	
  
•    P.	
  Dollar,	
  V.	
  Rabaud,	
  G.	
  Coerell,	
  and	
  Serge	
  Belongie.	
  Behavior	
  recogni5on	
  via	
  sparse	
  spa5o-­‐temporal	
  features.	
  
     In	
  VS-­‐PETS,	
  2005.	
  
•    H.	
  Jhuang,	
  T.	
  Serre,	
  L.	
  Wolf	
  and	
  T.	
  Poggio.	
  A	
  Biologically	
  Inspired	
  System	
  for	
  Ac5on	
  Recogni5on.	
  In	
  Proc.	
  ICCV	
  
     2007	
  
•    P.	
  Scovanner,	
  S.	
  Ali,	
  and	
  M.	
  Shah,	
  A	
  3-­‐Dimensional	
  SIFT	
  descriptor	
  and	
  its	
  applica5on	
  to	
  ac5on	
  recogni5on,	
  
     ACM	
  MM	
  2007.	
  	
  
•    J.	
  C.	
  Niebles,	
  H.	
  Wang,	
  and	
  L.	
  Fei-­‐Fei.	
  Unsupervised	
  learning	
  of	
  human	
  ac5on	
  categories	
  using	
  spa5al-­‐
     temporal	
  words.	
  In	
  IJCV	
  2008.	
  
•    I.	
  Laptev,	
  M.	
  Marszalek,	
  C.	
  Schmid,	
  and	
  B.	
  Rozenfeld.	
  Learning	
  realis5c	
  human	
  ac5ons	
  from	
  movies.	
  In	
  Proc.	
  
     CVPR	
  2008.	
  
•    A.	
  Klaeser,	
  M.	
  Marszałek	
  and	
  C.	
  Schmid.	
  A	
  spa5o-­‐temporal	
  descriptor	
  based	
  on	
  3D-­‐gradients.	
  In	
  Proc.	
  BMVC	
  
     2008	
  
•    G.	
  Willems,	
  T.	
  Tuytelaars	
  and	
  L.	
  Van	
  Gool.	
  An	
  Efficient	
  Dense	
  and	
  Scale-­‐Invariant	
  Spa5o-­‐Temporal	
  Interest	
  
     Point	
  Detector.	
  In	
  Proc.	
  ECCV	
  2008	
  
•    H.	
  Wang,	
  M.	
  M.	
  Ullah,	
  A.	
  Kläser,	
  I.	
  Laptev	
  and	
  C.	
  Schmid.	
  Evalua5on	
  of	
  local	
  spa5o-­‐temporal	
  features	
  for	
  
     ac5on	
  recogni5on.	
  In	
  Proc.	
  BMVC	
  2009.	
  
•    L.	
  Yeffet	
  and	
  L.	
  Wolf.	
  Local	
  Trinary	
  Paeerns	
  for	
  Human	
  Ac5on	
  Recogni5on.	
  In	
  Proc.	
  ICCV	
  2009.	
  	
  
•    A.	
  Gilbert,	
  J.	
  Illingworth,	
  R.	
  Bowden.	
  Fast	
  realis5c	
  mul5-­‐ac5on	
  recogni5on	
  using	
  mined	
  dense	
  spa5o-­‐
     temporal	
  features,	
  In	
  Proc.	
  ICCV	
  2009.	
  
•    P.	
  Ma5kainen,	
  M.	
  Hebert,	
  R.	
  Sukthankar.	
  Trajectons:	
  Ac5on	
  recogni5on	
  through	
  the	
  mo5on	
  analysis	
  of	
  
     tracked	
  features.	
  ICCV	
  workshop	
  on	
  Video-­‐oriented	
  Object	
  and	
  Event	
  Classifica5on,	
  2009	
  
•    M.	
  M.	
  Ullah,	
  S.	
  N.	
  Parizi,	
  I.	
  Laptev.	
  Improving	
  bag-­‐of-­‐features	
  ac5on	
  recogni5on	
  with	
  non-­‐local	
  cues.	
  In	
  Proc.	
  
     BMVC	
  2010	
  
•    Y.	
  Song,	
  L.	
  Goncalves,	
  and	
  P.	
  Perona.	
  Unsupervised	
  learning	
  of	
  human	
  mo5on.	
  IEEE	
  Trans.	
  PAMI,	
  25
     (7):814–827,	
  2003.	
  
•    D.	
  Ramanan	
  and	
  D.	
  A.	
  Forsyth.	
  Automa5c	
  annota5on	
  of	
  everyday	
  movements.	
  In	
  Advances	
  in	
  
     Neural	
  Informa5on	
  Processing	
  Systems	
  16,	
  2003.	
  
•    V.	
  Ferrari,	
  M.	
  Marin,	
  and	
  A.	
  Zisserman.	
  Pose	
  search:	
  retrieving	
  people	
  using	
  their	
  pose.	
  In	
  Proc.	
  IEEE	
  
     Comput.	
  Soc.	
  Conf.	
  Comput.	
  Vision	
  and	
  Paeern	
  Recogn.,	
  2009.	
  
•    Yang	
  Wang,	
  Hao	
  Jiang,	
  Mark	
  S.	
  Drew,	
  Ze-­‐Nian	
  Li,	
  and	
  Greg	
  Mori.	
  Unsupervised	
  discovery	
  of	
  ac5on	
  
     classes.	
  In	
  CVPR,	
  2006.	
  
•    Nazli	
  Ikizler-­‐Cinbis,	
  R.	
  Gokberk	
  Cinbis,	
  and	
  Stan	
  Sclaroff.	
  Learning	
  ac5ons	
  from	
  the	
  web.	
  In	
  IEEE	
  
     Interna5onal	
  Conference	
  on	
  Computer	
  Vision,	
  2009.	
  
•    Weilong	
  Yang,	
  Yang	
  Wang,	
  and	
  Greg	
  Mori.	
  Recognizing	
  human	
  ac5ons	
  from	
  s5ll	
  images	
  with	
  latent	
  
     poses.	
  In	
  Proc.	
  IEEE	
  Comput.	
  Soc.	
  Conf.	
  Comput.	
  Vision	
  and	
  Paeern	
  Recogn.,	
  2010.	
  
•    Bangpeng	
  Yao	
  and	
  Li	
  Fei-­‐Fei.	
  Modeling	
  mutual	
  context	
  of	
  object	
  and	
  human	
  pose	
  in	
  human-­‐object	
  
     interac5on	
  ac5vi5es.	
  In	
  Proc.	
  IEEE	
  Comput.	
  Soc.	
  Conf.	
  Comput.	
  Vision	
  and	
  Paeern	
  Recogn.,	
  2010.	
  
•    R.	
  Polana	
  and	
  R.C.	
  Nelson.	
  Detec5on	
  and	
  recogni5on	
  of	
  periodic,	
  nonrigid	
  mo5on.	
  In	
  IJCV	
  1997.	
  
•    S.M.	
  Seitz	
  and	
  C.R.	
  Dyer.	
  View	
  invariant	
  analysis	
  of	
  cyclic	
  mo5on.	
  In	
  IJCV	
  1997	
  
•    A.	
  Thangali	
  and	
  S.	
  Sclaroff.	
  Periodic	
  mo5on	
  detec5on	
  and	
  es5ma5on	
  via	
  space-­‐5me	
  sampling.	
  In	
  
     IEEE	
  Workshop	
  on	
  Mo5on	
  and	
  Video	
  Compu5ng,	
  2005.	
  
•    I.	
  Laptev,	
  S.J.	
  Belongie,	
  P.	
  Pérez	
  and	
  J.	
  Wills.	
  Periodic	
  mo5on	
  detec5on	
  and	
  segmenta5on	
  via	
  
     approximate	
  sequence	
  alignment,	
  In	
  Proc.	
  ICCV	
  2005	
  
•    P.	
  Wang,	
  G.D.	
  Abowd	
  and	
  J.M.	
  Rehg.	
  Quasi-­‐periodic	
  event	
  analysis	
  for	
  social	
  game	
  retrieval.	
  In	
  Proc	
  
     ICCV	
  2009	
  




•    D.	
  Weinland,	
  E.	
  Boyer,	
  and	
  R.	
  Ronfard.	
  Ac5on	
  recogni5on	
  from	
  arbitrary	
  views	
  using	
  3D	
  exemplars.	
  
     in	
  Proc.	
  ICCV	
  2007.	
  
•    A.	
  Farhadi	
  and	
  M.	
  Tabrizi.	
  Learning	
  to	
  recognize	
  ac5vi5es	
  from	
  the	
  wrong	
  view	
  point.	
  In	
  Proc.	
  ECCV	
  
     2008.	
  
•    I.	
  Junejo,	
  E.	
  Dexter,	
  I.	
  Laptev	
  and	
  Patrick	
  Pérez.	
  Cross-­‐view	
  ac5on	
  recogni5on	
  from	
  temporal	
  self-­‐
     similari5es.	
  In	
  Proc.	
  ECCV	
  2008	
  
•    A.	
  Farhadi,	
  M.	
  Kamali,	
  I.	
  Endres,	
  D.	
  Forsyth.	
  A	
  latent	
  model	
  of	
  discrimina5ve	
  aspect.	
  In	
  Proc.	
  ICCV	
  
     2009.	
  
•    X.	
  Wang,	
  X.	
  Ma,	
  and	
  E.	
  Grimson.	
  Unsupervised	
  ac5vity	
  percep5on	
  in	
  crowded	
  and	
  complicated	
  
     scenes	
  using	
  hierarchical	
  bayesian	
  models.	
  IEEE	
  Trans.	
  PAMI,	
  31(3):539–	
  555,	
  2009.	
  
•    Abhinav	
  Gupta,	
  Praveen	
  Srinivasan,	
  Jianbo	
  Shi,	
  and	
  Larry	
  S.	
  Davis.	
  Understanding	
  videos,	
  
     construc5ng	
  plots	
  -­‐	
  learning	
  a	
  visually	
  grounded	
  storyline	
  model	
  from	
  annotated	
  videos.	
  In	
  CVPR,	
  
     2009.	
  
•    T.	
  Xiang	
  and	
  S.	
  Gong.	
  Beyond	
  tracking:	
  Modelling	
  ac5vity	
  and	
  understanding	
  behaviour.	
  Int.	
  Journal	
  
     of	
  Computer	
  Vision,	
  67(1):21–51,	
  2006.	
  
•    G.	
  Medioni,	
  I.	
  Cohen,	
  F.	
  Bre	
  ́mond,	
  S.	
  Hongeng,	
  and	
  R.	
  Neva5a.	
  Event	
  detec5on	
  and	
  analysis	
  from	
  
     video	
  streams.	
  IEEE	
  Trans.	
  PAMI,	
  23(8):873–889,	
  2001.	
  
•    Y.	
  A.	
  Ivanov	
  and	
  A.	
  F.	
  Bobick.	
  Recogni5on	
  of	
  visual	
  ac5vi5es	
  and	
  interac5ons	
  by	
  stochas5c	
  parsing.	
  
     IEEE	
  Trans.	
  PAMI,	
  22(8):852–872,	
  2000.	
  
•    D.	
  Moore	
  and	
  I.	
  Essa.	
  Recognizing	
  mul5tasked	
  ac5vi5es	
  using	
  stochas5c	
  context-­‐free	
  grammar	
  using	
  
     video.	
  In	
  AAAI,	
  2002.	
  
•    Chen	
  Change	
  Loy,	
  Tao	
  Xiang,	
  and	
  Shaogang	
  Gong.	
  Modelling	
  ac5vity	
  global	
  temporal	
  dependencies	
  
     using	
  5me	
  delayed	
  probabilis5c	
  graphical	
  model.	
  In	
  ICCV,	
  2009.	
  
•    Xiaogang	
  Wang,	
  Keng	
  Teck	
  Ma,	
  Gee	
  Wah	
  Ng,	
  and	
  W.	
  Eric	
  L.	
  Grimson.	
  Trajectory	
  analysis	
  and	
  
     seman5c	
  region	
  modeling	
  using	
  a	
  nonparametric	
  bayesian	
  model.	
  In	
  Proc.	
  IEEE	
  Comput.	
  Soc.	
  Conf.	
  
     Comput.	
  Vision	
  and	
  Paeern	
  Recogn.,	
  2008.	
  
•    W.	
  Choi,	
  K.	
  Shahid,	
  and	
  S.	
  Savarese.	
  ”what	
  are	
  they	
  doing?	
  :	
  Collec5ve	
  ac5vity	
  classifica5on	
  using	
  
     spa5o-­‐temporal	
  rela5onship	
  among	
  people”.	
  In	
  9th	
  Interna5onal	
  Workshop	
  on	
  Visual	
  Surveillance,	
  
     2009.	
  
•    Ramin	
  Mehran,	
  Alexis	
  Oyama,	
  and	
  Mubarak	
  Shah.	
  Abnormal	
  crowd	
  behavior	
  detec5on	
  using	
  social	
  
     force	
  model.	
  In	
  CVPR,	
  2009.	
  
Workshop	
  materials	
  available:	
  
heps://sites.google.com/site/
humanac5onstutorialeccv10/	
  

Contenu connexe

Plus de zukun

Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities zukun
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featureszukun
 
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...zukun
 
Quoc le tera-scale deep learning
Quoc le   tera-scale deep learningQuoc le   tera-scale deep learning
Quoc le tera-scale deep learningzukun
 
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...zukun
 
Lecun 20060816-ciar-01-energy based learning
Lecun 20060816-ciar-01-energy based learningLecun 20060816-ciar-01-energy based learning
Lecun 20060816-ciar-01-energy based learningzukun
 
Lecun 20060816-ciar-02-deep learning for generic object recognition
Lecun 20060816-ciar-02-deep learning for generic object recognitionLecun 20060816-ciar-02-deep learning for generic object recognition
Lecun 20060816-ciar-02-deep learning for generic object recognitionzukun
 
P05 deep boltzmann machines cvpr2012 deep learning methods for vision
P05 deep boltzmann machines cvpr2012 deep learning methods for visionP05 deep boltzmann machines cvpr2012 deep learning methods for vision
P05 deep boltzmann machines cvpr2012 deep learning methods for visionzukun
 

Plus de zukun (20)

Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 
Icml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant featuresIcml2012 learning hierarchies of invariant features
Icml2012 learning hierarchies of invariant features
 
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
ECCV2010: Modeling Temporal Structure of Decomposable Motion Segments for Act...
 
Quoc le tera-scale deep learning
Quoc le   tera-scale deep learningQuoc le   tera-scale deep learning
Quoc le tera-scale deep learning
 
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
Deep Learning workshop 2010: Deep Learning of Invariant Spatiotemporal Featur...
 
Lecun 20060816-ciar-01-energy based learning
Lecun 20060816-ciar-01-energy based learningLecun 20060816-ciar-01-energy based learning
Lecun 20060816-ciar-01-energy based learning
 
Lecun 20060816-ciar-02-deep learning for generic object recognition
Lecun 20060816-ciar-02-deep learning for generic object recognitionLecun 20060816-ciar-02-deep learning for generic object recognition
Lecun 20060816-ciar-02-deep learning for generic object recognition
 
P05 deep boltzmann machines cvpr2012 deep learning methods for vision
P05 deep boltzmann machines cvpr2012 deep learning methods for visionP05 deep boltzmann machines cvpr2012 deep learning methods for vision
P05 deep boltzmann machines cvpr2012 deep learning methods for vision
 

Dernier

Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxShobhayan Kirtania
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...anjaliyadav012327
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 

Dernier (20)

Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 

Pose-Based Structural Methods for Action Recognition

  • 2.
  • 3. G. Johansson, Moving Light Displays, 1973 •  Pose  seems  sufficient  for  certain  ac5on  categories   •  Remove  effects  of  clothing,  ligh5ng  varia5on  from  representa5on  
  • 4.
  • 5.
  • 6.
  • 7.
  • 8. •  Two  broad  classes  of  approaches   –  Match  templates  (exemplar-­‐based)   –  Fit  a  human  body  model  
  • 9. Mori & Malik PAMI 2005 Shakhnarovich, Viola & Darrell ICCV 2003 Bourdev & Malik ICCV 2009 Database of Exemplars
  • 10. Pictorial Structures model Felzenszwalb & Huttenlocher CVPR 2000 Ramanan NIPS 2006 Ferrari, Marin & Zisserman CVPR 2008
  • 11. •  Detect  corners  in  images/video   •  Assess  likelihood  under  ac5on-­‐specific  pose  model   •  Discriminate  between  walking  direc5ons,  bicycle  riding   Song, Goncalves & Perona NIPS 2001, PAMI 2003
  • 12. test sequence key frame •  Key  frame  matching  to  test  sequence  to  find   similar  poses   –  Shape  matching  on  edge  maps  using  order   structure   Sullivan & Carlsson ECCV 2002
  • 13. •  M  is  quan5zed  3d  pose   •  T  is  root  orienta5on     •  Automa5c  person  detec5on-­‐tracking   •  Compare  quan5zed  pose  to  labeled  training  poses   –  Smooth  over  5me   Ramanan & Forsyth NIPS 2003
  • 14. •  Video  shot  retrieval  from  pose   –  Either  query-­‐by-­‐example  or  classifica5on   –  Focus  on  upper  body  pose   •  Pictorial  structures  model   Ac5on   Label   Ferrari, Marin & Zisserman CVPR 2009
  • 15. query CODE AVAILABLE ONLINE •  SVM  on  descriptors  of  absolute  &  rela5ve  part  loca5ons,   segmenta5ons   –  Include  short  tracks  for  robustness  
  • 16. Golfing?   Walking?   •  Focus  on  discrimina5ve  elements  of  pose  for  ac5on  classifica5on   •  Use  exemplar-­‐based  “poselet”  representa5on   Yang, Wang & Mori CVPR 2010
  • 18. A: Activity A Tennis Croquet Volleyball Human pose forehand shot smash H O: Object O Tennis Croquet Volleyball Body parts racket mallet P1 P2 PN H: fO f1 f2 fN Intra-class variations •  More than one H for each A; Image evidence •  Unobserved during training. P: lP: location; θP: orientation; sP: scale. f: Shape context. [Belongie et al, 2002] Yao & Fei-Fei CVPR 2010
  • 20. •  Build  ac5on  models  from  web  search  results   Ikizler-Cinbis, Cinbis, Sclaroff ICCV 2009
  • 21. SLAG •  Find  repeated  poses  in  a  dataset   Wang, Jiang, Drew, Li, Mori CVPR 2006
  • 22. •  Person  loca5on   Riding horse Reading book Taking photo given   •  Classify  into  one  of  9   categories   Riding bike Play instrument Running Phoning Use computer Walking
  • 23. •  Pose  as  representa5on  for  ac5on  recogni5on   –  Captures  much  informa5on  about  ac5on   –  Invariance  to  clothing  /  ligh5ng  effects   –  Model  and  exemplar  based  representa5ons   •  New  direc5on:  Ac5on  recogni5on  from  s5ll   images   –  Image  retrieval  and  analysis   –  An  important  cue  for  video-­‐based  ac5on  recogni5on   –  Pose  seems  essen5al  
  • 24.
  • 25.
  • 26.
  • 27. •  Describe  low-­‐level  components   –  Ac5ons  of  individual  people   –  Movement  of  pixels   •  Iden5fy  key  objects  or  loca5ons  in  scene   –  Buildings,  roads,  etc.   •  Model  interac5ons  between  people,  objects,   and  loca5ons  
  • 28. •  Detect  and  track  moving   objects   •  Manually  iden5fy  key   regions  in  scene   –  E.g.  road,  checkpoint   •  Scenarios  describe   rela5ve  arrangements  of   objects  in  scene   –  E.g.  proximity  of  car  to   checkpoint   –  No5ons  of  scene  context   Medioni, Cohen, Bremond, Hongeng, Nevatia PAMI 2001
  • 29. •  Detect  and  track  players,   ball   •  Low-­‐level  ac5on   detectors  for  individual   players   •  Hand-­‐constructed  Bayes   net  for  each  ac5vity   –  Spa5al  and  temporal   rela5ons  between  low-­‐ level  ac5ons   Intille & Bobick CVPR 1999
  • 30. •  Global,  frame-­‐level  feature   –  Bag-­‐of-­‐words  representa5on   •  Detect  unusual  events  by  clustering   –  Isolated,  varied  clusters  are  unusual   Zhong, Shi & Visontai CVPR 2004
  • 31. •  Chea5ng  detec5on  in   •  Real-­‐world  highway  dataset   simulated  card  game   –  Cars  pulling  off  road,  backing   up,  U-­‐turns  
  • 32. •  Describe  moving  pixels   by  loca5on  and  mo5on   direc5on   –  No  object  detec5on   •  Use  as  visual  words  in   Latent  Dirichlet   Alloca5on  (LDA)  type   model   –  Infer  low-­‐level  ac5ons   from  words   Wang, Ma, Grimson PAMI 2009 Blei, Ng, Jordan JMLR 2003
  • 33. •  Higher-­‐level  ac5vity  analysis   –  Distribu5on  of  low-­‐level  ac5ons  over  en5re  scene   •  Applica5ons   –  Temporal  segmenta5on  by  ac5vity   –  Abnormality  detec5on  
  • 34. 0.8 A   B   0.3 0.5 C   •  Hierarchical  Dirichlet   Process  model   –  Learn  number  of  ac5vi5es   automa5cally   Kuettel, Breitenstein, van Gool & Ferrari CVPR 2010
  • 35. traffic  light  controlled  scene   •  con5nuous  video   •  annotated  with  states  and  history   •  3x  speed  
  • 36. Loy, Xiang & Gong CVPR, ICCV 2009
  • 37. •  Consider  5me-­‐delayed  correla5ons  between  regions   –  Applica5ons  to  irregularity  detec5on  
  • 38. y   activity class h1 h2 y   h action class … x1 x2 xn image x0 Choi, Shahid, & Savarese VS 2009 Lan, Wang, Yang, & Mori SGA 2010, NIPS 2010
  • 39. •  Cap5oned  baseball  videos   in  training   •  Build  AND-­‐OR  graph   representa5on  of  ac5vi5es   –  AND  specifies  elements  of   an  ac5vity  that  must  occur   –  OR  allows  varia5on  in  how   an  element  appears   •  Describe  low-­‐level  tracks   using  STIPs   •  Match  tracks  to  ac5ons  in   AND-­‐OR  graph   Gupta, Srinivasan, Shi, Davis CVPR 2009
  • 40. •  Scene  modeling  to  look  at  the  big  picture   •  Feature  representa5ons   –  Holis5c:  describe  en5re  scene,  irrespec5ve  of   individuals   –  Local:  describe  ac5ons  of  individuals   •  Structure  of  ac5vi5es   –  Model  free:  clustering-­‐type  approaches   –  Strong  models:  grammars,  probabilis5c  models  
  • 41.
  • 42.
  • 43. Objects: Actions: cars, glasses, drinking, running, people, etc… door exit, car enter constraints Scene categories: Geometry: indoors, outdoors, street Street, wall, scene, etc… field, stair, etc…
  • 44. Riding horse Reading book A 0.8 B 0.3 C 0.5 Riding bike Play instrument
  • 45. •  Standardiza5on  of  datasets  for  field   –  Allow  comparison  of  algorithms   •  E.g.  KTH  for  low-­‐level  features,  atomic  ac5ons   –  Fair  tuning  of  model  parameters   •  New  algorithms  compare  to  baselines   –  Bag-­‐of-­‐words  on  densely  sampled  STIPs   –  Pose  es5ma5on  (Ferrari  et  al.  code)   –  HOG  SVM  (Dalal  &  Triggs  code,  Ramanan  code)  
  • 46. •  Standardiza5on  of  datasets  for  field   –  Don’t  feel  constrained  by  the  exis5ng  problem   defini5ons   –  Do  make  your  new  dataset  available   •  Should  clearly  specify  separate  training  and  test  sets   •  New  algorithms  compare  to  baselines   –  Do  use  reasonable  variant  of  standard  baselines   for  your  new  problem  
  • 47.
  • 48. •  Even  atomic  low-­‐level  ac5ons  are  very  difficult   to  detect  reliably   –  Far  more  work  needed  on  representa5ons  for  the   ac5on  of  a  single  person   –  Features   –  Temporal  representa5on,  smoothing   –  Tracking   –  …  
  • 49. 1.  Cameras and bandwidth are cheap 2.  Lots of training data is potentially available Training + = data Potential for the huge progress … if we can get the data
  • 50. Aligned with video   Describes visual Source   content   Subtitles   Yes No   DVD, Internet Scripts for No   Yes   Internet, e.g. TV series, movies and www.dailyscript.com   sport games Plot summaries and No   Yes, sparsely   Internet synopses (e.g. IMDB)   Instruction videos No   Yes   Internet, e.g. www.videojug.com   Descriptive Video Service Yes   Yes   DVD, rare   Word tags No   Yes, sparsely   Internet (e.g. YouTube)   Manual labelling, ??   ??   Mechanical Turk, Human Computation ESP Game, Grad undergrad students
  • 51. Open questions:   How to benefit from the structure of the human body in complex situations, e.g. heavy occlusions, uniformly colored clothing?   Will action classification generalize over different video domains: Movies, TV, YouTube, Surveillance video?   What is the useful action vocabulary? Are we trying to solve the right problem? How can we visualize/display the results? Interesting novel directions:   Use actions for recognizing functional and physical object properties, e.g. “sitable”, “eatable”, “heavy”, “solid” objects…   Action prediction, i.e. what can happen in the given situation: e.g. is it dangerous to cross this road?   Explore more sources of strong and weak supervision: Manual surveillance, Descriptive Video Service (DVS), YouTube tags; Transcripts of sports games; Instruction videos.
  • 52. •  P.  Viola,  M.  Jones,  and  D.  Snow.  Detec5ng  pedestrians  using  paeerns  of  mo5on  and  appearance.  In   Proc.  9th  Int.  Conf.  Computer  Vision,  pages  734–741,  2003.   •  N.  Dalal  and  B.  Triggs.  Histograms  of  oriented  gradients  for  human  detec5on.  In  Proc.  IEEE  Comput.   Soc.  Conf.  Comput.  Vision  and  Paeern  Recogn.,  2005.   •  Bo  Wu  and  Ram  Neva5a.  Detec5on  of  mul5ple,  par5ally  occluded  humans  in  a  single  image  by   bayesian  combina5on  of  edgelet  part  detectors.  In  Proc.  10th  Int.  Conf.  Computer  Vision,  2005.   •  Pedro  Felzenszwalb,  David  McAllester,  and  Deva  Ramanan.  A  discrimina5vely  trained,  mul5scale,   deformable  part  model.  In  IEEE  Computer  Society  Conference  on  Computer  Vision  and  Paeern   Recogni5on,  2008.   •  Chris  Stauffer  and  W.E.L.  Grimson.  Adap5ve  background  mixture  models  for  real-­‐  5me  tracking.  In   Proc.  IEEE  Comput.  Soc.  Conf.  Comput.  Vision  and  Paeern  Recogn.,  1999.   •  Kentaro  Toyama,  John  Krumm,  Barry  Brumie,  and  Brian  Meyers.  Wallflower:  Principles  and  prac5ce   of  background  maintenance.  In  Proc.  7th  Int.  Conf.  Computer  Vision,  1999.   •  J.L.  Barron,  D.J.  Fleet,  and  S.S.  Beauchemin.  Performance  of  op5cal  flow  techniques.  Int.  Journal  of   Computer  Vision,  12(1):43–77,  1994.   •  T.  Brox,  C.  Bregler,  and  J.  Malik.  Large  displacement  op5cal  flow.  In  Proc.  IEEE  Comput.  Soc.  Conf.   Comput.  Vision  and  Paeern  Recogn.,  2009.   •  M.  Isard  and  A.  Blake.  Condensa5on  –  condi5onal  density  propaga5on  for  visual  tracking.  Int.   Journal  of  Computer  Vision,  29(1):5–28,  1998.   •  Yuan  Li,  Chang  Huang,  and  Ram  Neva5a.  Learning  to  associate:  Hybridboosted  mul5-­‐target  tracker   for  crowded  scene.  In  Proc.  IEEE  Comput.  Soc.  Conf.  Comput.  Vision  and  Paeern  Recogn.,  2009.  
  • 53. •  W.  T.  Freeman,  K.  Tanaka,  J.Ohta,  and  K.  Kyuma.  Computer  vision  for  computer  games.  In  IEEE  2nd   Intl.  Conf.  on  Automa5c  Face  and  Gesture  Recogni5on,  1996.   •  J.  Sullivan  and  S.  Carlsson.  Recognizing  and  tracking  human  ac5on.  In  ECCV  2002   •  A.A.  Efros,  A.C.  Berg,  G.  Mori,  and  J.  Malik.  Recognizing  ac5on  at  a  distance.  In  ICCV  2003   •  A.  Bobick  and  J.  Davis.  The  recogni5on  of  human  movement  using  temporal  templates.  IEEE  Trans.   PAMI,  23(3):257–267,  2001.   •  L.  Zelnik-­‐Manor  and  M.  Irani.  Event-­‐based  video  analysis.  In  CVPR  2001   •  E.  Shechtman  and  M.  Irani.  Space-­‐5me  behavior  based  correla5on.  In  CVPR  2005   •  O.  Boiman  and  M.  Irani.  Detec5ng  irregulari5es  in  images  and  in  video.  In  Proc.  ICCV,  2005.   •  M.  Blank,  L.  Gorelick,  E.  Shechtman,  M.  Irani,  and  R.  Basri.  Ac5ons  as  space-­‐5me  shapes.  In  Proc.   ICCV,  2005.   •  Y.  Ke,  R.  Sukthankar,  and  M.  Hebert.  Efficient  Visual  Event  Detec5on  using  Volumetric  Features  .  In   Proc.  ICCV  2005.   •  Y.  Ke,  R.  Sukthankar,  and  M.  Hebert.  Event  detec5on  in  crowded  videos.  In  Proc.  ICCV,  2007.   •  I.  Laptev  and  P.  Pérez.  Retrieving  ac5ons  in  movies.  In  Proc.  ICCV  2007   •  D.  Weinland  and  E.  Boyer.  Ac5on  recogni5on  using  exemplar-­‐based  embedding.  In  Proc.  CVPR,   2008.   •  Z.  Lin,  Z.  Jiang,  and  L.  S.  Davis.  Recognizing  ac5ons  by  shape-­‐mo5on  prototype  trees.  In  Proc.  ICCV,   2009.  
  • 54. •  I.  Laptev  and  T.  Lindeberg.  Space-­‐5me  interest  points.  In  Proc.  ICCV  2003.   •  C.  Schuldt,  I.  Laptev,  and  B.  Caputo.  Recognizing  human  ac5ons:  A  local  svm  approach.  In  Proc.  ICPR,  2004.   •  P.  Dollar,  V.  Rabaud,  G.  Coerell,  and  Serge  Belongie.  Behavior  recogni5on  via  sparse  spa5o-­‐temporal  features.   In  VS-­‐PETS,  2005.   •  H.  Jhuang,  T.  Serre,  L.  Wolf  and  T.  Poggio.  A  Biologically  Inspired  System  for  Ac5on  Recogni5on.  In  Proc.  ICCV   2007   •  P.  Scovanner,  S.  Ali,  and  M.  Shah,  A  3-­‐Dimensional  SIFT  descriptor  and  its  applica5on  to  ac5on  recogni5on,   ACM  MM  2007.     •  J.  C.  Niebles,  H.  Wang,  and  L.  Fei-­‐Fei.  Unsupervised  learning  of  human  ac5on  categories  using  spa5al-­‐ temporal  words.  In  IJCV  2008.   •  I.  Laptev,  M.  Marszalek,  C.  Schmid,  and  B.  Rozenfeld.  Learning  realis5c  human  ac5ons  from  movies.  In  Proc.   CVPR  2008.   •  A.  Klaeser,  M.  Marszałek  and  C.  Schmid.  A  spa5o-­‐temporal  descriptor  based  on  3D-­‐gradients.  In  Proc.  BMVC   2008   •  G.  Willems,  T.  Tuytelaars  and  L.  Van  Gool.  An  Efficient  Dense  and  Scale-­‐Invariant  Spa5o-­‐Temporal  Interest   Point  Detector.  In  Proc.  ECCV  2008   •  H.  Wang,  M.  M.  Ullah,  A.  Kläser,  I.  Laptev  and  C.  Schmid.  Evalua5on  of  local  spa5o-­‐temporal  features  for   ac5on  recogni5on.  In  Proc.  BMVC  2009.   •  L.  Yeffet  and  L.  Wolf.  Local  Trinary  Paeerns  for  Human  Ac5on  Recogni5on.  In  Proc.  ICCV  2009.     •  A.  Gilbert,  J.  Illingworth,  R.  Bowden.  Fast  realis5c  mul5-­‐ac5on  recogni5on  using  mined  dense  spa5o-­‐ temporal  features,  In  Proc.  ICCV  2009.   •  P.  Ma5kainen,  M.  Hebert,  R.  Sukthankar.  Trajectons:  Ac5on  recogni5on  through  the  mo5on  analysis  of   tracked  features.  ICCV  workshop  on  Video-­‐oriented  Object  and  Event  Classifica5on,  2009   •  M.  M.  Ullah,  S.  N.  Parizi,  I.  Laptev.  Improving  bag-­‐of-­‐features  ac5on  recogni5on  with  non-­‐local  cues.  In  Proc.   BMVC  2010  
  • 55. •  Y.  Song,  L.  Goncalves,  and  P.  Perona.  Unsupervised  learning  of  human  mo5on.  IEEE  Trans.  PAMI,  25 (7):814–827,  2003.   •  D.  Ramanan  and  D.  A.  Forsyth.  Automa5c  annota5on  of  everyday  movements.  In  Advances  in   Neural  Informa5on  Processing  Systems  16,  2003.   •  V.  Ferrari,  M.  Marin,  and  A.  Zisserman.  Pose  search:  retrieving  people  using  their  pose.  In  Proc.  IEEE   Comput.  Soc.  Conf.  Comput.  Vision  and  Paeern  Recogn.,  2009.   •  Yang  Wang,  Hao  Jiang,  Mark  S.  Drew,  Ze-­‐Nian  Li,  and  Greg  Mori.  Unsupervised  discovery  of  ac5on   classes.  In  CVPR,  2006.   •  Nazli  Ikizler-­‐Cinbis,  R.  Gokberk  Cinbis,  and  Stan  Sclaroff.  Learning  ac5ons  from  the  web.  In  IEEE   Interna5onal  Conference  on  Computer  Vision,  2009.   •  Weilong  Yang,  Yang  Wang,  and  Greg  Mori.  Recognizing  human  ac5ons  from  s5ll  images  with  latent   poses.  In  Proc.  IEEE  Comput.  Soc.  Conf.  Comput.  Vision  and  Paeern  Recogn.,  2010.   •  Bangpeng  Yao  and  Li  Fei-­‐Fei.  Modeling  mutual  context  of  object  and  human  pose  in  human-­‐object   interac5on  ac5vi5es.  In  Proc.  IEEE  Comput.  Soc.  Conf.  Comput.  Vision  and  Paeern  Recogn.,  2010.  
  • 56. •  R.  Polana  and  R.C.  Nelson.  Detec5on  and  recogni5on  of  periodic,  nonrigid  mo5on.  In  IJCV  1997.   •  S.M.  Seitz  and  C.R.  Dyer.  View  invariant  analysis  of  cyclic  mo5on.  In  IJCV  1997   •  A.  Thangali  and  S.  Sclaroff.  Periodic  mo5on  detec5on  and  es5ma5on  via  space-­‐5me  sampling.  In   IEEE  Workshop  on  Mo5on  and  Video  Compu5ng,  2005.   •  I.  Laptev,  S.J.  Belongie,  P.  Pérez  and  J.  Wills.  Periodic  mo5on  detec5on  and  segmenta5on  via   approximate  sequence  alignment,  In  Proc.  ICCV  2005   •  P.  Wang,  G.D.  Abowd  and  J.M.  Rehg.  Quasi-­‐periodic  event  analysis  for  social  game  retrieval.  In  Proc   ICCV  2009   •  D.  Weinland,  E.  Boyer,  and  R.  Ronfard.  Ac5on  recogni5on  from  arbitrary  views  using  3D  exemplars.   in  Proc.  ICCV  2007.   •  A.  Farhadi  and  M.  Tabrizi.  Learning  to  recognize  ac5vi5es  from  the  wrong  view  point.  In  Proc.  ECCV   2008.   •  I.  Junejo,  E.  Dexter,  I.  Laptev  and  Patrick  Pérez.  Cross-­‐view  ac5on  recogni5on  from  temporal  self-­‐ similari5es.  In  Proc.  ECCV  2008   •  A.  Farhadi,  M.  Kamali,  I.  Endres,  D.  Forsyth.  A  latent  model  of  discrimina5ve  aspect.  In  Proc.  ICCV   2009.  
  • 57. •  X.  Wang,  X.  Ma,  and  E.  Grimson.  Unsupervised  ac5vity  percep5on  in  crowded  and  complicated   scenes  using  hierarchical  bayesian  models.  IEEE  Trans.  PAMI,  31(3):539–  555,  2009.   •  Abhinav  Gupta,  Praveen  Srinivasan,  Jianbo  Shi,  and  Larry  S.  Davis.  Understanding  videos,   construc5ng  plots  -­‐  learning  a  visually  grounded  storyline  model  from  annotated  videos.  In  CVPR,   2009.   •  T.  Xiang  and  S.  Gong.  Beyond  tracking:  Modelling  ac5vity  and  understanding  behaviour.  Int.  Journal   of  Computer  Vision,  67(1):21–51,  2006.   •  G.  Medioni,  I.  Cohen,  F.  Bre  ́mond,  S.  Hongeng,  and  R.  Neva5a.  Event  detec5on  and  analysis  from   video  streams.  IEEE  Trans.  PAMI,  23(8):873–889,  2001.   •  Y.  A.  Ivanov  and  A.  F.  Bobick.  Recogni5on  of  visual  ac5vi5es  and  interac5ons  by  stochas5c  parsing.   IEEE  Trans.  PAMI,  22(8):852–872,  2000.   •  D.  Moore  and  I.  Essa.  Recognizing  mul5tasked  ac5vi5es  using  stochas5c  context-­‐free  grammar  using   video.  In  AAAI,  2002.   •  Chen  Change  Loy,  Tao  Xiang,  and  Shaogang  Gong.  Modelling  ac5vity  global  temporal  dependencies   using  5me  delayed  probabilis5c  graphical  model.  In  ICCV,  2009.   •  Xiaogang  Wang,  Keng  Teck  Ma,  Gee  Wah  Ng,  and  W.  Eric  L.  Grimson.  Trajectory  analysis  and   seman5c  region  modeling  using  a  nonparametric  bayesian  model.  In  Proc.  IEEE  Comput.  Soc.  Conf.   Comput.  Vision  and  Paeern  Recogn.,  2008.   •  W.  Choi,  K.  Shahid,  and  S.  Savarese.  ”what  are  they  doing?  :  Collec5ve  ac5vity  classifica5on  using   spa5o-­‐temporal  rela5onship  among  people”.  In  9th  Interna5onal  Workshop  on  Visual  Surveillance,   2009.   •  Ramin  Mehran,  Alexis  Oyama,  and  Mubarak  Shah.  Abnormal  crowd  behavior  detec5on  using  social   force  model.  In  CVPR,  2009.  
  • 58. Workshop  materials  available:   heps://sites.google.com/site/ humanac5onstutorialeccv10/