SlideShare une entreprise Scribd logo
1  sur  30
Télécharger pour lire hors ligne
Estimating Human Pose from
     Occluded Images
   Jia-Bin Huang and Ming-Hsuan Yang
      {jbhuang,mhyang}@ieee.org




     Electrical Engineering and Computer Science
            University of California at Merced


      ACCV, Xi’ an, September , 2009

                                                   1 / 25
What this talk is about
   Estimating human poses from single 2D images as a direct
   nonlinear regression
   Recovering poses even when humans in the scenes are partially
   or heavily occluded via sparse approximation
   Achieving (implicitly) relevant feature selection




                                                               2 / 25
Outline



1   Introduction


2   Recovering Poses via Sparse Approximation


3   Experimental Setups


4   Conclusions




                                                3 / 25
Introduction




               4 / 25
Motivation - Applications


   Motion Analysis
       Diagnostics of orthopedic patients
       Analysis and optimization of an athletes’ performances
       Content-based retrieval and compression of video
       Auto saftey: control of airbags, drowsiness detection, pedestrian
       detection, etc.
   Surveillance
       People counting or crowd flux, flow, and congestion analysis
       Analysis of actions, activities, and behaviors both for crowds and
       individuals
   Human Computer Interaction
       Virtual Reality




                                                                            5 / 25
Motivation - Applications


   Motion Analysis
       Diagnostics of orthopedic patients
       Analysis and optimization of an athletes’ performances
       Content-based retrieval and compression of video
       Auto saftey: control of airbags, drowsiness detection, pedestrian
       detection, etc.
   Surveillance
       People counting or crowd flux, flow, and congestion analysis
       Analysis of actions, activities, and behaviors both for crowds and
       individuals
   Human Computer Interaction
       Virtual Reality




                                                                            5 / 25
Motivation - Applications


   Motion Analysis
       Diagnostics of orthopedic patients
       Analysis and optimization of an athletes’ performances
       Content-based retrieval and compression of video
       Auto saftey: control of airbags, drowsiness detection, pedestrian
       detection, etc.
   Surveillance
       People counting or crowd flux, flow, and congestion analysis
       Analysis of actions, activities, and behaviors both for crowds and
       individuals
   Human Computer Interaction
       Virtual Reality




                                                                            5 / 25
Key Challenges
Why the problem is difficult?




      Ambiguity: Loss of depth information
      Variations of shape and appearance of human body
      Background clutters
      Occlusions
In this paper, we address the problems of occlusions and background
clutters.




                                                                  6 / 25
Related Works

Model-based (Generative)
    Employ a know model (e.g., tree structure) based on prior
    knowledge
    Include two parts: 1) Modeling, 2) Estimation
[Felzenszwalb, Huttenlocher ’00], [Ioeffe et al. ’01], [Ronfard et al. ’02],
[Ramnan, Forsyth ’03], [Mori, Malik, ’04]

Model-free (Discriminative)
    Example-based
    [Shakhnarovich et al. ’03], [Poppe ’07]
    Learning-based
    [Rosales, Sclaroff, ’01], [Agarwal, Triggs ’06], [Sminchisescu et al.
    ’07], [Bissacco et al. 07], [Okada, Soatto ’08]

                                                                          7 / 25
Recovering Poses via Sparse
      Approximation




                              8 / 25
Problem Formulation

   Given: N training samples {x1 , y1 }, {x2 , y2 } · · · , {xN , yN }
   Input: test sample b
   Output: pose descriptor vector y




        Input: single image                          Output: pose


                                                                         9 / 25
Test Image as a Sparse Linear Combination of
Training Images



   Given N training samples x1 , x2 , · · · , xN ∈ IRm , we represent a test
   sample b by
                     b = ω1 x1 + ω2 x2 + · · · + ωN xN
   Let A = [x1 , x2 , · · · , xN ] ∈ IRm×N ,

                                        b = Aω,

   where ω = [ω1 , ω2 , . . . , ωN ]T is the coefficient vector.
   We want to find the sparsest solution of the linear system.




                                                                          10 / 25
Goal: Find the sparsest solution
Recover the sparsest solution via       1 -norm   minimization




       0 -norm   solution–NP hard

                             min ω         0   subject to b = Aω
                             ω

       1 -norm   solution–convex optimization problem

                             min ω         1   subject to b = Aω
                             ω

      Relaxed constraint on equality to allow small noise

                        min ω       1     subject to       b − Aω   2   ≤
                         ω




                                                                            11 / 25
Coping with Background Clutter and Occlusions


   When errors occur? 1) Misalignment, 2) Background clutter, 3)
   Occlusions.
   Introduce an error term e
                                          ω
                     b = Aω + e = [A I]     = Bv
                                          e

   Solve the extended linear system

                min ||v||1 subject to     b − Bv   2   ≤

   Recovered test sample bR can be represented as Aω




                                                                   12 / 25
Occlusion Recovery: An Example




Figure: Occlusion recovery on a synthetic dataset. (a)(b) The original input
image and its feature. (c) Corrupted feature via adding random block. (d)
Recovered feature via find the sparsest solution. (e) Reconstruction error.




                                                                               13 / 25
Handling Background Clutter: An Example




Figure: Feature selection example. (a) Original test image. (b) The HOG
feature descriptor computed from (a). (c) Recovered feature vector by our
algorithm. (d) The reconstruction error.




                                                                            14 / 25
Experimental Results




                       15 / 25
Experimental Results

Settings
      1.8 GHz PC with 2 GB RAM
      Matlab implementation

 1   solver
       1   magic (second-order cone programming, SOCP)
      Sparse Lab
      Many other packages...

Regressor: Map image feature to pose vector
      Gaussian Process Regressor
      Relevant Vector Regressor [Agarwal, Triggs PAMI ’06]
      Support Vector Regressor

                                                             16 / 25
Experimental Results

Settings
      1.8 GHz PC with 2 GB RAM
      Matlab implementation

 1   solver
       1   magic (second-order cone programming, SOCP)
      Sparse Lab
      Many other packages...

Regressor: Map image feature to pose vector
      Gaussian Process Regressor
      Relevant Vector Regressor [Agarwal, Triggs PAMI ’06]
      Support Vector Regressor

                                                             16 / 25
Experimental Results

Settings
      1.8 GHz PC with 2 GB RAM
      Matlab implementation

 1   solver
       1   magic (second-order cone programming, SOCP)
      Sparse Lab
      Many other packages...

Regressor: Map image feature to pose vector
      Gaussian Process Regressor
      Relevant Vector Regressor [Agarwal, Triggs PAMI ’06]
      Support Vector Regressor

                                                             16 / 25
Robustness to Occlusions

Synthetic data [Agarwal, Triggs PAMI ’06]
    1927 silhouette image for training and 418 images for testing
    Image descriptor: PCA and downsampled images
    Pose vector: 55-dimensional vector describe the joint angle (in
    degree)

Sample test images




       (a) 0.1   (b) 0.2   (c) 0.3   (d) 0.4   (e) 0.5   (f) 0.6

                                                                      17 / 25
Robustness to Occlusions
Results on synthetic data set


      Estimating pose from original (blue), corrupted (green), and
      recovered (red) testing images




                    (a) PCA                    (b) Downsample
Figure: Average error of pose estimation on synthetic data set using different
features: (a) PCA with 20 coefficients. (b) downsampled (20×20) images.


Localized features are more suitable for error correction!
                                                                            18 / 25
Robustness to Occlusions


Real Database: HumanEva I [Sigal, Black TR ’06]
   Synchronized image and motion capture data
   Four views of four subjects
   Six predefined actions (walking, jogging, gesturing,
   throwing/catching, boxing, combo)
   We use the common motion walking sequences of three subjects
   from the first camera

Synthesized testing images
   For each image, we randomly generate two occluding blocks with
   various corruption level for synthesizing images with occlusions.



                                                                  19 / 25
Robustness to Occlusions


Real Database: HumanEva I [Sigal, Black TR ’06]
   Synchronized image and motion capture data
   Four views of four subjects
   Six predefined actions (walking, jogging, gesturing,
   throwing/catching, boxing, combo)
   We use the common motion walking sequences of three subjects
   from the first camera

Synthesized testing images
   For each image, we randomly generate two occluding blocks with
   various corruption level for synthesizing images with occlusions.



                                                                  19 / 25
Robustness to Occlusions
HumanEva I: Sample test images




     (a) 0.1     (b) 0.2     (c) 0.3   (d) 0.4   (e) 0.5   (f) 0.6



                                                                     20 / 25
Robustness to Occlusions
Results on HumanEva data set


     Estimating pose from original (blue), corrupted (green), and
     recovered (red) testing images




          (a) S1                  (b) S2                   (C) S3

Figure: Results of pose estimation on HumanEva data set I in walking
sequences. (a) Subject 1. (b) Subject 2. (c) Subject 3.



                                                                       21 / 25
Robustness to Background Clutters
Implicit relevant feature selection

      Compared with the original feature representation (HOG), the
      estimations induced from recovered feature are better.
      The improvements of mean position errors (mm) are 4.89, 10.84,
      and 7.87 for S1, S2, and S3, respectively




          Figure: Mean 3D error plots for the walking sequences (S2).

                                                                        22 / 25
Conclusions




              23 / 25
Conclusions



   Spare approximation can be used for recovering errors resulted
   from occlusions when estimating humans from occluded images
   Without occlusions, the recovered features are still better than the
   original ones
   The recovery of feature representation is independent from the
   learning process, i.e., our method can be adopted as a
   preprocessing module of other model-free pose estimation
   approaches




                                                                     24 / 25
Thank You! Questions?




                        25 / 25

Contenu connexe

Tendances

A Novel Methodology for Designing Linear Phase IIR Filters
A Novel Methodology for Designing Linear Phase IIR FiltersA Novel Methodology for Designing Linear Phase IIR Filters
A Novel Methodology for Designing Linear Phase IIR FiltersIDES Editor
 
CG OpenGL line & area-course 3
CG OpenGL line & area-course 3CG OpenGL line & area-course 3
CG OpenGL line & area-course 3fungfung Chen
 
CS 354 Shadows (cont'd) and Scene Graphs
CS 354 Shadows (cont'd) and Scene GraphsCS 354 Shadows (cont'd) and Scene Graphs
CS 354 Shadows (cont'd) and Scene GraphsMark Kilgard
 
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)Yusuke Uchida
 
Fcv bio cv_weiss
Fcv bio cv_weissFcv bio cv_weiss
Fcv bio cv_weisszukun
 
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...홍배 김
 
CG OpenGL 3D object representations-course 8
CG OpenGL 3D object representations-course 8CG OpenGL 3D object representations-course 8
CG OpenGL 3D object representations-course 8fungfung Chen
 
Tennis video shot classification based on support vector
Tennis video shot classification based on support vectorTennis video shot classification based on support vector
Tennis video shot classification based on support vectores712
 
Chapter 1 introduction (Image Processing)
Chapter 1 introduction (Image Processing)Chapter 1 introduction (Image Processing)
Chapter 1 introduction (Image Processing)Varun Ojha
 
Compressed Sensing and Tomography
Compressed Sensing and TomographyCompressed Sensing and Tomography
Compressed Sensing and TomographyGabriel Peyré
 
改进的固定点图像复原算法_英文_阎雪飞
改进的固定点图像复原算法_英文_阎雪飞改进的固定点图像复原算法_英文_阎雪飞
改进的固定点图像复原算法_英文_阎雪飞alen yan
 
Color Img at Prisma Network meeting 2009
Color Img at Prisma Network meeting 2009Color Img at Prisma Network meeting 2009
Color Img at Prisma Network meeting 2009Juan Luis Nieves
 
Sparsity and Compressed Sensing
Sparsity and Compressed SensingSparsity and Compressed Sensing
Sparsity and Compressed SensingGabriel Peyré
 
CS 354 Final Exam Review
CS 354 Final Exam ReviewCS 354 Final Exam Review
CS 354 Final Exam ReviewMark Kilgard
 
Signal Processing Course : Compressed Sensing
Signal Processing Course : Compressed SensingSignal Processing Course : Compressed Sensing
Signal Processing Course : Compressed SensingGabriel Peyré
 

Tendances (20)

Foreground Detection : Combining Background Subspace Learning with Object Smo...
Foreground Detection : Combining Background Subspace Learning with Object Smo...Foreground Detection : Combining Background Subspace Learning with Object Smo...
Foreground Detection : Combining Background Subspace Learning with Object Smo...
 
A Novel Methodology for Designing Linear Phase IIR Filters
A Novel Methodology for Designing Linear Phase IIR FiltersA Novel Methodology for Designing Linear Phase IIR Filters
A Novel Methodology for Designing Linear Phase IIR Filters
 
CG OpenGL line & area-course 3
CG OpenGL line & area-course 3CG OpenGL line & area-course 3
CG OpenGL line & area-course 3
 
CS 354 Shadows (cont'd) and Scene Graphs
CS 354 Shadows (cont'd) and Scene GraphsCS 354 Shadows (cont'd) and Scene Graphs
CS 354 Shadows (cont'd) and Scene Graphs
 
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
 
Fcv bio cv_weiss
Fcv bio cv_weissFcv bio cv_weiss
Fcv bio cv_weiss
 
BMC 2012
BMC 2012BMC 2012
BMC 2012
 
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
ARCHITECTURAL CONDITIONING FOR DISENTANGLEMENT OF OBJECT IDENTITY AND POSTURE...
 
CG OpenGL 3D object representations-course 8
CG OpenGL 3D object representations-course 8CG OpenGL 3D object representations-course 8
CG OpenGL 3D object representations-course 8
 
Deblurring in ct
Deblurring in ctDeblurring in ct
Deblurring in ct
 
Tennis video shot classification based on support vector
Tennis video shot classification based on support vectorTennis video shot classification based on support vector
Tennis video shot classification based on support vector
 
Chapter 1 introduction (Image Processing)
Chapter 1 introduction (Image Processing)Chapter 1 introduction (Image Processing)
Chapter 1 introduction (Image Processing)
 
Compressed Sensing and Tomography
Compressed Sensing and TomographyCompressed Sensing and Tomography
Compressed Sensing and Tomography
 
改进的固定点图像复原算法_英文_阎雪飞
改进的固定点图像复原算法_英文_阎雪飞改进的固定点图像复原算法_英文_阎雪飞
改进的固定点图像复原算法_英文_阎雪飞
 
Color Img at Prisma Network meeting 2009
Color Img at Prisma Network meeting 2009Color Img at Prisma Network meeting 2009
Color Img at Prisma Network meeting 2009
 
Sparsity and Compressed Sensing
Sparsity and Compressed SensingSparsity and Compressed Sensing
Sparsity and Compressed Sensing
 
CS 354 Final Exam Review
CS 354 Final Exam ReviewCS 354 Final Exam Review
CS 354 Final Exam Review
 
Nanotechnology
NanotechnologyNanotechnology
Nanotechnology
 
Signal Processing Course : Compressed Sensing
Signal Processing Course : Compressed SensingSignal Processing Course : Compressed Sensing
Signal Processing Course : Compressed Sensing
 
Presentation v3.2
Presentation v3.2Presentation v3.2
Presentation v3.2
 

En vedette

Transformation Guided Image Completion ICCP 2013
Transformation Guided Image Completion ICCP 2013Transformation Guided Image Completion ICCP 2013
Transformation Guided Image Completion ICCP 2013Jia-Bin Huang
 
Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...
Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...
Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...Jia-Bin Huang
 
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012Jia-Bin Huang
 
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)Jia-Bin Huang
 
How to Read Academic Papers
How to Read Academic PapersHow to Read Academic Papers
How to Read Academic PapersJia-Bin Huang
 
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Jia-Bin Huang
 
Lecture 21 - Image Categorization - Computer Vision Spring2015
Lecture 21 - Image Categorization -  Computer Vision Spring2015Lecture 21 - Image Categorization -  Computer Vision Spring2015
Lecture 21 - Image Categorization - Computer Vision Spring2015Jia-Bin Huang
 
Linear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorialLinear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorialJia-Bin Huang
 
Writing Fast MATLAB Code
Writing Fast MATLAB CodeWriting Fast MATLAB Code
Writing Fast MATLAB CodeJia-Bin Huang
 
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015Jia-Bin Huang
 
Three Reasons to Join FVE at uiuc
Three Reasons to Join FVE at uiucThree Reasons to Join FVE at uiuc
Three Reasons to Join FVE at uiucJia-Bin Huang
 
Applying for Graduate School in S.T.E.M.
Applying for Graduate School in S.T.E.M.Applying for Graduate School in S.T.E.M.
Applying for Graduate School in S.T.E.M.Jia-Bin Huang
 
Jia-Bin Huang's Curriculum Vitae
Jia-Bin Huang's Curriculum VitaeJia-Bin Huang's Curriculum Vitae
Jia-Bin Huang's Curriculum VitaeJia-Bin Huang
 
UIUC CS 498 - Computational Photography - Final project presentation
UIUC CS 498 - Computational Photography - Final project presentation UIUC CS 498 - Computational Photography - Final project presentation
UIUC CS 498 - Computational Photography - Final project presentation Jia-Bin Huang
 
美國研究所申請流程 (A Guide for Applying Graduate Schools in USA)
美國研究所申請流程 (A Guide for Applying Graduate Schools in USA)美國研究所申請流程 (A Guide for Applying Graduate Schools in USA)
美國研究所申請流程 (A Guide for Applying Graduate Schools in USA)Jia-Bin Huang
 
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)Jia-Bin Huang
 
What Makes a Creative Photograph?
What Makes a Creative Photograph?What Makes a Creative Photograph?
What Makes a Creative Photograph?Jia-Bin Huang
 
Computer Vision Crash Course
Computer Vision Crash CourseComputer Vision Crash Course
Computer Vision Crash CourseJia-Bin Huang
 
How to come up with new research ideas
How to come up with new research ideasHow to come up with new research ideas
How to come up with new research ideasJia-Bin Huang
 
Research 101 - Paper Writing with LaTeX
Research 101 - Paper Writing with LaTeXResearch 101 - Paper Writing with LaTeX
Research 101 - Paper Writing with LaTeXJia-Bin Huang
 

En vedette (20)

Transformation Guided Image Completion ICCP 2013
Transformation Guided Image Completion ICCP 2013Transformation Guided Image Completion ICCP 2013
Transformation Guided Image Completion ICCP 2013
 
Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...
Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...
Toward Accurate and Robust Cross-Ratio based Gaze Trackers Through Learning F...
 
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
Saliency Detection via Divergence Analysis: A Unified Perspective ICPR 2012
 
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
 
How to Read Academic Papers
How to Read Academic PapersHow to Read Academic Papers
How to Read Academic Papers
 
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
Image Completion using Planar Structure Guidance (SIGGRAPH 2014)
 
Lecture 21 - Image Categorization - Computer Vision Spring2015
Lecture 21 - Image Categorization -  Computer Vision Spring2015Lecture 21 - Image Categorization -  Computer Vision Spring2015
Lecture 21 - Image Categorization - Computer Vision Spring2015
 
Linear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorialLinear Algebra and Matlab tutorial
Linear Algebra and Matlab tutorial
 
Writing Fast MATLAB Code
Writing Fast MATLAB CodeWriting Fast MATLAB Code
Writing Fast MATLAB Code
 
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015Lecture 29 Convolutional Neural Networks -  Computer Vision Spring2015
Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015
 
Three Reasons to Join FVE at uiuc
Three Reasons to Join FVE at uiucThree Reasons to Join FVE at uiuc
Three Reasons to Join FVE at uiuc
 
Applying for Graduate School in S.T.E.M.
Applying for Graduate School in S.T.E.M.Applying for Graduate School in S.T.E.M.
Applying for Graduate School in S.T.E.M.
 
Jia-Bin Huang's Curriculum Vitae
Jia-Bin Huang's Curriculum VitaeJia-Bin Huang's Curriculum Vitae
Jia-Bin Huang's Curriculum Vitae
 
UIUC CS 498 - Computational Photography - Final project presentation
UIUC CS 498 - Computational Photography - Final project presentation UIUC CS 498 - Computational Photography - Final project presentation
UIUC CS 498 - Computational Photography - Final project presentation
 
美國研究所申請流程 (A Guide for Applying Graduate Schools in USA)
美國研究所申請流程 (A Guide for Applying Graduate Schools in USA)美國研究所申請流程 (A Guide for Applying Graduate Schools in USA)
美國研究所申請流程 (A Guide for Applying Graduate Schools in USA)
 
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)
Single Image Super-Resolution from Transformed Self-Exemplars (CVPR 2015)
 
What Makes a Creative Photograph?
What Makes a Creative Photograph?What Makes a Creative Photograph?
What Makes a Creative Photograph?
 
Computer Vision Crash Course
Computer Vision Crash CourseComputer Vision Crash Course
Computer Vision Crash Course
 
How to come up with new research ideas
How to come up with new research ideasHow to come up with new research ideas
How to come up with new research ideas
 
Research 101 - Paper Writing with LaTeX
Research 101 - Paper Writing with LaTeXResearch 101 - Paper Writing with LaTeX
Research 101 - Paper Writing with LaTeX
 

Similaire à Estimating Human Pose from Occluded Images via Sparse Approximation

Using HOG Descriptors on Superpixels for Human Detection of UAV Imagery
Using HOG Descriptors on Superpixels for Human Detection of UAV ImageryUsing HOG Descriptors on Superpixels for Human Detection of UAV Imagery
Using HOG Descriptors on Superpixels for Human Detection of UAV ImageryWai Nwe Tun
 
Image De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural NetworkImage De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural Networkaciijournal
 
Visual Object Tracking: Action-Decision Networks (ADNets)
Visual Object Tracking: Action-Decision Networks (ADNets)Visual Object Tracking: Action-Decision Networks (ADNets)
Visual Object Tracking: Action-Decision Networks (ADNets)Yehya Abouelnaga
 
Image De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural NetworkImage De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural Networkaciijournal
 
IMAGE DE-NOISING USING DEEP NEURAL NETWORK
IMAGE DE-NOISING USING DEEP NEURAL NETWORKIMAGE DE-NOISING USING DEEP NEURAL NETWORK
IMAGE DE-NOISING USING DEEP NEURAL NETWORKaciijournal
 
Robust Real Time Face Detection
Robust Real Time Face DetectionRobust Real Time Face Detection
Robust Real Time Face DetectionSyed Zaid Irshad
 
Digital Image Processing - Image Restoration
Digital Image Processing - Image RestorationDigital Image Processing - Image Restoration
Digital Image Processing - Image RestorationMathankumar S
 
Sparse representation based human action recognition using an action region-a...
Sparse representation based human action recognition using an action region-a...Sparse representation based human action recognition using an action region-a...
Sparse representation based human action recognition using an action region-a...Wesley De Neve
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习AdaboostShocky1
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slideswolf
 
Face detection ppt by Batyrbek
Face detection ppt by Batyrbek Face detection ppt by Batyrbek
Face detection ppt by Batyrbek Batyrbek Ryskhan
 
Improvement of Image Deblurring Through Different Methods
Improvement of Image Deblurring Through Different MethodsImprovement of Image Deblurring Through Different Methods
Improvement of Image Deblurring Through Different MethodsIOSR Journals
 
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...IRJET Journal
 
Conference_paper.pdf
Conference_paper.pdfConference_paper.pdf
Conference_paper.pdfNarenRajVivek
 
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...IRJET Journal
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in ImagesAnil Kumar Gupta
 
Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform
Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon TransformHuman Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform
Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon TransformFadwa Fouad
 
Uniform and non uniform single image deblurring based on sparse representatio...
Uniform and non uniform single image deblurring based on sparse representatio...Uniform and non uniform single image deblurring based on sparse representatio...
Uniform and non uniform single image deblurring based on sparse representatio...ijma
 
Real-time Face Recognition & Detection Systems 1
Real-time Face Recognition & Detection Systems 1Real-time Face Recognition & Detection Systems 1
Real-time Face Recognition & Detection Systems 1Suvadip Shome
 

Similaire à Estimating Human Pose from Occluded Images via Sparse Approximation (20)

Using HOG Descriptors on Superpixels for Human Detection of UAV Imagery
Using HOG Descriptors on Superpixels for Human Detection of UAV ImageryUsing HOG Descriptors on Superpixels for Human Detection of UAV Imagery
Using HOG Descriptors on Superpixels for Human Detection of UAV Imagery
 
Image De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural NetworkImage De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural Network
 
Visual Object Tracking: Action-Decision Networks (ADNets)
Visual Object Tracking: Action-Decision Networks (ADNets)Visual Object Tracking: Action-Decision Networks (ADNets)
Visual Object Tracking: Action-Decision Networks (ADNets)
 
Image De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural NetworkImage De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural Network
 
IMAGE DE-NOISING USING DEEP NEURAL NETWORK
IMAGE DE-NOISING USING DEEP NEURAL NETWORKIMAGE DE-NOISING USING DEEP NEURAL NETWORK
IMAGE DE-NOISING USING DEEP NEURAL NETWORK
 
Robust Real Time Face Detection
Robust Real Time Face DetectionRobust Real Time Face Detection
Robust Real Time Face Detection
 
Digital Image Processing - Image Restoration
Digital Image Processing - Image RestorationDigital Image Processing - Image Restoration
Digital Image Processing - Image Restoration
 
Lec11 object-re-id
Lec11 object-re-idLec11 object-re-id
Lec11 object-re-id
 
Sparse representation based human action recognition using an action region-a...
Sparse representation based human action recognition using an action region-a...Sparse representation based human action recognition using an action region-a...
Sparse representation based human action recognition using an action region-a...
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习Adaboost
 
Avihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slidesAvihu Efrat's Viola and Jones face detection slides
Avihu Efrat's Viola and Jones face detection slides
 
Face detection ppt by Batyrbek
Face detection ppt by Batyrbek Face detection ppt by Batyrbek
Face detection ppt by Batyrbek
 
Improvement of Image Deblurring Through Different Methods
Improvement of Image Deblurring Through Different MethodsImprovement of Image Deblurring Through Different Methods
Improvement of Image Deblurring Through Different Methods
 
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
 
Conference_paper.pdf
Conference_paper.pdfConference_paper.pdf
Conference_paper.pdf
 
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
An Approach for Image Deblurring: Based on Sparse Representation and Regulari...
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
 
Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform
Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon TransformHuman Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform
Human Action Recognition in Videos Employing 2DPCA on 2DHOOF and Radon Transform
 
Uniform and non uniform single image deblurring based on sparse representatio...
Uniform and non uniform single image deblurring based on sparse representatio...Uniform and non uniform single image deblurring based on sparse representatio...
Uniform and non uniform single image deblurring based on sparse representatio...
 
Real-time Face Recognition & Detection Systems 1
Real-time Face Recognition & Detection Systems 1Real-time Face Recognition & Detection Systems 1
Real-time Face Recognition & Detection Systems 1
 

Plus de Jia-Bin Huang

How to write a clear paper
How to write a clear paperHow to write a clear paper
How to write a clear paperJia-Bin Huang
 
Real-time Face Detection and Recognition
Real-time Face Detection and RecognitionReal-time Face Detection and Recognition
Real-time Face Detection and RecognitionJia-Bin Huang
 
Pose aware online visual tracking
Pose aware online visual trackingPose aware online visual tracking
Pose aware online visual trackingJia-Bin Huang
 
Face Expression Enhancement
Face Expression EnhancementFace Expression Enhancement
Face Expression EnhancementJia-Bin Huang
 
Image Smoothing for Structure Extraction
Image Smoothing for Structure ExtractionImage Smoothing for Structure Extraction
Image Smoothing for Structure ExtractionJia-Bin Huang
 
Static and Dynamic Hand Gesture Recognition
Static and Dynamic Hand Gesture RecognitionStatic and Dynamic Hand Gesture Recognition
Static and Dynamic Hand Gesture RecognitionJia-Bin Huang
 
Real-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes RecognitionReal-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes RecognitionJia-Bin Huang
 
Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)Jia-Bin Huang
 
Information Preserving Color Transformation for Protanopia and Deuteranopia (...
Information Preserving Color Transformation for Protanopia and Deuteranopia (...Information Preserving Color Transformation for Protanopia and Deuteranopia (...
Information Preserving Color Transformation for Protanopia and Deuteranopia (...Jia-Bin Huang
 
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)Jia-Bin Huang
 
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)Jia-Bin Huang
 

Plus de Jia-Bin Huang (11)

How to write a clear paper
How to write a clear paperHow to write a clear paper
How to write a clear paper
 
Real-time Face Detection and Recognition
Real-time Face Detection and RecognitionReal-time Face Detection and Recognition
Real-time Face Detection and Recognition
 
Pose aware online visual tracking
Pose aware online visual trackingPose aware online visual tracking
Pose aware online visual tracking
 
Face Expression Enhancement
Face Expression EnhancementFace Expression Enhancement
Face Expression Enhancement
 
Image Smoothing for Structure Extraction
Image Smoothing for Structure ExtractionImage Smoothing for Structure Extraction
Image Smoothing for Structure Extraction
 
Static and Dynamic Hand Gesture Recognition
Static and Dynamic Hand Gesture RecognitionStatic and Dynamic Hand Gesture Recognition
Static and Dynamic Hand Gesture Recognition
 
Real-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes RecognitionReal-Time Face Detection, Tracking, and Attributes Recognition
Real-Time Face Detection, Tracking, and Attributes Recognition
 
Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)
 
Information Preserving Color Transformation for Protanopia and Deuteranopia (...
Information Preserving Color Transformation for Protanopia and Deuteranopia (...Information Preserving Color Transformation for Protanopia and Deuteranopia (...
Information Preserving Color Transformation for Protanopia and Deuteranopia (...
 
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
Enhancing Color Representation for the Color Vision Impaired (CVAVI 2008)
 
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)Learning Moving Cast Shadows for Foreground Detection (VS 2008)
Learning Moving Cast Shadows for Foreground Detection (VS 2008)
 

Estimating Human Pose from Occluded Images via Sparse Approximation

  • 1. Estimating Human Pose from Occluded Images Jia-Bin Huang and Ming-Hsuan Yang {jbhuang,mhyang}@ieee.org Electrical Engineering and Computer Science University of California at Merced ACCV, Xi’ an, September , 2009 1 / 25
  • 2. What this talk is about Estimating human poses from single 2D images as a direct nonlinear regression Recovering poses even when humans in the scenes are partially or heavily occluded via sparse approximation Achieving (implicitly) relevant feature selection 2 / 25
  • 3. Outline 1 Introduction 2 Recovering Poses via Sparse Approximation 3 Experimental Setups 4 Conclusions 3 / 25
  • 4. Introduction 4 / 25
  • 5. Motivation - Applications Motion Analysis Diagnostics of orthopedic patients Analysis and optimization of an athletes’ performances Content-based retrieval and compression of video Auto saftey: control of airbags, drowsiness detection, pedestrian detection, etc. Surveillance People counting or crowd flux, flow, and congestion analysis Analysis of actions, activities, and behaviors both for crowds and individuals Human Computer Interaction Virtual Reality 5 / 25
  • 6. Motivation - Applications Motion Analysis Diagnostics of orthopedic patients Analysis and optimization of an athletes’ performances Content-based retrieval and compression of video Auto saftey: control of airbags, drowsiness detection, pedestrian detection, etc. Surveillance People counting or crowd flux, flow, and congestion analysis Analysis of actions, activities, and behaviors both for crowds and individuals Human Computer Interaction Virtual Reality 5 / 25
  • 7. Motivation - Applications Motion Analysis Diagnostics of orthopedic patients Analysis and optimization of an athletes’ performances Content-based retrieval and compression of video Auto saftey: control of airbags, drowsiness detection, pedestrian detection, etc. Surveillance People counting or crowd flux, flow, and congestion analysis Analysis of actions, activities, and behaviors both for crowds and individuals Human Computer Interaction Virtual Reality 5 / 25
  • 8. Key Challenges Why the problem is difficult? Ambiguity: Loss of depth information Variations of shape and appearance of human body Background clutters Occlusions In this paper, we address the problems of occlusions and background clutters. 6 / 25
  • 9. Related Works Model-based (Generative) Employ a know model (e.g., tree structure) based on prior knowledge Include two parts: 1) Modeling, 2) Estimation [Felzenszwalb, Huttenlocher ’00], [Ioeffe et al. ’01], [Ronfard et al. ’02], [Ramnan, Forsyth ’03], [Mori, Malik, ’04] Model-free (Discriminative) Example-based [Shakhnarovich et al. ’03], [Poppe ’07] Learning-based [Rosales, Sclaroff, ’01], [Agarwal, Triggs ’06], [Sminchisescu et al. ’07], [Bissacco et al. 07], [Okada, Soatto ’08] 7 / 25
  • 10. Recovering Poses via Sparse Approximation 8 / 25
  • 11. Problem Formulation Given: N training samples {x1 , y1 }, {x2 , y2 } · · · , {xN , yN } Input: test sample b Output: pose descriptor vector y Input: single image Output: pose 9 / 25
  • 12. Test Image as a Sparse Linear Combination of Training Images Given N training samples x1 , x2 , · · · , xN ∈ IRm , we represent a test sample b by b = ω1 x1 + ω2 x2 + · · · + ωN xN Let A = [x1 , x2 , · · · , xN ] ∈ IRm×N , b = Aω, where ω = [ω1 , ω2 , . . . , ωN ]T is the coefficient vector. We want to find the sparsest solution of the linear system. 10 / 25
  • 13. Goal: Find the sparsest solution Recover the sparsest solution via 1 -norm minimization 0 -norm solution–NP hard min ω 0 subject to b = Aω ω 1 -norm solution–convex optimization problem min ω 1 subject to b = Aω ω Relaxed constraint on equality to allow small noise min ω 1 subject to b − Aω 2 ≤ ω 11 / 25
  • 14. Coping with Background Clutter and Occlusions When errors occur? 1) Misalignment, 2) Background clutter, 3) Occlusions. Introduce an error term e ω b = Aω + e = [A I] = Bv e Solve the extended linear system min ||v||1 subject to b − Bv 2 ≤ Recovered test sample bR can be represented as Aω 12 / 25
  • 15. Occlusion Recovery: An Example Figure: Occlusion recovery on a synthetic dataset. (a)(b) The original input image and its feature. (c) Corrupted feature via adding random block. (d) Recovered feature via find the sparsest solution. (e) Reconstruction error. 13 / 25
  • 16. Handling Background Clutter: An Example Figure: Feature selection example. (a) Original test image. (b) The HOG feature descriptor computed from (a). (c) Recovered feature vector by our algorithm. (d) The reconstruction error. 14 / 25
  • 18. Experimental Results Settings 1.8 GHz PC with 2 GB RAM Matlab implementation 1 solver 1 magic (second-order cone programming, SOCP) Sparse Lab Many other packages... Regressor: Map image feature to pose vector Gaussian Process Regressor Relevant Vector Regressor [Agarwal, Triggs PAMI ’06] Support Vector Regressor 16 / 25
  • 19. Experimental Results Settings 1.8 GHz PC with 2 GB RAM Matlab implementation 1 solver 1 magic (second-order cone programming, SOCP) Sparse Lab Many other packages... Regressor: Map image feature to pose vector Gaussian Process Regressor Relevant Vector Regressor [Agarwal, Triggs PAMI ’06] Support Vector Regressor 16 / 25
  • 20. Experimental Results Settings 1.8 GHz PC with 2 GB RAM Matlab implementation 1 solver 1 magic (second-order cone programming, SOCP) Sparse Lab Many other packages... Regressor: Map image feature to pose vector Gaussian Process Regressor Relevant Vector Regressor [Agarwal, Triggs PAMI ’06] Support Vector Regressor 16 / 25
  • 21. Robustness to Occlusions Synthetic data [Agarwal, Triggs PAMI ’06] 1927 silhouette image for training and 418 images for testing Image descriptor: PCA and downsampled images Pose vector: 55-dimensional vector describe the joint angle (in degree) Sample test images (a) 0.1 (b) 0.2 (c) 0.3 (d) 0.4 (e) 0.5 (f) 0.6 17 / 25
  • 22. Robustness to Occlusions Results on synthetic data set Estimating pose from original (blue), corrupted (green), and recovered (red) testing images (a) PCA (b) Downsample Figure: Average error of pose estimation on synthetic data set using different features: (a) PCA with 20 coefficients. (b) downsampled (20×20) images. Localized features are more suitable for error correction! 18 / 25
  • 23. Robustness to Occlusions Real Database: HumanEva I [Sigal, Black TR ’06] Synchronized image and motion capture data Four views of four subjects Six predefined actions (walking, jogging, gesturing, throwing/catching, boxing, combo) We use the common motion walking sequences of three subjects from the first camera Synthesized testing images For each image, we randomly generate two occluding blocks with various corruption level for synthesizing images with occlusions. 19 / 25
  • 24. Robustness to Occlusions Real Database: HumanEva I [Sigal, Black TR ’06] Synchronized image and motion capture data Four views of four subjects Six predefined actions (walking, jogging, gesturing, throwing/catching, boxing, combo) We use the common motion walking sequences of three subjects from the first camera Synthesized testing images For each image, we randomly generate two occluding blocks with various corruption level for synthesizing images with occlusions. 19 / 25
  • 25. Robustness to Occlusions HumanEva I: Sample test images (a) 0.1 (b) 0.2 (c) 0.3 (d) 0.4 (e) 0.5 (f) 0.6 20 / 25
  • 26. Robustness to Occlusions Results on HumanEva data set Estimating pose from original (blue), corrupted (green), and recovered (red) testing images (a) S1 (b) S2 (C) S3 Figure: Results of pose estimation on HumanEva data set I in walking sequences. (a) Subject 1. (b) Subject 2. (c) Subject 3. 21 / 25
  • 27. Robustness to Background Clutters Implicit relevant feature selection Compared with the original feature representation (HOG), the estimations induced from recovered feature are better. The improvements of mean position errors (mm) are 4.89, 10.84, and 7.87 for S1, S2, and S3, respectively Figure: Mean 3D error plots for the walking sequences (S2). 22 / 25
  • 28. Conclusions 23 / 25
  • 29. Conclusions Spare approximation can be used for recovering errors resulted from occlusions when estimating humans from occluded images Without occlusions, the recovered features are still better than the original ones The recovery of feature representation is independent from the learning process, i.e., our method can be adopted as a preprocessing module of other model-free pose estimation approaches 24 / 25