Object Recognition with Pictorial Structures

                Pedro F. Felzenszwalb
                University of Chicago
                 pff@cs.uchicago.edu

        Joint work with Daniel P. Huttenlocher
Pictorial structures


Part-based representation:

 • Each part models local visual properties.

 • “Springs” model spatial relationships.

 • Joint estimation of part locations.

    – No hard detection of parts or features.

    – No initialization parameters.


• The model is represented by a graph $G = (V, E)$.

   – $V = \{v_1, \ldots, v_n\}$ are the parts.

   – $(v_i, v_j) \in E$ indicates a connection between parts.

• $m_i(l_i)$ is the cost of placing part $i$ at location $l_i$.

• $d_{ij}(l_i, l_j)$ is a deformation cost.

• The optimal location for the object is given by $L^* = (l_1^*, \ldots, l_n^*)$,

$$L^* = \operatorname*{argmin}_{L} \left( \sum_{i=1}^{n} m_i(l_i) + \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j) \right)$$
Efficient minimization

$$L^* = \operatorname*{argmin}_{L} \left( \sum_{i=1}^{n} m_i(l_i) + \sum_{(v_i, v_j) \in E} d_{ij}(l_i, l_j) \right)$$

• $n$ parts and $h$ locations give $h^n$ configurations.

• If the graph is a tree, we can use dynamic programming.

   – $O(nh^2)$: much better, but still slow.

• If $d_{ij}(l_i, l_j) = \|T_{ij}(l_i) - T_{ji}(l_j)\|^2$, we can use distance transforms (DT).

   – $O(nh)$: as good as matching each part separately!
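
The gap between exhaustive search and dynamic programming on a tree can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: locations are the integers 0..h-1, `m[i][l]` holds the match cost, `edges` lists (parent, child) pairs with part 0 as the root and every child numbered higher than its parent, and `d` is a toy deformation cost with an ideal offset of +1.

```python
import itertools

def energy(m, edges, d, L):
    """Total cost of configuration L: match costs plus deformation costs."""
    return (sum(m[i][L[i]] for i in range(len(m)))
            + sum(d(i, j, L[i], L[j]) for i, j in edges))

def brute_force(m, edges, d):
    """Exhaustive search over all h**n configurations."""
    n, h = len(m), len(m[0])
    return min(itertools.product(range(h), repeat=n),
               key=lambda L: energy(m, edges, d, L))

def tree_dp(m, edges, d):
    """Dynamic programming on a tree: O(n h^2) evaluations of d.

    Assumes edges are (parent, child) pairs, part 0 is the root, and each
    child has a larger index than its parent.
    """
    n, h = len(m), len(m[0])
    parent = {c: p for p, c in edges}
    # B[i][l]: best cost of the subtree rooted at part i when part i is at l.
    B = [list(row) for row in m]
    # choice[c][lp]: best location of child c when its parent is at lp.
    choice = [[0] * h for _ in range(n)]
    for c in range(n - 1, 0, -1):          # children before parents
        p = parent[c]
        for lp in range(h):
            costs = [B[c][lc] + d(p, c, lp, lc) for lc in range(h)]
            best = min(range(h), key=costs.__getitem__)
            choice[c][lp] = best
            B[p][lp] += costs[best]
    # Pick the best root location, then trace the stored choices downward.
    L = [0] * n
    L[0] = min(range(h), key=B[0].__getitem__)
    for c in range(1, n):                  # parents before children
        L[c] = choice[c][L[parent[c]]]
    return B[0][L[0]], tuple(L)

# Hypothetical toy model: 4 parts, 5 locations, tree 0-1, 0-2, 2-3.
m = [[3, 1, 4, 1, 5], [9, 2, 6, 5, 3], [5, 8, 9, 7, 9], [3, 2, 3, 8, 4]]
edges = [(0, 1), (0, 2), (2, 3)]

def d(i, j, li, lj):
    """Toy deformation cost: the child prefers to sit one step right of its parent."""
    return (lj - li - 1) ** 2

cost, L = tree_dp(m, edges, d)
print(cost, L, energy(m, edges, d, brute_force(m, edges, d)))
```

The dynamic program visits each edge once and, for every parent location, scans all child locations, which is where the $h^2$ factor comes from. The inner scan is the step that a distance transform would replace.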

Distance transform
Given a set of points on a grid, $P \subseteq G$, the quadratic distance transform of $P$ is

$$D_P(q) = \min_{p \in P} \|q - p\|^2$$

[Figure: a point set $P$ and its distance transform $D_P$.]
Generalized distance transform


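Given a function $f : G \to \mathbb{R}$,

$$D_f(q) = \min_{p \in G} \left( \|q - p\|^2 + f(p) \right)$$

 – For each location $q$, find a nearby location $p$ with $f(p)$ small.

 – Equals the DT of the points $P$ if $f$ is an indicator function:

$$f(p) = \begin{cases} 0 & \text{if } p \in P \\ \infty & \text{otherwise} \end{cases}$$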
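
As a small check of the indicator-function claim, the following Python sketch evaluates both definitions by brute force on a 1D grid. The grid size, the point set, and the function names are assumptions chosen for illustration, and the quadratic-time evaluation is deliberately naive.

```python
import math

def distance_transform_points(points, h):
    """D_P(q) = min over p in P of (q - p)^2, for q = 0..h-1 (brute force)."""
    return [min((q - p) ** 2 for p in points) for q in range(h)]

def generalized_dt(f):
    """D_f(q) = min over all grid p of (q - p)^2 + f(p) (brute force, O(h^2))."""
    h = len(f)
    return [min((q - p) ** 2 + f[p] for p in range(h)) for q in range(h)]

h = 8                      # hypothetical grid size
P = {1, 6}                 # hypothetical point set
indicator = [0 if p in P else math.inf for p in range(h)]

print(distance_transform_points(P, h))   # [1, 0, 1, 4, 4, 1, 0, 1]
print(generalized_dt(indicator))         # [1, 0, 1, 4, 4, 1, 0, 1]
```

Locations outside $P$ contribute an infinite cost, so the minimum is always attained at a point of $P$ and the two transforms coincide.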



1D case: $D_f(q) = \min_{p \in G} \left( (q - p)^2 + f(p) \right)$

For each $p$, $D_f(q)$ lies on or below the parabola rooted at $(p, f(p))$.

$D_f(q)$ is defined by the lower envelope of $h$ parabolas.

[Figure: parabolas with values labeled $f(0)$, $f(1)$, $f(2)$, ..., $f(h-1)$ over grid positions $0, 1, 2, \ldots, h-1$.]
There is a simple geometric algorithm that computes $D_f$ in $O(h)$ time for the 1D case.

 – Similar to Graham’s scan convex hull algorithm.

 – About 20 lines of C code.

The 2D case is “separable”: it can be solved by sequential 1D transformations along the rows and columns of the grid.

See “Distance Transforms of Sampled Functions”, Felzenszwalb and Huttenlocher.
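
The lower-envelope idea can be sketched as follows. This is a Python rendering written for this transcript, not the C code the slide refers to; the names `dt1d` and `dt2d` are assumptions, and the sketch assumes every value of `f` is finite (a large finite constant can stand in for infinity).

```python
import math

def dt1d(f):
    """1D generalized distance transform via the lower envelope of parabolas.

    Assumes all values in f are finite. Runs in O(h) for h samples.
    """
    n = len(f)
    d = [0.0] * n
    v = [0] * n            # grid positions of the parabolas in the envelope
    z = [0.0] * (n + 1)    # boundaries between consecutive envelope parabolas
    k = 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            p = v[k]
            # Horizontal position where parabolas rooted at p and q intersect.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1     # parabola p is hidden by q, drop it
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    # Read the envelope off from left to right.
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        p = v[k]
        d[q] = (q - p) ** 2 + f[p]
    return d

def dt2d(grid):
    """2D transform: 1D passes along the columns, then along the rows."""
    rows, cols = len(grid), len(grid[0])
    out = [[float(x) for x in row] for row in grid]
    for x in range(cols):
        col = dt1d([out[y][x] for y in range(rows)])
        for y in range(rows):
            out[y][x] = col[y]
    for y in range(rows):
        out[y] = dt1d(out[y])
    return out

f = [4, 1, 7, 3, 0, 9, 2, 5]     # hypothetical sampled function
print(dt1d(f))
```

The stack-like handling of `v` and `z`, where a new parabola pops any earlier ones it hides, is the part that resembles Graham's scan. Each parabola is pushed once and popped at most once, which gives the linear running time.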




Simple face model

• Locations are positions in the image grid.

• Match cost $m_i(l_i)$ for placing part $i$ at $l_i$.

• Central part $v_1$: the nose.

• Each part has an ideal position $p_i$ relative to the nose.

  – Let $T_{1i}(l_1) = l_1 + p_i$,

$$E(l_1, \ldots, l_n) = \sum_{i=1}^{n} m_i(l_i) + \sum_{i=2}^{n} \|l_i - T_{1i}(l_1)\|^2$$
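Efficient minimization

$$L^* = \operatorname*{argmin}_{L} \left( \sum_{i=1}^{n} m_i(l_i) + \sum_{i=2}^{n} \|l_i - T_{1i}(l_1)\|^2 \right)$$

$$L^* = \operatorname*{argmin}_{L} \left( m_1(l_1) + \sum_{i=2}^{n} \left( m_i(l_i) + \|l_i - T_{1i}(l_1)\|^2 \right) \right)$$

$$l_1^* = \operatorname*{argmin}_{l_1} \left( m_1(l_1) + \sum_{i=2}^{n} \min_{l_i} \left( m_i(l_i) + \|l_i - T_{1i}(l_1)\|^2 \right) \right)$$

$$l_1^* = \operatorname*{argmin}_{l_1} \left( m_1(l_1) + \sum_{i=2}^{n} D_{m_i}(T_{1i}(l_1)) \right)$$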
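
The last step of the derivation can be checked numerically with a small Python sketch. It assumes a 1D grid of locations, uses index 0 for the central part, and takes hypothetical match costs and offsets. The generalized distance transform is evaluated by brute force here to keep the example short, so this version costs O(n h^2); a linear-time transform would bring it to O(nh).

```python
import itertools

def gdt_at(f, q):
    """D_f(q) = min over grid p of (q - p)^2 + f(p); q may fall off the grid."""
    return min((q - p) ** 2 + f[p] for p in range(len(f)))

def best_root(m, offsets):
    """Minimize m_1(l_1) + sum over other parts of D_{m_i}(l_1 + p_i) over l_1."""
    h = len(m[0])

    def score(l1):
        return m[0][l1] + sum(gdt_at(m[i], l1 + offsets[i])
                              for i in range(1, len(m)))

    l1 = min(range(h), key=score)
    return l1, score(l1)

def place_parts(m, offsets, l1):
    """Given the root location, each remaining part is placed independently."""
    h = len(m[0])
    L = [l1]
    for i in range(1, len(m)):
        target = l1 + offsets[i]
        L.append(min(range(h), key=lambda li: m[i][li] + (li - target) ** 2))
    return tuple(L)

def exhaustive_min(m, offsets):
    """Minimum of E(l_1, ..., l_n) over all h**n configurations."""
    n, h = len(m), len(m[0])

    def energy(L):
        return (sum(m[i][L[i]] for i in range(n))
                + sum((L[i] - (L[0] + offsets[i])) ** 2 for i in range(1, n)))

    return min(energy(L) for L in itertools.product(range(h), repeat=n))

# Hypothetical toy model: part 0 is the central part, 6 grid locations.
m = [[4, 2, 6, 1, 5, 3], [7, 3, 0, 8, 2, 6], [5, 9, 4, 4, 1, 7]]
offsets = [0, -2, 2]       # ideal position of each part relative to part 0

l1, cost = best_root(m, offsets)
print(l1, cost, place_parts(m, offsets, l1), exhaustive_min(m, offsets))
```

Because every non-central part interacts only with the central one, the inner minimizations decouple once $l_1$ is fixed, which is what lets each of them be replaced by a single transform lookup.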
Matching results

[Two slides of result figures; the images are not available in this transcript.]
Summary


• Generic framework for part-based modeling.


• Global minimization for deformable objects can be fast.


• Soft detection avoids unnecessary early decisions.


• Partial occlusion is handled automatically.


