SlideShare une entreprise Scribd logo
1  sur  40
Télécharger pour lire hors ligne
Robert Collins
CSE486, Penn State




                         Lecture 31:
                Object Recognition: SIFT Keys
Robert Collins
CSE486, Penn State
                        Motivation
        • Want to recognize a known objects from
          unknown viewpoints.




                              find them in an image
   database of models
Robert Collins

                 Local Feature based Approaches
CSE486, Penn State




         • Represent appearance of object by little
           intensity/feature patches.
         • Try to match patches from object to image
         • Geometrically consistent matches tell you
           the location and pose of the object
Robert Collins
CSE486, Penn State
                           Simple Example
         • Represent object by set of 11x11 intensity
           templates extracted around Harris corners.




                 harris corners    our object “model”
Robert Collins
CSE486, Penn State
                     Simple Example
         • Match patches to new image using NCC.
Robert Collins
CSE486, Penn State
                      Simple Example
         • Find matches consistent with affine
           transformation using RANSAC
Robert Collins
CSE486, Penn State
                       Simple Example
         • Inlier matches let you solve for location and
           pose of object in the image.
Robert Collins
CSE486, Penn State
                     Problem with Simple Example
         Using NCC to match intensity patches puts
           restrictions on the amount of overall rotation and
           scaling allowed between the model and the image
           appearance.
                                   model template

                     ncc               ncc
                           ncc



         matches well            no match       no match
Robert Collins
CSE486, Penn State
                       More General : SIFT Keys




David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of
Computer Vision, 60, 2 (2004), pp. 91-110.
Robert Collins
CSE486, Penn State
                     SIFT Keys: General Idea
        • Reliably extract same image points regardless of
             new magnification and rotation of the image.
        • Normalize image patches, extract feature vector
        • Match feature vectors using correlation
Robert Collins
CSE486, Penn State
                           SIFT Keys: General Idea

         Want to detect/match same features regardless of

         Translation : easy, almost every feature extraction and correlation
         matching algorithm in vision is translation invariant

         Rotation : harder. Guess a canonical orientation for each patch from
         local gradients

         Scaling : hardest of all.     Create a multi-scale representation of the image
         and appeal to scale space theory to determine correct scale at each point.
Robert Collins
CSE486, Penn State
                        Recall: Scale Space
   Basic idea: different scales are appropriate for describing different
   objects in the image, and we may not know the correct scale/size
   ahead of time.
Robert Collins

                     Scale Selection
CSE486, Penn State




                               “Laplacian” operator.
Robert Collins
CSE486, Penn State
                      Local Scale Space Maxima
         Lindeberg proposes that the natural scale for describing a feature
         is the scale at which a normalized derivative for detecting that
         feature achieves a local maximum both spatially and in scale.




      DnormL is the
      DoG operator, in
      this case.                                 Scale


                                Example for
                                blob detection
Robert Collins
CSE486, Penn State
                       Recall: LoG Blob Finding
      LoG filter extrema locates “blobs”
           maxima = dark blobs on light background
           minima = light blobs on dark background

      Scale of blob (size ; radius in pixels) is determined
      by the sigma parameter of the LoG filter.




                     LoG sigma = 2          LoG sigma = 10
Robert Collins
CSE486, Penn State
                     Extrema in Space and Scale

                                                   DOG level L-1
          Scale
          (sigma)                                  DOG level L


                                                   DOG level L+1


                                                        Space
              Hint: when finding maxima or minima at level L,
              use DownSample or UpSample as necessary to make
              DOG images at level L-1 and L+1 the same size as L.
Robert Collins
CSE486, Penn State
                           SIFT Keys: General Idea

         Want to detect/match same features regardless of

         Translation : easy, almost every feature extraction and correlation
         matching algorithm in vision is translation invariant

         Rotation : harder. Guess a canonical orientation for each patch from
         local gradients

         Scaling : hardest of all.     Create a multi-scale representation of the image
         and appeal to scale space theory to determine correct scale at each point.
Robert Collins
CSE486, Penn State
                     Sift Key Steps
Robert Collins
CSE486, Penn State
                     Example




                      •Keypoint location = extrema location
                      •Keypoint scale is scale of the DOG image
Robert Collins
CSE486, Penn State
                                Example (continued)
                                          gradient
                                          magnitude




                     gaussian image
                     (at closest scale,
                     from pyramid)        gradient
                                          orientation
Robert Collins
CSE486, Penn State
                     Example (continued)



                     .*                     =



         gradient         weighted by 2D        weighted gradient
         magnitude        gaussian kernel       magnitude
Robert Collins
CSE486, Penn State
                        Example (continued)
    weighted gradient
    magnitude                  weighted orientation histogram.
                               Each bucket contains sum of weighted gradient
                               magnitudes corresponding to angles that fall within
                               that bucket.




    gradient
    orientation


                              36 buckets
                              10 degree range of angles in each bucket, i.e.
                                  0 <=ang<10 : bucket 1
                                 10<=ang<20 : bucket 2
                                 20<=ang<30 : bucket 3 …
Robert Collins
CSE486, Penn State
                        Example (continued)
    weighted gradient
    magnitude
                              weighted orientation histogram.


                                  peak

                                                         80% of peak value




    gradient
    orientation



                               20-30 degrees

                                Orientation of keypoint
                                is approximately 25 degrees
Robert Collins
CSE486, Penn State
                           Example (continued)
            There may be multiple orientations.

                         peak
                                        Second peak
                                                             80% of peak value




             In this case, generate duplicate keypoints, one with
             orientation at 25 degrees, one at 155 degrees.

             Design decision: you may want to limit number of
             possible multiple peaks to two.
Robert Collins
CSE486, Penn State
                     Example of KeyPoint Detection




            Each keypoint has a center point (location),
            an orientation (rotation) and a radius (scale).
            At this point, we could try to correlate patches
            (after first normalizing to a canonical orientation
            and scale).
Robert Collins
CSE486, Penn State
                        SIFT Vector

     to make things more insensitive to changes in
     lighting or small changes in geometry, Lowe
     constructs feature vector from image gradients.
Robert Collins
CSE486, Penn State
                     SIFT Vector
Robert Collins
CSE486, Penn State
                     Sift Key Matching
Robert Collins
CSE486, Penn State
                     Model Verification
Robert Collins

                 Application: Object Recognition
CSE486, Penn State




                                    Compute SIFT keys
                                    of models and store
                                    in a database
Robert Collins

                 Application: Object Recognition
CSE486, Penn State




                                find them in an image
  database of models
Robert Collins

                 Application: Object Recognition
CSE486, Penn State




         For sets of 3 SIFT key matches, compute affine
         transformation and perform model verification.
Robert Collins

                 Application: Object Recognition
CSE486, Penn State




     Note: since these are local, parts-based descriptors,
     they perform well even when some parts are missing
     (i.e. under occlusion).
Robert Collins

               Application: Landmark Recognition
CSE486, Penn State
Robert Collins
CSE486, Penn State
                         Application: Generating Panoramas
Brown and Lowe, ICCV03
Robert Collins

               Application: Generating Panoramas
CSE486, Penn State




Brown and Lowe, ICCV03
Robert Collins
CSE486, Penn State
                           State of the Art
             Fully affine invariant local feature descriptors.
Robert Collins
CSE486, Penn State
                       State of the Art
     Affine invariant descriptors can handle larger changes
     in viewpoint.
Robert Collins
CSE486, Penn State
                     State of the Art
Robert Collins
CSE486, Penn State
                            For More Information
       SIFT keys
          David G. Lowe, "Distinctive image features from scale-invariant keypoints,"
          International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.

       Affine local-feature methods

Contenu connexe

En vedette

Machine Learning Experimentation at Sift Science
Machine Learning Experimentation at Sift ScienceMachine Learning Experimentation at Sift Science
Machine Learning Experimentation at Sift ScienceSift Science
 
Michal Erel's SIFT presentation
Michal Erel's SIFT presentationMichal Erel's SIFT presentation
Michal Erel's SIFT presentationwolf
 
Image feature extraction
Image feature extractionImage feature extraction
Image feature extractionRushin Shah
 
LinkedIn SlideShare: Knowledge, Well-Presented
LinkedIn SlideShare: Knowledge, Well-PresentedLinkedIn SlideShare: Knowledge, Well-Presented
LinkedIn SlideShare: Knowledge, Well-PresentedSlideShare
 

En vedette (6)

SIFT
SIFTSIFT
SIFT
 
Machine Learning Experimentation at Sift Science
Machine Learning Experimentation at Sift ScienceMachine Learning Experimentation at Sift Science
Machine Learning Experimentation at Sift Science
 
Feature Extraction
Feature ExtractionFeature Extraction
Feature Extraction
 
Michal Erel's SIFT presentation
Michal Erel's SIFT presentationMichal Erel's SIFT presentation
Michal Erel's SIFT presentation
 
Image feature extraction
Image feature extractionImage feature extraction
Image feature extraction
 
LinkedIn SlideShare: Knowledge, Well-Presented
LinkedIn SlideShare: Knowledge, Well-PresentedLinkedIn SlideShare: Knowledge, Well-Presented
LinkedIn SlideShare: Knowledge, Well-Presented
 

Similaire à SIFT Keys: Object Recognition Lecture

Lecture07
Lecture07Lecture07
Lecture07zukun
 
Lecture23
Lecture23Lecture23
Lecture23zukun
 
Lecture17
Lecture17Lecture17
Lecture17zukun
 
Lecture05
Lecture05Lecture05
Lecture05zukun
 
Lecture06
Lecture06Lecture06
Lecture06zukun
 
Lecture14
Lecture14Lecture14
Lecture14zukun
 
Lecture28
Lecture28Lecture28
Lecture28zukun
 
Lecture28
Lecture28Lecture28
Lecture28zukun
 
Lecture04
Lecture04Lecture04
Lecture04zukun
 
Lecture10
Lecture10Lecture10
Lecture10zukun
 
Lecture22
Lecture22Lecture22
Lecture22zukun
 
Lecture20
Lecture20Lecture20
Lecture20zukun
 
Lecture01
Lecture01Lecture01
Lecture01zukun
 
SPATIAL POINT PATTERNS
SPATIAL POINT PATTERNSSPATIAL POINT PATTERNS
SPATIAL POINT PATTERNSLiemNguyenDuy
 
Lecture19
Lecture19Lecture19
Lecture19zukun
 
Lecture21
Lecture21Lecture21
Lecture21zukun
 
Lecture21
Lecture21Lecture21
Lecture21zukun
 

Similaire à SIFT Keys: Object Recognition Lecture (18)

Lecture07
Lecture07Lecture07
Lecture07
 
Lecture23
Lecture23Lecture23
Lecture23
 
Lecture17
Lecture17Lecture17
Lecture17
 
Lecture05
Lecture05Lecture05
Lecture05
 
Lecture06
Lecture06Lecture06
Lecture06
 
Lecture14
Lecture14Lecture14
Lecture14
 
Lecture28
Lecture28Lecture28
Lecture28
 
Lecture28
Lecture28Lecture28
Lecture28
 
Lecture04
Lecture04Lecture04
Lecture04
 
Lecture10
Lecture10Lecture10
Lecture10
 
Lecture22
Lecture22Lecture22
Lecture22
 
Lecture20
Lecture20Lecture20
Lecture20
 
Lecture01
Lecture01Lecture01
Lecture01
 
Lec10 alignment
Lec10 alignmentLec10 alignment
Lec10 alignment
 
SPATIAL POINT PATTERNS
SPATIAL POINT PATTERNSSPATIAL POINT PATTERNS
SPATIAL POINT PATTERNS
 
Lecture19
Lecture19Lecture19
Lecture19
 
Lecture21
Lecture21Lecture21
Lecture21
 
Lecture21
Lecture21Lecture21
Lecture21
 

Plus de zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statisticszukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibrationzukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionzukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities zukun
 

Plus de zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

SIFT Keys: Object Recognition Lecture

  • 1. Robert Collins CSE486, Penn State Lecture 31: Object Recognition: SIFT Keys
  • 2. Robert Collins CSE486, Penn State Motivation • Want to recognize a known objects from unknown viewpoints. find them in an image database of models
  • 3. Robert Collins Local Feature based Approaches CSE486, Penn State • Represent appearance of object by little intensity/feature patches. • Try to match patches from object to image • Geometrically consistent matches tell you the location and pose of the object
  • 4. Robert Collins CSE486, Penn State Simple Example • Represent object by set of 11x11 intensity templates extracted around Harris corners. harris corners our object “model”
  • 5. Robert Collins CSE486, Penn State Simple Example • Match patches to new image using NCC.
  • 6. Robert Collins CSE486, Penn State Simple Example • Find matches consistent with affine transformation using RANSAC
  • 7. Robert Collins CSE486, Penn State Simple Example • Inlier matches let you solve for location and pose of object in the image.
  • 8. Robert Collins CSE486, Penn State Problem with Simple Example Using NCC to match intensity patches puts restrictions on the amount of overall rotation and scaling allowed between the model and the image appearance. model template ncc ncc ncc matches well no match no match
  • 9. Robert Collins CSE486, Penn State More General : SIFT Keys David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
  • 10. Robert Collins CSE486, Penn State SIFT Keys: General Idea • Reliably extract same image points regardless of new magnification and rotation of the image. • Normalize image patches, extract feature vector • Match feature vectors using correlation
  • 11. Robert Collins CSE486, Penn State SIFT Keys: General Idea Want to detect/match same features regardless of Translation : easy, almost every feature extraction and correlation matching algorithm in vision is translation invariant Rotation : harder. Guess a canonical orientation for each patch from local gradients Scaling : hardest of all. Create a multi-scale representation of the image and appeal to scale space theory to determine correct scale at each point.
  • 12. Robert Collins CSE486, Penn State Recall: Scale Space Basic idea: different scales are appropriate for describing different objects in the image, and we may not know the correct scale/size ahead of time.
  • 13. Robert Collins Scale Selection CSE486, Penn State “Laplacian” operator.
  • 14. Robert Collins CSE486, Penn State Local Scale Space Maxima Lindeberg proposes that the natural scale for describing a feature is the scale at which a normalized derivative for detecting that feature achieves a local maximum both spatially and in scale. DnormL is the DoG operator, in this case. Scale Example for blob detection
  • 15. Robert Collins CSE486, Penn State Recall: LoG Blob Finding LoG filter extrema locates “blobs” maxima = dark blobs on light background minima = light blobs on dark background Scale of blob (size ; radius in pixels) is determined by the sigma parameter of the LoG filter. LoG sigma = 2 LoG sigma = 10
  • 16. Robert Collins CSE486, Penn State Extrema in Space and Scale DOG level L-1 Scale (sigma) DOG level L DOG level L+1 Space Hint: when finding maxima or minima at level L, use DownSample or UpSample as necessary to make DOG images at level L-1 and L+1 the same size as L.
  • 17. Robert Collins CSE486, Penn State SIFT Keys: General Idea Want to detect/match same features regardless of Translation : easy, almost every feature extraction and correlation matching algorithm in vision is translation invariant Rotation : harder. Guess a canonical orientation for each patch from local gradients Scaling : hardest of all. Create a multi-scale representation of the image and appeal to scale space theory to determine correct scale at each point.
  • 18. Robert Collins CSE486, Penn State Sift Key Steps
  • 19. Robert Collins CSE486, Penn State Example •Keypoint location = extrema location •Keypoint scale is scale of the DOG image
  • 20. Robert Collins CSE486, Penn State Example (continued) gradient magnitude gaussian image (at closest scale, from pyramid) gradient orientation
  • 21. Robert Collins CSE486, Penn State Example (continued) .* = gradient weighted by 2D weighted gradient magnitude gaussian kernel magnitude
  • 22. Robert Collins CSE486, Penn State Example (continued) weighted gradient magnitude weighted orientation histogram. Each bucket contains sum of weighted gradient magnitudes corresponding to angles that fall within that bucket. gradient orientation 36 buckets 10 degree range of angles in each bucket, i.e. 0 <=ang<10 : bucket 1 10<=ang<20 : bucket 2 20<=ang<30 : bucket 3 …
  • 23. Robert Collins CSE486, Penn State Example (continued) weighted gradient magnitude weighted orientation histogram. peak 80% of peak value gradient orientation 20-30 degrees Orientation of keypoint is approximately 25 degrees
  • 24. Robert Collins CSE486, Penn State Example (continued) There may be multiple orientations. peak Second peak 80% of peak value In this case, generate duplicate keypoints, one with orientation at 25 degrees, one at 155 degrees. Design decision: you may want to limit number of possible multiple peaks to two.
  • 25. Robert Collins CSE486, Penn State Example of KeyPoint Detection Each keypoint has a center point (location), an orientation (rotation) and a radius (scale). At this point, we could try to correlate patches (after first normalizing to a canonical orientation and scale).
  • 26. Robert Collins CSE486, Penn State SIFT Vector to make things more insensitive to changes in lighting or small changes in geometry, Lowe constructs feature vector from image gradients.
  • 27. Robert Collins CSE486, Penn State SIFT Vector
  • 28. Robert Collins CSE486, Penn State Sift Key Matching
  • 29. Robert Collins CSE486, Penn State Model Verification
  • 30. Robert Collins Application: Object Recognition CSE486, Penn State Compute SIFT keys of models and store in a database
  • 31. Robert Collins Application: Object Recognition CSE486, Penn State find them in an image database of models
  • 32. Robert Collins Application: Object Recognition CSE486, Penn State For sets of 3 SIFT key matches, compute affine transformation and perform model verification.
  • 33. Robert Collins Application: Object Recognition CSE486, Penn State Note: since these are local, parts-based descriptors, they perform well even when some parts are missing (i.e. under occlusion).
  • 34. Robert Collins Application: Landmark Recognition CSE486, Penn State
  • 35. Robert Collins CSE486, Penn State Application: Generating Panoramas Brown and Lowe, ICCV03
  • 36. Robert Collins Application: Generating Panoramas CSE486, Penn State Brown and Lowe, ICCV03
  • 37. Robert Collins CSE486, Penn State State of the Art Fully affine invariant local feature descriptors.
  • 38. Robert Collins CSE486, Penn State State of the Art Affine invariant descriptors can handle larger changes in viewpoint.
  • 39. Robert Collins CSE486, Penn State State of the Art
  • 40. Robert Collins CSE486, Penn State For More Information SIFT keys David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110. Affine local-feature methods