Compositional models
                         Pedro Felzenszwalb
                          Brown University




Tuesday, August 23, 11
Deformable models
        • Can take us a long way...

        • But not all the way




Structure variation
        • Objects in rich categories have variable structure




        • These are NOT deformations

        • There is always something you never saw before

        • Mixture of deformable models? too many combined choices

        • Bag of words? not enough structure

        • Non-parametric? doesn’t generalize

        • Compositional model
Object detection grammars

        • Pictorial structure model with variable structure



        • Stochastic context-free grammar

             - Generates tree-structured model

             - Springs connect symbols along derivation tree

             - Appearance model associated with each terminal




             - person -> face, trunk, arms, lower-part

             - face -> hat, eyes, nose, mouth

             - face -> eyes, nose, mouth

             - hat -> baseball-cap

             - hat -> sombrero

             - lower-part -> shoe, shoe, legs

             - lower-part -> bare-foot, bare-foot, legs

             - legs -> pants

             - legs -> skirt
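
The productions above can be explored with a short Python sketch. It is illustrative only: symbols with no production on the slide (trunk, arms, eyes and so on) are assumed to be terminals, rules are assumed to be picked uniformly at random, and the names `GRAMMAR`, `sample` and `enumerate_all` are hypothetical. Springs and appearance models are left out.

```python
import random
from itertools import product

# Productions copied from the slide; symbols without a rule are treated as terminals.
GRAMMAR = {
    "person": [["face", "trunk", "arms", "lower-part"]],
    "face": [["hat", "eyes", "nose", "mouth"], ["eyes", "nose", "mouth"]],
    "hat": [["baseball-cap"], ["sombrero"]],
    "lower-part": [["shoe", "shoe", "legs"], ["bare-foot", "bare-foot", "legs"]],
    "legs": [["pants"], ["skirt"]],
}

def sample(symbol, rng):
    """Expand `symbol` into a list of terminals, picking rules uniformly (an assumption)."""
    if symbol not in GRAMMAR:
        return [symbol]
    rule = rng.choice(GRAMMAR[symbol])
    return [t for child in rule for t in sample(child, rng)]

def enumerate_all(symbol):
    """Return every terminal sequence derivable from `symbol`."""
    if symbol not in GRAMMAR:
        return [[symbol]]
    results = []
    for rule in GRAMMAR[symbol]:
        child_options = [enumerate_all(child) for child in rule]
        for combo in product(*child_options):
            results.append([t for part in combo for t in part])
    return results

all_people = enumerate_all("person")
n_rules = sum(len(rules) for rules in GRAMMAR.values())
print(n_rules, len(all_people))          # 9 productions, 12 distinct structures
print(sample("person", random.Random(0)))
```

Nine productions already yield twelve distinct person structures. A mixture of deformable models would need one component per structure, which is the "too many combined choices" problem noted earlier.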

Person detection grammar

        • Instantiation includes a variable number of parts

             - 1,...,k and occluder if k < 6

        • Parts can translate relative to each other

        • Parts have subtypes

        • Parts have deformable sub-parts (not shown)

        • Beats all other methods on PASCAL 2010 (49.5 AP)

        [Figure 1 panels: Parts 1-6 and Occluder, each in Subtype 1 and Subtype 2; example detections and derived filters for Parts 1-6 (no occlusion), Parts 1-4 & occluder, Parts 1-2 & occluder]

        Figure 1: Shallow grammar model. This figure illustrates a shallow version of our grammar model (Section 2.1). This model has six person parts and an occlusion model (“occluder”), each of which comes in one of two subtypes. A detection places one subtype of each visible part at a location and scale in the image. If the derivation does not place all parts it must place the occluder. Parts are allowed to move relative to each other but are constrained by deformation penalties.

        We consider models with productions specified by two kinds of schemas (a schema is a template for generating productions). A structure schema specifies one production for each placement $\omega \in \Omega$,

        $$X(\omega) \xrightarrow{s} \{\, Y_1(\omega \oplus \delta_1), \ldots, Y_n(\omega \oplus \delta_n) \,\}. \tag{3}$$

        Here the $\delta_i$ specify constant displacements within the feature map pyramid. Structure schemas can be used to define decompositions of objects into other objects.

        Let $\Delta$ be the set of possible displacements within a single scale of a feature map pyramid. A deformation schema specifies one production for each placement $\omega \in \Omega$ and displacement $\delta \in \Delta$,

        $$X(\omega) \xrightarrow{\alpha \cdot \phi(\delta)} \{\, Y(\omega \oplus \delta) \,\}. \tag{4}$$
Building the model

        • Type in any non-recursive grammar

        • Train parameters from bounding box annotations

             - Production costs

             - Deformation models

             - HOG filters for terminals

        Part subtypes. The mixture model from [12] has two subtypes for each mixture component. The subtypes are forced to be mirror images of each other and correspond roughly to left-facing people and right-facing people. Our grammar model has two subtypes for each part, which are also forced to be mirror images of each other. But in the case of our grammar model, the decision of which part subtype to instantiate at detection time is independent for each part.

        The shallow person grammar model is defined by the following grammar. The indices p (for part), t (for subtype), and k have the following ranges: $p \in \{1, \ldots, 6\}$, $t \in \{L, R\}$ and $k \in \{1, \ldots, 5\}$.

        $$Q(\omega) \xrightarrow{s_k} \{\, Y_1(\omega \oplus \delta_1), \ldots, Y_k(\omega \oplus \delta_k),\; O(\omega \oplus \delta_{k+1}) \,\}$$
        $$Q(\omega) \xrightarrow{s_6} \{\, Y_1(\omega \oplus \delta_1), \ldots, Y_6(\omega \oplus \delta_6) \,\}$$
        $$Y_p(\omega) \xrightarrow{0} \{\, Y_{p,t}(\omega) \,\} \qquad Y_{p,t}(\omega) \xrightarrow{\alpha_{p,t} \cdot \phi(\delta)} \{\, A_{p,t}(\omega \oplus \delta) \,\}$$
        $$O(\omega) \xrightarrow{0} \{\, O_t(\omega) \,\} \qquad O_t(\omega) \xrightarrow{\alpha_t \cdot \phi(\delta)} \{\, A_t(\omega \oplus \delta) \,\}$$

        The grammar has a start symbol Q with six alternate choices that derive people under varying degrees of visibility (occlusion). Each part has a corresponding nonterminal $Y_p$ that is placed at some ideal position relative to Q. Derivations with occlusion include the occlusion symbol O. A derivation selects a subtype and displacement for each visible part. The parameters of the grammar (production scores, deformation parameters and filters) are learned with the discriminative procedure described in Section 4. Figure 1 illustrates the filters in the resulting model and some example detections.

        Deeper model. We extend the shallow model by adding deformable subparts at two scales: (1) the same as, and (2) twice the resolution of the start symbol Q. When detecting large objects, high-resolution subparts capture fine image details. However, when detecting small objects, high-resolution subparts cannot be used because they “fall off the bottom” of the feature map pyramid. The model uses derivations with low-resolution subparts when detecting small objects.

        We begin by replacing the productions from $Y_{p,t}$ in the grammar above, and then adding new productions. Recall that p indexes the top-level parts and t indexes subtypes. In the following schemas, the indices r (for resolution) and u (for subpart) have the ranges: $r \in \{H, L\}$, $u \in \{1, \ldots, N_p\}$, where $N_p$ is the number of subparts in a top-level part $Y_p$.

        $$Y_{p,t}(\omega) \xrightarrow{\alpha_{p,t} \cdot \phi(\delta)} \{\, Z_{p,t}(\omega \oplus \delta) \,\}$$
        $$Z_{p,t}(\omega) \xrightarrow{0} \{\, A_{p,t}(\omega),\; W_{p,t,r,1}(\omega \oplus \delta_{p,t,r,1}), \ldots, W_{p,t,r,N_p}(\omega \oplus \delta_{p,t,r,N_p}) \,\}$$
        $$W_{p,t,r,u}(\omega) \xrightarrow{\alpha_{p,t,r,u} \cdot \phi(\delta)} \{\, A_{p,t,r,u}(\omega \oplus \delta) \,\}$$

        We note that as in [22] our model has hierarchical deformations. The part terminal $A_{p,t}$ can move relative to Q and the subpart terminal $A_{p,t,r,u}$ can move relative to $A_{p,t}$.

        The displacements $\delta_{p,t,H,u}$ place the symbols $W_{p,t,H,u}$ one octave below $Z_{p,t}$ in the feature map pyramid. The displacements $\delta_{p,t,L,u}$ place the symbols $W_{p,t,L,u}$ at the same scale as $Z_{p,t}$. We add subparts to the first two top-level parts (p = 1 and 2), with the number of subparts set to $N_1 = 3$ and $N_2 = 2$. We find that adding additional subparts does not improve detection performance.

        2.2 Inference and test time detection

        Inference involves finding high scoring derivations. At test time, because images may contain multiple instances of an object class, we compute the maximum scoring derivation rooted at $Q(\omega)$, for each $\omega \in \Omega$. This can be done efficiently using a standard dynamic programming algorithm [11].
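
The dynamic program over derivations can be sketched on a toy problem. This is a minimal illustration under strong assumptions, not the procedure of [11]: placements are positions on a 1-D line at a single scale, the deformation score is assumed to be a quadratic penalty `-alpha * d * d` standing in for the general deformation term, and every name (`deform`, `best_subtype`, `start_scores`, `anchors`, `rule_scores`) is hypothetical.

```python
NEG_INF = float("-inf")

def deform(resp, alpha, max_shift=1):
    """Deformation rule: at each x, max over shifts d of resp[x + d] - alpha * d^2."""
    n = len(resp)
    return [max(resp[x + d] - alpha * d * d
                for d in range(-max_shift, max_shift + 1) if 0 <= x + d < n)
            for x in range(n)]

def best_subtype(responses, alpha):
    """Part symbol: best subtype (after deformation) at each position."""
    per_subtype = [deform(r, alpha) for r in responses]
    return [max(vals) for vals in zip(*per_subtype)]

def at(scores, x):
    """Score at x, or -inf when the placement falls outside the feature map."""
    return scores[x] if 0 <= x < len(scores) else NEG_INF

def start_scores(parts, occluder, anchors, rule_scores, alpha=1.0):
    """Best derivation score rooted at each position x of the start symbol.

    parts[p]       : per-subtype filter responses of part p
    anchors[p]     : ideal offset of part p (reused for the occluder that replaces it)
    rule_scores[k] : score of the rule placing the first k parts; k < len(parts)
                     means the remaining parts are replaced by the occluder
    """
    n = len(occluder[0])
    Y = [best_subtype(p, alpha) for p in parts]
    O = best_subtype(occluder, alpha)
    out = []
    for x in range(n):
        best = NEG_INF
        for k, s in rule_scores.items():
            total = s + sum(at(Y[p], x + anchors[p]) for p in range(k))
            if k < len(parts):
                total += at(O, x + anchors[k])
            best = max(best, total)
        out.append(best)
    return out

# Two parts with two subtypes each, one occluder subtype, five positions.
parts = [[[0, 0, 3, 0, 0], [0, 0, 0, 0, 0]],
         [[0, 0, 0, 0, 0], [0, 0, 0, 0, 4]]]
occluder = [[1, 1, 1, 1, 1]]
Q = start_scores(parts, occluder, anchors=[0, 1], rule_scores={1: -0.5, 2: 0.0})
print(Q)  # [0.5, 2.5, 6.0, 6.0, -inf]
```

Each symbol's score table is computed once and reused by every rule that mentions it, which is what makes scoring all root placements cheap. The subtype of each part is chosen independently inside `best_subtype`, mirroring the grammar's independent subtype decisions.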
Salient contours

        • Curve(a,b) + Curve(b,c) --> Curve(a,c)

        Excerpts from Felzenszwalb & McAllester:

        Figure 15: Running time of different search algorithms as a function of the problem size. Each sample point indicates the average running time taken over 200 random inputs. In each case N = 20 and [...] = 100. See text for discussion.

        Figure 16 (labels a, b, c, with the angle t at b): A curve with endpoints (a, c) is formed by composing curves with endpoints (a, b) and (b, c). We assume that $t \ge \pi/2$. The cost of the composition is proportional to $\sin^2(t)$. This cost is scale invariant and encourages curves to be relatively straight.

        ... assume that these short curves are straight, and their weight depends only on the image data along the line segment from a to b. We use a data term, seg(a, b), that is zero if the image gradient along pixels in ab is perpendicular to ab, and higher otherwise.

        Figure 17 gives a formal definition of the two rules in our model. The constants k1 and k2 specify the minimum and maximum length of the base case curves, while L is a constant ...

        Figure 20: An example where the most salient curve goes over locations with essentially no local evidence for the curve at those locations.
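
The composition cost from Figure 16 can be computed directly from the three endpoints. The sketch below assumes t is the angle at b between the directions b->a and b->c, and it uses a hypothetical proportionality constant `weight`. The function names are illustrative, not taken from the paper.

```python
def composition_cost(a, b, c, weight=1.0):
    """Cost of Curve(a,b) + Curve(b,c) --> Curve(a,c): weight * sin^2(t).

    Uses sin^2(t) = cross(u, v)^2 / (|u|^2 |v|^2) with u = a - b and v = c - b.
    """
    ux, uy = a[0] - b[0], a[1] - b[1]
    vx, vy = c[0] - b[0], c[1] - b[1]
    norm_sq = (ux * ux + uy * uy) * (vx * vx + vy * vy)
    if norm_sq == 0:
        raise ValueError("b must differ from both a and c")
    cross = ux * vy - uy * vx
    return weight * cross * cross / norm_sq

def angle_allowed(a, b, c):
    """True when t >= pi/2, i.e. the dot product of (a - b) and (c - b) is <= 0."""
    dot = (a[0] - b[0]) * (c[0] - b[0]) + (a[1] - b[1]) * (c[1] - b[1])
    return dot <= 0

print(composition_cost((0, 0), (1, 0), (2, 0)))     # 0.0, straight continuation
print(composition_cost((0, 0), (1, 0), (2, 1)))     # 0.5, a 45 degree turn
print(composition_cost((0, 0), (10, 0), (20, 10)))  # 0.5, same shape at 10x scale
```

Because the cost depends only on the angle, scaling all three points leaves it unchanged, which is the scale invariance the caption mentions.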
Shapes / Regions
                             Random shapes

     Samples from stochastic context-free shape grammar

                           Example results




                          “Matching” to images
                          (samples from posterior)
Processing pipeline

          [Diagram of processing stages: Pixels, Edges, Contours, Regions, Objects]

        • Vision systems have multiple processing stages

        • Compositional model: each stage builds structures by grouping
          structures from previous stages
             - Single parsing problem
             - Avoids intermediate decisions
               (high-level information influences low-level interpretations)
Computation
      • Context-free or Context-sensitive?

      • Even context-free models lead to hard parsing problems

           - Too many constituents!

           - A string of length n has O(n^2) substrings (example string: GETIKDSWOWZQE)

           - An image with n pixels has O(2^n) regions
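
A quick count makes the gap between the two cases concrete. The sketch below counts contiguous spans of the example string by position, and it uses the number of non-empty pixel subsets as an upper bound on the number of regions (an assumption, since the slide does not define regions). The function names are hypothetical.

```python
def substrings(s):
    """All contiguous non-empty spans (i, j) of s, identified by position."""
    return [(i, j) for i in range(len(s)) for j in range(i + 1, len(s) + 1)]

def count_pixel_subsets(n):
    """Non-empty subsets of n pixels: an upper bound on the number of regions."""
    return 2 ** n - 1

example = "GETIKDSWOWZQE"
spans = substrings(example)
print(len(example), len(spans))   # 13 91, i.e. n(n + 1) / 2
print(count_pixel_subsets(16))    # 65535 for a tiny 4 x 4 image
```

A chart parser can afford to score all 91 spans of a 13-character string, while even a 4 x 4 image already has tens of thousands of candidate constituents.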




Alternative parsing problems
    1. Whole image parsing

          - Explains every pixel exactly once

          - Hard

    2. Find light derivations within an image

          - Expansion of start symbol into terminals

          - Explains part of the image

          - May explain the same pixel more than once

          - Efficient

          [Figure: example parse tree with nodes room; wall, floor; shelves, pictures, chest; book, book, ..., book]




Computation
        • Bottom-up

             - Repeatedly group structures (KLD / A*LD)

        • Top-down

             - Repeatedly refine with backtracking (AO*)

        • Bottom-up + Top-down

             - Bottom-up computation guided by top-down influence

             - Coarse derivations provide heuristic guidance
               for finding finer structures (HA*LD)
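
KLD stands for Knuth's lightest derivation algorithm, which generalizes Dijkstra's shortest paths from edges to rules with several antecedents. A minimal sketch follows, assuming non-negative rule weights and a derivation cost equal to the rule weight plus the sum of the antecedent costs. The rule encoding and the name `lightest_derivations` are hypothetical.

```python
import heapq

def lightest_derivations(rules):
    """Bottom-up lightest-derivation search.

    rules: list of (conclusion, antecedents, weight), read as
           antecedents --> conclusion at cost weight + sum(antecedent costs).
    Returns a dict mapping each derivable statement to its lightest cost.
    """
    best = {}      # finalized statements and their costs
    uses = {}      # statement -> indices of rules that need it
    missing = []   # per rule: distinct antecedents not yet finalized
    queue = []
    for i, (head, body, w) in enumerate(rules):
        distinct = set(body)
        missing.append(len(distinct))
        for b in distinct:
            uses.setdefault(b, []).append(i)
        if not distinct:                      # axiom: no antecedents
            heapq.heappush(queue, (w, head))
    while queue:
        cost, stmt = heapq.heappop(queue)
        if stmt in best:                      # already finalized at a lower cost
            continue
        best[stmt] = cost
        for i in uses.get(stmt, []):
            missing[i] -= 1
            if missing[i] == 0:               # all antecedents known: rule can fire
                head, body, w = rules[i]
                heapq.heappush(queue, (w + sum(best[b] for b in body), head))
    return best

rules = [
    ("Curve(a,b)", [], 1.0),
    ("Curve(b,c)", [], 2.0),
    ("Curve(a,c)", [], 5.0),                            # direct but expensive
    ("Curve(a,c)", ["Curve(a,b)", "Curve(b,c)"], 0.5),  # composition rule
]
best = lightest_derivations(rules)
print(best["Curve(a,c)"])  # 3.5, lighter than the direct 5.0
```

Statements leave the queue in order of increasing cost, so each one is final the first time it is popped. A*LD and HA*LD keep this loop but reorder the queue with heuristic estimates, which this sketch does not attempt.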

Coarse-to-fine
        • Model abstraction f : S_i --> S_{i+1}

             - lower resolution

             - coarsen labels
               horse --> animal --> piecewise smooth object
        • Coarse computation guides finer computation

          [Figure 8 (Felzenszwalb & McAllester): A vision system with several levels of processing. Forward arrows represent the ... Diagram labels: levels 0, 1, ..., m-1, each with stages Edges, Contours, Recognition.]
Challenges
        • Whole image parsing (with context-free grammars)

             - Restrict possible constituents

             - LP relaxation

             - DDMCMC

        • Learn object grammars from weakly labeled data

             - PASCAL VOC

        • Build a complete processing pipeline unifying
          segmentation and recognition

Tuesday, August 23, 11

Contenu connexe

Similaire à Fcv rep felzenswalb

Fear and loathing with APL (oredev)
Fear and loathing with APL (oredev)Fear and loathing with APL (oredev)
Fear and loathing with APL (oredev)Yan Cui
 
Automatic Type Class Derivation with Shapeless
Automatic Type Class Derivation with ShapelessAutomatic Type Class Derivation with Shapeless
Automatic Type Class Derivation with Shapelessjcazevedo
 
The Logical Burrito - pattern matching, term rewriting and unification
The Logical Burrito - pattern matching, term rewriting and unificationThe Logical Burrito - pattern matching, term rewriting and unification
The Logical Burrito - pattern matching, term rewriting and unificationNorman Richards
 
A03401001005
A03401001005A03401001005
A03401001005theijes
 
Generalization of Compositons of Cellular Automata on Groups
Generalization of Compositons of Cellular Automata on GroupsGeneralization of Compositons of Cellular Automata on Groups
Generalization of Compositons of Cellular Automata on GroupsYoshihiro Mizoguchi
 
Character Tables in Chemistry
Character Tables in ChemistryCharacter Tables in Chemistry
Character Tables in ChemistryChris Sonntag
 
Category Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) picturesCategory Theory made easy with (ugly) pictures
Category Theory made easy with (ugly) picturesAshwin Rao
 
Comparative study of results obtained by analysis of structures using ANSYS, ...
Comparative study of results obtained by analysis of structures using ANSYS, ...Comparative study of results obtained by analysis of structures using ANSYS, ...
Comparative study of results obtained by analysis of structures using ANSYS, ...IOSR Journals
 
Building Composable Abstractions
Building Composable AbstractionsBuilding Composable Abstractions
Building Composable AbstractionsEric Normand
 
Ten-page Brief Overview of Swift for Scala Developers
Ten-page Brief Overview of Swift for Scala DevelopersTen-page Brief Overview of Swift for Scala Developers
Ten-page Brief Overview of Swift for Scala Developersihji
 

Similaire à Fcv rep felzenswalb (20)

Fear and loathing with APL (oredev)
Fcv rep felzenswalb

  • 1. Compositional models Pedro Felzenszwalb Brown University Tuesday, August 23, 11
  • 2. Deformable models • Can take us a long way... • But not all the way
  • 3. Structure variation • Objects in rich categories have variable structure • These are NOT deformations • There is always something you never saw before • Mixture of deformable models? too many combined choices • Bag of words? not enough structure • Non-parametric? doesn’t generalize
  • 4. Structure variation • Objects in rich categories have variable structure • These are NOT deformations • There is always something you never saw before • Mixture of deformable models? too many combined choices • Bag of words? not enough structure • Non-parametric? doesn’t generalize • Compositional model
  • 5. Object detection grammars • Pictorial structure model with variable structure • Stochastic context-free grammar - Generates tree-structured model - Springs connect symbols along derivation tree - Appearance model associated with each terminal
  • 6. - person -> face, trunk, arms, lower-part - face -> hat, eyes, nose, mouth - face -> eyes, nose, mouth - hat -> baseball-cap - hat -> sombrero - lower-part -> shoe, shoe, legs - lower-part -> bare-foot, bare-foot, legs - legs -> pants - legs -> skirt
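The production rules on slide 6 can be written down directly as data. The following is a minimal sketch, not the speaker's implementation: it assumes every alternative for a symbol is equally likely (the slides give no probabilities), and the names `GRAMMAR`, `derive`, `leaves` and `count_structures` are hypothetical helpers introduced only for illustration.

```python
import random

# Productions copied from slide 6; symbols with no entry are terminals.
GRAMMAR = {
    "person": [["face", "trunk", "arms", "lower-part"]],
    "face": [["hat", "eyes", "nose", "mouth"], ["eyes", "nose", "mouth"]],
    "hat": [["baseball-cap"], ["sombrero"]],
    "lower-part": [["shoe", "shoe", "legs"], ["bare-foot", "bare-foot", "legs"]],
    "legs": [["pants"], ["skirt"]],
}


def derive(symbol, rng):
    """Sample a derivation tree; uniform choice over alternatives is an assumption."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: returned as a plain string
    rhs = rng.choice(GRAMMAR[symbol])
    return (symbol, [derive(child, rng) for child in rhs])


def leaves(tree):
    """Terminals of a derivation tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree[1] for leaf in leaves(child)]


def count_structures(symbol):
    """Number of distinct derivation trees rooted at `symbol`."""
    if symbol not in GRAMMAR:
        return 1
    total = 0
    for rhs in GRAMMAR[symbol]:
        product = 1
        for child in rhs:
            product *= count_structures(child)
        total += product
    return total


if __name__ == "__main__":
    print(leaves(derive("person", random.Random(0))))
    print(count_structures("person"))
```

Nine rules already generate twelve distinct person structures, which illustrates why an explicit mixture with one component per structure scales poorly compared with a grammar.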
  • 7. Person detection grammar • Instantiation includes a variable number of parts - 1,...,k and occluder if k < 6 • Parts can translate relative to each other • Parts have subtypes • Parts have deformable sub-parts (not shown) • Beats all other methods on PASCAL 2010 (49.5 AP)

    [Figure panels: Subtype 1 and Subtype 2 for Parts 1-6 and the occluder; example detections and derived filters for Parts 1-6 (no occlusion), Parts 1-4 & occluder, and Parts 1-2 & occluder.]

    Figure 1: Shallow grammar model. This figure illustrates a shallow version of our grammar model (Section 2.1). This model has six person parts and an occlusion model (“occluder”), each of which comes in one of two subtypes. A detection places one subtype of each visible part at a location and scale in the image. If the derivation does not place all parts it must place the occluder. Parts are allowed to move relative to each other but are constrained by deformation penalties.

    We consider models with productions specified by two kinds of schemas (a schema is a template for generating productions). A structure schema specifies one production for each placement $\omega \in \Omega$,

    $$X(\omega) \xrightarrow{\;s\;} \{\, Y_1(\omega \oplus \delta_1), \ldots, Y_n(\omega \oplus \delta_n) \,\}. \quad (3)$$

    Here the $\delta_i$ specify constant displacements within the feature map pyramid. Structure schemas can be used to define decompositions of objects into other objects.

    Let $\Delta$ be the set of possible displacements within a single scale of a feature map pyramid. A deformation schema specifies one production for each placement $\omega \in \Omega$ and displacement $\delta \in \Delta$,

    $$X(\omega) \xrightarrow{\;\alpha \cdot \phi(\delta)\;} \{\, Y(\omega \oplus \delta) \,\}. \quad (4)$$

  • 8. Building the model • Type in any non-recursive grammar • Train parameters from bounding box annotations - Production costs - Deformation models - HOG filters for terminals

    The shallow person grammar model is defined by the following grammar. The indices p (for part), t (for subtype), and k have the following ranges: $p \in \{1, \ldots, 6\}$, $t \in \{L, R\}$ and $k \in \{1, \ldots, 5\}$.

    $$Q(\omega) \xrightarrow{\;s_k\;} \{\, Y_1(\omega \oplus \delta_1), \ldots, Y_k(\omega \oplus \delta_k),\; O(\omega \oplus \delta_{k+1}) \,\}$$
    $$Q(\omega) \xrightarrow{\;s_6\;} \{\, Y_1(\omega \oplus \delta_1), \ldots, Y_6(\omega \oplus \delta_6) \,\}$$
    $$Y_p(\omega) \xrightarrow{\;0\;} \{\, Y_{p,t}(\omega) \,\}$$
    $$Y_{p,t}(\omega) \xrightarrow{\;\alpha_{p,t} \cdot \phi(\delta)\;} \{\, A_{p,t}(\omega \oplus \delta) \,\}$$
    $$O(\omega) \xrightarrow{\;0\;} \{\, O_t(\omega) \,\}$$
    $$O_t(\omega) \xrightarrow{\;\alpha_t \cdot \phi(\delta)\;} \{\, A_t(\omega \oplus \delta) \,\}$$

    The grammar has a start symbol Q with six alternate choices that derive people under varying degrees of visibility (occlusion). Each part has a corresponding nonterminal $Y_p$ that is placed at some ideal position relative to Q. Derivations with occlusion include the occlusion symbol O. A derivation selects a subtype and displacement for each visible part. The parameters of the grammar (production scores, deformation parameters and filters) are learned with the discriminative procedure described in Section 4. Figure 1 illustrates the filters in the resulting model and some example detections.

    Part subtypes. The mixture model from [12] has two subtypes for each mixture component. The subtypes are forced to be mirror images of each other and correspond roughly to left-facing people and right-facing people. Our grammar model has two subtypes for each part, which are also forced to be mirror images of each other. But in the case of our grammar model, the decision of which part subtype to instantiate at detection time is independent for each part.

    Deeper model. We extend the shallow model by adding deformable subparts at two scales: (1) the same as, and (2) twice the resolution of the start symbol Q. When detecting large objects, high-resolution subparts capture fine image details. However, when detecting small objects, high-resolution subparts cannot be used because they “fall off the bottom” of the feature map pyramid. The model uses derivations with low-resolution subparts when detecting small objects.

    We begin by replacing the productions from $Y_{p,t}$ in the grammar above, and then adding new productions. Recall that p indexes the top-level parts and t indexes subtypes. In the following schemas, the indices r (for resolution) and u (for subpart) have the ranges: $r \in \{H, L\}$, $u \in \{1, \ldots, N_p\}$, where $N_p$ is the number of subparts in a top-level part $Y_p$.

    $$Y_{p,t}(\omega) \xrightarrow{\;\alpha_{p,t} \cdot \phi(\delta)\;} \{\, Z_{p,t}(\omega \oplus \delta) \,\}$$
    $$Z_{p,t}(\omega) \xrightarrow{\;0\;} \{\, A_{p,t}(\omega),\; W_{p,t,r,1}(\omega \oplus \delta_{p,t,r,1}), \ldots, W_{p,t,r,N_p}(\omega \oplus \delta_{p,t,r,N_p}) \,\}$$
    $$W_{p,t,r,u}(\omega) \xrightarrow{\;\alpha_{p,t,r,u} \cdot \phi(\delta)\;} \{\, A_{p,t,r,u}(\omega \oplus \delta) \,\}$$

    We note that as in [22] our model has hierarchical deformations. The part terminal $A_{p,t}$ can move relative to Q and the subpart terminal $A_{p,t,r,u}$ can move relative to $A_{p,t}$.

    The displacements $\delta_{p,t,H,u}$ place the symbols $W_{p,t,H,u}$ one octave below $Z_{p,t}$ in the feature map pyramid. The displacements $\delta_{p,t,L,u}$ place the symbols $W_{p,t,L,u}$ at the same scale as $Z_{p,t}$. We add subparts to the first two top-level parts (p = 1 and 2), with the number of subparts set to $N_1 = 3$ and $N_2 = 2$. We find that adding additional subparts does not improve detection performance.

    2.2 Inference and test time detection. Inference involves finding high scoring derivations. At test time, because images may contain multiple instances of an object class, we compute the maximum scoring derivation rooted at $Q(\omega)$, for each $\omega \in \Omega$. This can be done efficiently using a standard dynamic programming algorithm [11].
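The dynamic programming step mentioned at the end of slide 8 (the maximum scoring derivation rooted at the start symbol Q, for every placement) can be illustrated on a toy problem. This is a sketch under strong assumptions, not the model from the talk: placements are positions on a 1-D line rather than a feature map pyramid, the deformation score is a plain quadratic penalty, and all scores, rule tables and names (`TERMINAL_SCORES`, `STRUCTURE`, `DEFORM`, `best`, `detect`) are made up for illustration.

```python
from functools import lru_cache

N = 5                      # number of placements on a toy 1-D "feature map"
MAX_SHIFT = 1              # largest deformation displacement considered
NEG_INF = float("-inf")

# Filter responses of two terminals at each placement (made-up numbers).
TERMINAL_SCORES = {
    "A1": [0.0, 2.0, 0.5, 0.1, 0.0],
    "A2": [0.1, 0.0, 0.3, 3.0, 0.2],
}
# Structure rule: Q(w) -> { Y1(w - 1), Y2(w + 1) } with production score 0.5.
STRUCTURE = {"Q": (0.5, [("Y1", -1), ("Y2", +1)])}
# Deformation rules: Y(w) -> { A(w + d) } scored by -alpha * d^2 (assumed form).
DEFORM = {"Y1": ("A1", 1.0), "Y2": ("A2", 1.0)}


@lru_cache(maxsize=None)
def best(symbol, w):
    """Score of the best derivation of `symbol` placed at position w."""
    if not 0 <= w < N:
        return NEG_INF                       # placement falls off the map
    if symbol in TERMINAL_SCORES:
        return TERMINAL_SCORES[symbol][w]
    if symbol in DEFORM:
        child, alpha = DEFORM[symbol]
        return max(best(child, w + d) - alpha * d * d
                   for d in range(-MAX_SHIFT, MAX_SHIFT + 1))
    score, children = STRUCTURE[symbol]
    return score + sum(best(child, w + d) for child, d in children)


def detect():
    """Best score of a derivation rooted at Q for every placement."""
    return [best("Q", w) for w in range(N)]


if __name__ == "__main__":
    print(detect())
```

Because the grammar is non-recursive, each (symbol, placement) pair is scored once and reused, so the cost grows with the number of symbols times the number of placements rather than with the number of derivations.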
  • 9. Salient contours
        • Figure 15 (Felzenszwalb & McAllester): Running time of different search algorithms as a function of the problem size. Each sample point indicates the average running time taken over 200 random inputs. In each case N = 20 and … = 100. See text for discussion.
        • Curve(a,b) + Curve(b,c) --> Curve(a,c)
        • Figure 16: A curve with endpoints (a, c) is formed by composing curves with endpoints (a, b) and (b, c). We assume that t ≥ π/2. The cost of the composition is proportional to sin²(t). This cost is scale invariant and encourages curves to be relatively straight.
        • … assume that these short curves are straight, and their weight depends only on the image data along the line segment from a to b. We use a data term, seg(a, b), that is zero if the image gradient along pixels in ab is perpendicular to ab, and higher otherwise. Figure 17 gives a formal definition of the two rules in our model. The constants k1 and k2 specify the minimum and maximum length of the base case curves, while L is a constant …
        • Figure 20: An example where the most salient curve goes over locations with essentially no local evidence for the curve at those locations.
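The composition cost in the Figure 16 caption can be sketched in a few lines. This is a minimal illustration, not the authors' code: points are assumed to be 2D tuples, the proportionality constant is a hypothetical `scale` parameter, the constraint on t is read as t ≥ π/2, and returning `None` for sharper turns is an assumption about how to handle disallowed compositions.

```python
import math


def angle_at(a, b, c):
    """Angle t at point b between the segments b->a and b->c, in [0, pi]."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.atan2(cross, dot))


def composition_cost(a, b, c, scale=1.0):
    """Cost of Curve(a,b) + Curve(b,c) --> Curve(a,c), proportional to sin^2(t).
    Returns None when t < pi/2 (turn too sharp under the assumed constraint)."""
    t = angle_at(a, b, c)
    if t < math.pi / 2:
        return None
    return scale * math.sin(t) ** 2
```

A straight continuation has t = π and cost (numerically) zero, and because only the angle enters the formula, multiplying all coordinates by a constant leaves the cost unchanged.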
  • 10. Shapes / Regions
        • Random shapes: samples from stochastic context-free shape grammar
        • Example results: “matching” to images (samples from posterior)
  • 11. Processing pipeline
        • Diagram labels: Pixels, Edges, Contours, Regions, Objects
        • Vision systems have multiple processing stages
        • Compositional model: each stage builds structures by grouping structures from previous stages
             - Single parsing problem
             - Avoids intermediate decisions (high-level information influences low-level interpretations)
  • 12. Computation
        • Context-free or context-sensitive?
        • Even context-free models lead to hard parsing problems
             - Too many constituents! (example string: GETIKDSWOWZQE)
             - A string of length n has O(n^2) substrings
             - An image with n pixels has O(2^n) regions
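The gap between the two constituent counts on this slide is easy to check numerically. The sketch below is only an illustration: `spans` enumerates the contiguous substrings of a string, and `num_pixel_subsets` counts non-empty pixel subsets, which is used here as an assumed upper bound on candidate regions. Both function names are hypothetical.

```python
def spans(s):
    """All contiguous (start, end) spans of s, with end exclusive."""
    n = len(s)
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)]


def num_pixel_subsets(n):
    """Number of non-empty subsets of n pixels (upper bound on regions)."""
    return 2 ** n - 1


STRING_CONSTITUENTS = len(spans("GETIKDSWOWZQE"))  # n (n + 1) / 2 with n = 13
IMAGE_CONSTITUENTS = num_pixel_subsets(13)         # same n, exponential growth
```

For the 13-character example string there are 91 spans, while 13 pixels already admit 8191 non-empty subsets.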
  • 13. Alternative parsing problems
        1. Whole image parsing
             - Explains every pixel exactly once
             - Hard
             - Diagram labels: room; wall, floor, chest, shelves, pictures; book, book, ..., book
        2. Find light derivations within an image
             - Expansion of start symbol into terminals
             - Explains part of the image
             - May explain the same pixel more than once
             - Efficient
             - Example results
  • 14. Computation
        • Bottom-up
             - Repeated grouping of structures (KLD / A*LD)
        • Top-down
             - Repeated refinement with backtracking (AO*)
        • Bottom-up + top-down
             - Bottom-up computation guided by top-down influence
             - Coarse derivations provide heuristic guidance for finding finer structures (HA*LD)
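The bottom-up option can be illustrated with a sketch of a lightest-derivation search in the style of KLD (Knuth's lightest derivation algorithm, a generalization of Dijkstra's shortest paths). It assumes, as a simplification, that each rule has a nonnegative weight and that a derivation's weight is the rule weight plus the weights of its antecedents. The rule encoding, the function name `lightest_derivations`, and the toy rules are hypothetical, not taken from the talk.

```python
import heapq
from itertools import count


def lightest_derivations(rules):
    """rules: list of (antecedents, conclusion, weight), weight >= 0.
    Returns {statement: weight of its lightest derivation}."""
    best = {}        # statements whose lightest weight is final
    waiting = {}     # statement -> indices of rules that need it
    missing = []     # per rule: antecedents not yet finalized
    queue = []
    tie = count()    # tie-breaker so heap entries never compare statements
    for idx, (antecedents, conclusion, weight) in enumerate(rules):
        distinct = set(antecedents)
        missing.append(len(distinct))
        for a in distinct:
            waiting.setdefault(a, []).append(idx)
        if not distinct:  # rule with no antecedents: an axiom
            heapq.heappush(queue, (weight, next(tie), conclusion))
    while queue:
        cost, _, statement = heapq.heappop(queue)
        if statement in best:
            continue  # already finalized with a lighter derivation
        best[statement] = cost
        for idx in waiting.get(statement, []):
            missing[idx] -= 1
            if missing[idx] == 0:  # every antecedent is now final
                antecedents, conclusion, weight = rules[idx]
                if conclusion not in best:
                    total = weight + sum(best[a] for a in antecedents)
                    heapq.heappush(queue, (total, next(tie), conclusion))
    return best


RULES = [
    ([], "edge(a,b)", 1.0),
    ([], "edge(b,c)", 2.0),
    ([], "edge(a,c)", 5.0),
    (["edge(a,b)", "edge(b,c)"], "curve(a,c)", 0.5),
    (["edge(a,c)"], "curve(a,c)", 0.0),
]
BEST = lightest_derivations(RULES)
```

Statements are finalized in order of increasing weight, so `curve(a,c)` is settled through the two short edges before the heavier direct edge is ever expanded. A*LD and HA*LD add heuristic guidance to this ordering, which the sketch leaves out.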
  • 15. Coarse-to-fine
        • Model abstraction f : Si --> Si+1
             - lower resolution
             - coarsen labels: horse --> animal --> piecewise smooth object
        • Coarse computation guides finer computation
        • Figure 8 (Felzenszwalb & McAllester): A vision system with several levels of processing. Forward arrows represent the … (diagram: levels m−1, 1 and 0, each with stages Edges, Contours, Recognition)
  • 16. Challenges
        • Whole image parsing (with context-free grammars)
             - Restrict possible constituents
             - LP relaxation
             - DDMCMC
        • Learn object grammars from weakly labeled data
             - PASCAL VOC
        • Build a complete processing pipeline unifying segmentation and recognition