Learning structured representations


           Deva Ramanan
             UC Irvine
Visual representations

Geometric models (1970s-1990s)
    Hand-coded models

Statistical classifiers (1990s-present)
    Large-scale training
    Appearance-based representations

Learned visual representations
    Where is invariance built in?

    Representation (linear classifier, ...)
    Features

    Viola-Jones; Dalal-Triggs

[Figure: learned model with positive and negative weights]
    Training: f_w(x) = w · Φ(x)
    •   Training data consists of images with labeled bounding boxes
    •   Need to learn the model structure, filters and deformation costs
Learned visual representations
    Where is invariance built in?

    Representation (latent-variable classifier)
    Features

    Felzenszwalb et al 09

[Figure panels: (a) root filter, (b) part filters, (c) spatial model]
Fig. 1. Detections obtained with a single component person model. The model is defined by a coarse root filter (a), several higher resolution part filters (b) and a spatial model for the location of each part relative to the root (c). The filters specify weights for histogram of oriented gradients features. Their visualization shows the positive weights at different orientations. The visualization of the spatial models reflects the “cost” of placing the center of a part at different locations relative to the root.
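The caption describes a root filter, part filters and a placement "cost" relative to the root. A minimal sketch of how such a latent-variable score could be evaluated follows. It assumes precomputed filter response maps and a quadratic deformation cost, and it uses brute-force search over part placements; the function names and toy numbers are hypothetical, not the authors' implementation.

```python
import numpy as np

def part_score(response, anchor, deform):
    """Best placement z of one part: max over z of response[z] - cost(z - anchor).

    response: 2D array of part-filter responses near the root.
    anchor:   (row, col) ideal part location relative to the root.
    deform:   (cx, cy) assumed quadratic cost weights for dx**2 and dy**2.
    """
    h, w = response.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dy = ys - anchor[0]
    dx = xs - anchor[1]
    total = response - (deform[0] * dx**2 + deform[1] * dy**2)
    best = np.unravel_index(np.argmax(total), total.shape)
    return float(total[best]), (int(best[0]), int(best[1]))

def model_score(root_response, part_responses, anchors, deforms):
    """Root score plus the best deformation-penalized score of every part."""
    score = float(root_response)
    placements = []
    for r, a, d in zip(part_responses, anchors, deforms):
        s, z = part_score(r, a, d)
        score += s
        placements.append(z)
    return score, placements

# Toy example: one part whose strongest response is one cell away from its anchor.
response = np.zeros((3, 3))
response[0, 2] = 5.0
score, placements = model_score(1.0, [response], [(1, 1)], [(1.0, 1.0)])
print(score, placements)  # 4.0 [(0, 2)]

# A stiffer spring keeps the part at its anchor instead.
stiff_score, stiff_placements = model_score(1.0, [response], [(1, 1)], [(3.0, 3.0)])
print(stiff_score, stiff_placements)  # 1.0 [(1, 1)]
```

Here invariance to part position is obtained by search over the latent placement rather than by making the features themselves invariant.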
Where does learning fit in?

    [Diagram: training images, matching alg, alg output, ground truth; example labels: person, bottle, cat]

    Tune parameters ( ,            ) till desired output on training set

    ‘Graduate Student Descent’ might take a while
    (phrase from Marshall Tappen)
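As a contrast to tuning parameters by hand, here is a minimal sketch of automatic tuning: stochastic subgradient descent on a regularized hinge loss for a linear model. The loss, learning rate, regularization weight and toy data are all assumptions chosen for illustration; the slides do not specify a training procedure at this point.

```python
import numpy as np

def train_linear(features, labels, epochs=100, lr=0.1, reg=0.01):
    """Fit w for f_w(x) = w . Phi(x) by subgradient descent on the hinge loss."""
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        for phi, y in zip(features, labels):
            grad = reg * w                      # gradient of the regularizer
            if y * np.dot(w, phi) < 1:          # margin violated: hinge is active
                grad = grad - y * phi
            w = w - lr * grad
    return w

# Toy training set: feature vectors with labels +1 (object) and -1 (background).
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])

w = train_linear(X, y)
predictions = np.sign(X @ w)
print(predictions)
```

The loop adjusts the parameters until the algorithm's output agrees with the ground truth on the training set, which is the role the slide assigns to learning.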
5 years of PASCAL people detection

[Bar chart: average precision (axis 0 to 50) by year, 2005 to 2010]
[Figure: matching results (after non-maximum suppression); ~1 second to search all scales]

                             1% to 47% in 5 years
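For readers unfamiliar with the metric on the chart, the sketch below computes one common form of average precision: the area under the precision-recall curve after making precision monotonically non-increasing. This is an assumption about the definition; the benchmark's exact protocol may differ. The detections in the example are invented.

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve of ranked detections."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / num_ground_truth
    precision = cum_tp / (cum_tp + cum_fp)
    # Interpolate: precision at rank i becomes the best precision at any later rank.
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    # Sum precision weighted by the recall gained at each rank.
    prev_recall = np.concatenate(([0.0], recall[:-1]))
    return float(np.sum((recall - prev_recall) * precision))

# Four ranked detections, two ground-truth people; ranks 1 and 3 are correct.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0], num_ground_truth=2)
print(round(ap, 4))  # 0.8333
```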

               How do we move beyond the plateau?
How do we move beyond the plateau?

1. Develop more structured models with less invariant features
Invariance vs Search

    Projective Invariants

    View-Based Mixtures

Invariance vs Parametric Search

    Part-Based Models

[Figures: example detections labeled person, bottle and cat; part model panels (a), (b), (c)]
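A view-based mixture trades invariance for search over a discrete view label. The sketch below assumes one linear template per view and scores a window by the best-matching template; the two "views" and their weights are hypothetical.

```python
import numpy as np

def mixture_score(templates, phi):
    """Score a window as the max over view-specific templates of w_c . Phi(x).

    templates: array of shape (num_views, num_features), one row per view.
    Returns the best score and the index of the selected view.
    """
    responses = templates @ phi
    best = int(np.argmax(responses))
    return float(responses[best]), best

# Two toy view templates (say, frontal and side) and one window's features.
templates = np.array([[1.0, 0.0],
                      [0.0, 1.0]])
phi = np.array([0.2, 0.9])

best_score, best_view = mixture_score(templates, phi)
print(best_score, best_view)  # 0.9 1
```

The selected view index is itself an output, which connects to the later point about scoring such variables as meaningful output.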
Learned visual representations
                  Where is invariance built in?

                             Representation
                       (latent-variable classifier)




                              Features
Yi & Ramanan 11



                    Buffy performance: 88% vs 73%
Qualitative Results
How do we move beyond the plateau?

1. Develop more structured models with less invariant features


2. Score syntax as semantics
The forgotten challenge....

Results

 Two methods, both using detectors trained on other data
   Oxford method does not attempt to detect feet

                          Head   Hand   Foot
     BCNPCL_HumanLayout   CDED   FEF    GEH
     OXFORD_SBD           LHEC   GMED   MEM
Structured classifiers

[Diagram labels: shape, classifier, reflectance; estimated shape, estimated reflectance]
[Background figure: clipped page excerpt from a paper on the Pinocchio rigging system, covering heat-equilibrium bone weights, −∆w_i + H w_i = H p_i (1), and its Figures 8, 10 and 11]
Structured object reports

    “If you’re not winning the game, change the rules”

[Background figure: project description excerpt, clipped at both margins]
Lead: Jitendra Malik (UC Berkeley)
Participants: Deva Ramanan (UC Irvine), Steve Seitz (U Washington…

Introduction/goal: Human detection and pose estimation are tasks with many applications, including next-generation human-computer interfaces and activity understanding. Detection … cast as a classification problem (does this window contain a person or not?), while pose estimation … cast as a regression problem, where given an image or sequence of frames, one m… joint angles. This project will take a more general view and cast both tasks as one of “p… a full syntactic parse will report the number of people present (if any), their body…
Caveat: we need more pixels

    We should focus on high-resolution data
    (in contrast to most learning methods)

[Poster excerpt: "Multiresolution models for object d…", Dennis Park, Deva Ramanan, Charless Fowlkes]

    Motivation & Goal
        Objects in images come with various resolutions.
        Most recognition systems are scale-invariant, i.e. fixed-size template.
        More pixels mean more information!
        We want to use the information when it is available.

        Goal:
        1. We want to use more pixels.
        2. We want to detect small instances as well.
        3. In addition, we try to address the correlation between resolution and the role of context.

    Model
        Building blocks
            HOG features [1]
            SVM

[Right-hand poster column clipped: S3, S4, LR and HR templates, Φ(x, s, z), scoring function f(x, s)]
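The poster's remark that most systems use a fixed-size template can be made concrete with an image pyramid: the same small template is scanned over successively downsampled copies of the image, so a large object is only matched after most of its pixels have been averaged away. The sketch below is an illustration under assumptions (2x2 average pooling, a factor-of-two pyramid, hypothetical function names), not the poster's model.

```python
import numpy as np

def downsample2(img):
    """Halve resolution by averaging 2x2 blocks (odd trailing rows/cols are cropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid(img, template_shape):
    """Yield (scale, level_image) for as long as the template still fits."""
    scale = 1.0
    while img.shape[0] >= template_shape[0] and img.shape[1] >= template_shape[1]:
        yield scale, img
        img = downsample2(img)
        scale *= 0.5

def template_footprints(img, template_shape):
    """Size, in original pixels, of the region the template covers at each level."""
    return [(scale, (template_shape[0] / scale, template_shape[1] / scale))
            for scale, _ in pyramid(img, template_shape)]

image = np.zeros((16, 8))
footprints = template_footprints(image, (4, 2))
print(footprints)
# [(1.0, (4.0, 2.0)), (0.5, (8.0, 4.0)), (0.25, (16.0, 8.0))]
```

In this toy case an object filling the whole 16x8 image is described by the same 4x2 template as the smallest instance, which is the information loss that motivates using more pixels when they are available.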
Caltech Pedestrian Benchmark

    Multiresolution representations decrease the error by 2X compared to previous work

[Figure: missed detections and … detections for the baselines and the multiresolution model. Park et al. 2010]
… we show the result of our low-resolution rigid-template baseline. …s to detect large instances. On the right, we show detections of … part-based baseline, which fails to find small instances. On the … detections of our multiresolution model that is able to detect both … instances. The threshold of each model is set to yield the same rate of …
How do we move beyond the plateau?

1. Develop more structured models with less invariant features


2. Score syntax as semantics


3. Generate ground-truth datasets of structured labels
Case study: small or big parts




Skeleton   Parts/Poselets   Mini-parts
What are good representations?

            Exemplars
               Parts
             Attributes
           Visual Phrases
             Grammars
                  ?
Even worse: what are the parts
         (if any)?
     Is there any structure to label here?
Sharing surfaces?
Selective parameter sharing






Exemplars => Parts => Attributes => Grammars

   Multi-task training of instance-specific classifiers
Human-in-the-loop structure learning
How do we move beyond the plateau?

1. Develop more structured models with less invariant features


2. Score “nuisance” variables as meaningful output


3. Generate ground-truth datasets of structured labels
Diagram for Eero

       Machine Learning


      Vision




Vision as applied machine learning
Diagram for Eero
                  Vision




    Graphics               Machine Learning
(shape & appearance)

    Vision as structured pattern recognition

More Related Content

What's hot

Ph d colloquium 2
Ph d colloquium 2Ph d colloquium 2
Ph d colloquium 2Rishi Roy
 
COLLADA to WebGL (GDC 2013 presentation)
COLLADA to WebGL (GDC 2013 presentation)COLLADA to WebGL (GDC 2013 presentation)
COLLADA to WebGL (GDC 2013 presentation)Remi Arnaud
 
Video Copy Detection Using Inclined Video Tomography and Bag-of-Visual-Words
Video Copy Detection Using Inclined Video Tomography and Bag-of-Visual-WordsVideo Copy Detection Using Inclined Video Tomography and Bag-of-Visual-Words
Video Copy Detection Using Inclined Video Tomography and Bag-of-Visual-WordsWesley De Neve
 
7806 java 6 programming essentials using helios eclipse
7806 java 6 programming essentials using helios eclipse7806 java 6 programming essentials using helios eclipse
7806 java 6 programming essentials using helios eclipsebestip
 
Reconsidering Custom Memory Allocation
Reconsidering Custom Memory AllocationReconsidering Custom Memory Allocation
Reconsidering Custom Memory AllocationEmery Berger
 
Ten Commandments of Formal Methods: A decade later
Ten Commandments of Formal Methods: A decade laterTen Commandments of Formal Methods: A decade later
Ten Commandments of Formal Methods: A decade laterJonathan Bowen
 

What's hot (6)

Ph d colloquium 2
Ph d colloquium 2Ph d colloquium 2
Ph d colloquium 2
 
COLLADA to WebGL (GDC 2013 presentation)
COLLADA to WebGL (GDC 2013 presentation)COLLADA to WebGL (GDC 2013 presentation)
COLLADA to WebGL (GDC 2013 presentation)
 
Video Copy Detection Using Inclined Video Tomography and Bag-of-Visual-Words
Video Copy Detection Using Inclined Video Tomography and Bag-of-Visual-WordsVideo Copy Detection Using Inclined Video Tomography and Bag-of-Visual-Words
Video Copy Detection Using Inclined Video Tomography and Bag-of-Visual-Words
 
7806 java 6 programming essentials using helios eclipse
7806 java 6 programming essentials using helios eclipse7806 java 6 programming essentials using helios eclipse
7806 java 6 programming essentials using helios eclipse
 
Reconsidering Custom Memory Allocation
Reconsidering Custom Memory AllocationReconsidering Custom Memory Allocation
Reconsidering Custom Memory Allocation
 
Ten Commandments of Formal Methods: A decade later
Ten Commandments of Formal Methods: A decade laterTen Commandments of Formal Methods: A decade later
Ten Commandments of Formal Methods: A decade later
 

Viewers also liked

Fcv appli science_fergus
Fcv appli science_fergusFcv appli science_fergus
Fcv appli science_ferguszukun
 
Fcv acad ind_martin
Fcv acad ind_martinFcv acad ind_martin
Fcv acad ind_martinzukun
 
Fcv hum mach_perona
Fcv hum mach_peronaFcv hum mach_perona
Fcv hum mach_peronazukun
 
Fcv appli science_perona
Fcv appli science_peronaFcv appli science_perona
Fcv appli science_peronazukun
 
Fcv acad ind_lowe
Fcv acad ind_loweFcv acad ind_lowe
Fcv acad ind_lowezukun
 
02 cv mil_intro_to_probability
02 cv mil_intro_to_probability02 cv mil_intro_to_probability
02 cv mil_intro_to_probabilityzukun
 
Fcv learn fergus
Fcv learn fergusFcv learn fergus
Fcv learn ferguszukun
 
Fcv the revolution will be curated: human in the loop fine grained visual cat...
Fcv the revolution will be curated: human in the loop fine grained visual cat...Fcv the revolution will be curated: human in the loop fine grained visual cat...
Fcv the revolution will be curated: human in the loop fine grained visual cat...zukun
 

Viewers also liked (8)

Fcv appli science_fergus
Fcv appli science_fergusFcv appli science_fergus
Fcv appli science_fergus
 
Fcv acad ind_martin
Fcv acad ind_martinFcv acad ind_martin
Fcv acad ind_martin
 
Fcv hum mach_perona
Fcv hum mach_peronaFcv hum mach_perona
Fcv hum mach_perona
 
Fcv appli science_perona
Fcv appli science_peronaFcv appli science_perona
Fcv appli science_perona
 
Fcv acad ind_lowe
Fcv acad ind_loweFcv acad ind_lowe
Fcv acad ind_lowe
 
02 cv mil_intro_to_probability
02 cv mil_intro_to_probability02 cv mil_intro_to_probability
02 cv mil_intro_to_probability
 
Fcv learn fergus
Fcv learn fergusFcv learn fergus
Fcv learn fergus
 
Fcv the revolution will be curated: human in the loop fine grained visual cat...
Fcv the revolution will be curated: human in the loop fine grained visual cat...Fcv the revolution will be curated: human in the loop fine grained visual cat...
Fcv the revolution will be curated: human in the loop fine grained visual cat...
 

More from zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
ETHZ CV2012: Information



Fcv learn ramanan

  • 1. Learning structured representations Deva Ramanan UC Irvine
  • 2. Visual representations. Geometric models (1970s-1990s): hand-coded models. Statistical classifiers (1990s-present): large-scale training, appearance-based representations. [Figure: learned model f_w(x) = w · Φ(x), with weights trained from positive and negative examples. Training data consists of images with labeled bounding boxes; training needs to learn the model structure, filters and deformation costs.]
  • 3. Learned visual representations. Where is invariance built in? Representation (linear classifier, ...); Features. Viola & Jones; Dalal & Triggs. [Figure: learned model f_w(x) = w · Φ(x), shown with positive and negative weights.]
  • 4. Learned visual representations. Where is invariance built in? Representation (latent-variable classifier); Features. Felzenszwalb et al 09. [Figure caption: Detections obtained with a single component person model. The model is defined by a coarse root filter (a), several higher-resolution part filters (b) and a spatial model for the location of each part relative to the root (c). The filters specify weights for histogram of oriented gradients features. Their visualization shows the positive weights at different orientations. The visualization of the spatial models reflects the "cost" of placing the center of a part at different locations relative to the root.]
  • 5. Where does learning fit in? Tune parameters ( , ) till desired output on training set. ‘Graduate Student Descent’ might take a while (phrase from Marshall Tappen). [Diagram labels: training images, matching alg, alg output, ground truth. Example detections are labeled person, bottle and cat.]
  • 6. 5 years of PASCAL people detection. [Plot: average precision (axis from 0 to 50) by year, 2005 to 2010.] 1% to 47% in 5 years. How do we move beyond the plateau? [Overlaid figure text: matching results (after non-maximum suppression); ~1 second to search all scales.]
  • 7. How do we move beyond the plateau? 1. Develop more structured models with less invariant features
  • 8. Invariance vs Search: Projective Invariants; View-Based Mixtures
  • 9. Invariance vs Parametric Search: Part-Based Models. [Figures: example detections labeled person, bottle and cat; part-based model panels (a), (b), (c).]
  • 10. Learned visual representations. Where is invariance built in? Representation (latent-variable classifier); Features. Yi & Ramanan 11. Buffy performance: 88% vs 73%
  • 12. How do we move beyond the plateau? 1. Develop more structured models with less invariant features 2. Score syntax as semantics
  • 13. The forgotten challenge.... Results: Two methods, both using detectors trained on other data. Oxford method does not attempt to detect feet. [Table: rows BCNPCL_HumanLayout and OXFORD_SBD; columns Head, Hand, Foot; the numeric entries are not recoverable from the extracted text.]
  • 14. Structured classifiers. [Diagram labels: shape, classifier, estimated shape; reflectance, estimated reflectance. Background image: a cropped page from the Pinocchio automatic rigging paper (Baran and Popović 2007), covering heat-equilibrium bone weights, −∆w^i + H w^i = H p^i (1), and the paper's generality and quality results.]
  • 15. Structured object reports. “If you’re not winning the game, change the rules”. [Background text, cropped in the slide: Lead: Jitendra Malik (UC Berkeley). Participants: Deva Ramanan (UC Irvine), Steve Seitz (U Washington ... Introduction/goal: Human detection and pose estimation are tasks with many applications, including next-generation human-computer interfaces and activity understanding. Detection ... cast as a classification problem (does this window contain a person or not?), while pose estimation ... cast as a regression problem, where given an image or sequence of frames, one ... joint angles. This project will take a more general view and cast both tasks as one of “p... a full syntactic parse will report the number of people present (if any), their body ...]
  • 16. Caveat: we need more pixels. We should focus on high-resolution data (in contrast to most learning methods). [Poster excerpt: Multiresolution models for object detection. Dennis Park, Deva Ramanan, Charless Fowlkes. Motivation & Goal: Objects in images come with various resolutions. Most recognition systems are scale-invariant, i.e. fixed-size template. More pixels mean more information! We want to use the information when it is available. Goal: 1. We want to use more pixels. 2. We want to detect small instances as well. 3. In addition, we try to address the correlation between resolution and the role of context. Building blocks: HOG features [1], SVM. The remaining poster fragments (LR global template, LR template, HR template, part locations, Φ(x, s, z), scoring function f(x, s)) are truncated.]
  • 17. Caltech Pedestrian Benchmark. Multiresolution model, Park et al. 2010. Multiresolution representations decrease the error by 2X compared to previous work. [Figure caption, cropped: ... we show the result of our low-resolution rigid-template baseline. ... to detect large instances. On the right, we show detections of ..., part-based baseline, which fails to find small instances. On the ... detections of our multiresolution model that is able to detect both ... instances. The threshold of each model is set to yield the same rate of ... Figure labels include missed detections.]
  • 18. How do we move beyond the plateau? 1. Develop more structured models with less invariant features 2. Score syntax as semantics 3. Generate ground-truth datasets of structured labels
  • 19. Case study: small or big parts. Skeleton; Parts/Poselets; Mini-parts
  • 20. What are good representations? Exemplars; Parts; Attributes; Visual Phrases; Grammars; ?
  • 21. Even worse: what are the parts (if any)? Is there any structure to label here?
  • 23. Selective parameter sharing. Exemplars => Parts => Attributes => Grammars. Multi-task training of instance-specific classifiers
  • 25. How do we move beyond the plateau? 1. Develop more structured models with less invariant features 2. Score “nuisance” variables as meaningful output 3. Generate ground-truth datasets of structured labels
  • 26. Diagram for Eero: Machine Learning; Vision. Vision as applied machine learning
  • 27. Diagram for Eero: Vision; Graphics; Machine Learning (shape & appearance). Vision as structured pattern recognition
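
As a rough illustration of the scoring function on slides 2 to 4, the sketch below evaluates f_w(x) = w · Φ(x) for a rigid template and then extends it with latent part placements: each part adds its best filter response minus a deformation cost. This is a minimal sketch under stated assumptions, not the method from the talk or from Felzenszwalb et al. The function names (`linear_score`, `latent_score`), the flat feature vectors, and the precomputed per-placement deformation costs are all hypothetical simplifications introduced here.

```python
import numpy as np


def linear_score(w, phi):
    """Rigid-template score f_w(x) = w . Phi(x) for one feature vector."""
    return float(np.dot(w, phi))


def latent_score(root_w, root_phi, part_ws, part_phis, deform_costs):
    """Score a window with latent part placements (hypothetical layout).

    part_phis[i]    : array (num_placements, d), features of part i at
                      each candidate placement.
    deform_costs[i] : array (num_placements,), cost of each placement
                      relative to the root.
    Returns the total score and the chosen placement index per part.
    """
    score = linear_score(root_w, root_phi)
    placements = []
    for w, phis, costs in zip(part_ws, part_phis, deform_costs):
        # Filter response at every placement, penalised by deformation.
        responses = phis @ w - costs
        z = int(np.argmax(responses))  # maximise over the latent variable
        placements.append(z)
        score += float(responses[z])
    return score, placements


if __name__ == "__main__":
    # Toy numbers chosen only to exercise the code.
    root_w = np.array([1.0, 0.0])
    root_phi = np.array([2.0, 5.0])
    part_ws = [np.array([1.0, 1.0])]
    part_phis = [np.array([[1.0, 1.0], [3.0, 0.0], [0.0, 1.0]])]
    deform_costs = [np.array([0.0, 2.0, 0.0])]
    print(latent_score(root_w, root_phi, part_ws, part_phis, deform_costs))
```

With no parts the function reduces to the linear classifier of slide 3. The maximisation over placements is where a latent-variable model can absorb deformation that a rigid template would need invariant features to handle.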