The computational magic of the ventral stream
                                                  Tomaso Poggio
                                                  McGovern Institute, I2, CBCL,BCS, CSAIL
                                                  MIT
                                                  (with Jim Mutch, Joel Leibo, Lorenzo Rosasco)



A dream:
I have a theory of what the ventral stream does and how; it
can explain why evolution chose a hierarchical architecture and
how computations determine properties of cells specific for
each visual area.
          Very preliminary, provocative, probably completely wrong theory, but if true...
                    what else do we want from computational neuroscience?
                                                  	
  
Monday, August 29, 2011
Machine Learning + Vision @CBCL

LEARNING THEORY + ALGORITHMS (Mathematics)
  Poggio, T. and F. Girosi. Networks for Approximation and Learning, Proceedings of the IEEE, 1990 (also Science, 1990)
  Poggio, T. and S. Smale. [...], Notices American Mathematical Society (AMS), 2003
  Poggio, T., R. Rifkin, S. Mukherjee and P. Niyogi. General Conditions for Predictivity in Learning Theory, Nature, 2004

ENGINEERING APPLICATIONS (Engineering)
  Beymer, D. and T. Poggio. [...], Science, 272, 1905-1909, 1996
  Brunelli, R. and T. Poggio. Face Recognition: Features Versus Templates, IEEE PAMI, 1993
  Ezzat, T., G. Geiger and T. Poggio. “Trainable Videorealistic Speech Animation,” ACM SIGGRAPH 2002

COMPUTATIONAL NEUROSCIENCE: models + experiments (Science)
  Freedman, D.J., M. Riesenhuber, T. Poggio and E.K. Miller. [...], Science, 291, 312-316, 2001
  Riesenhuber, M. and T. Poggio. [...], Nature Neuroscience, 2, 1019-1025, 1999
  Serre, T., A. Oliva and T. Poggio. [...], (PNAS), Vol. 104, No. 15, 6424-6429, 2007
  Poggio, T. and E. Bizzi. [...], Nature, Vol. 431, 768-774, 2004

Vision: what is where

                                    	
  



Vision
A Computational Investigation into the Human Representation and Processing of Visual Information
David Marr
Foreword by Shimon Ullman
Afterword by Tomaso Poggio

David Marr's posthumously published Vision (1982) influenced a generation of brain and cognitive scientists, inspiring many to
enter the field. In Vision, Marr describes a general framework for understanding visual perception and touches on broader
questions about how the brain and its functions can be studied and understood. Researchers from a range of brain and
cognitive sciences have long valued Marr's creativity, intellectual power, and ability to integrate insights and data from
neuroscience, psychology, and computation. This MIT Press edition makes Marr's influential work available to a new generation
of students and scientists.

In Marr's framework, the process of vision constructs a set of representations, starting from a description of the input image
and culminating with a description of three-dimensional objects in the surrounding environment. A central theme, and one that
has had far-reaching influence in both neuroscience and cognitive science, is the notion of different levels of analysis—in Marr's
framework, the computational level, the algorithmic level, and the hardware implementation level.

Now, thirty years later, the main problems that occupied Marr remain fundamental open problems in the study of perception.
Vision provides inspiration for the continui


The ventral stream




[Figure: the ventral stream; diagrams from Movshon et al. and Desimone & Ungerleider 1989]

 	
Recognition in the Ventral Stream:
classical, “standard” model (of immediate recognition)
                                 *Modified from (Gross, 1998)




          [software available online with CNS (for GPUs)]
          Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007
 	
Model “works”:
it accounts for bits of physiology + psychophysics
Hierarchical feedforward models are consistent with or predict neural data
                                                                            V1:
                          Simple and complex cells tuning (Schiller et al 1976; Hubel & Wiesel 1965; Devalois et al 1982)
                          MAX-like operation in subset of complex cells (Lampl et al 2004)
                                                                          V2:
                          Subunits and their tuning (Anzai, Peng, Van Essen 2007)
                                                                            V4:
                          Tuning for two-bar stimuli (Reynolds Chelazzi & Desimone 1999)
                          MAX-like operation (Gawne et al 2002)
                          Two-spot interaction (Freiwald et al 2005)
                          Tuning for boundary conformation (Pasupathy & Connor 2001, Cadieu, Kouh, Connor et al., 2007)
                          Tuning for Cartesian and non-Cartesian gratings (Gallant et al 1996)
                                                                            IT:
                          Tuning and invariance properties (Logothetis et al 1995, paperclip objects)
                          Differential role of IT and PFC in categorization (Freedman et al 2001, 2002, 2003)
                          Read out results (Hung Kreiman Poggio & DiCarlo 2005)
                          Pseudo-average effect in IT (Zoccolan Cox & DiCarlo 2005; Zoccolan Kouh Poggio & DiCarlo 2007)
                                                                          Human:
                          Rapid categorization (Serre Oliva Poggio 2007)
                          Face processing (fMRI + psychophysics) (Riesenhuber et al 2004; Jiang et al 2006)

 	
Model “works”:
it performs well at computational level

                                              Models of the ventral stream in cortex
                                                   perform well compared to
                                          engineered computer vision systems (in 2006)
                                                     on several databases




                                                               Bileschi, Wolf, Serre, Poggio, 2007
 	
Recognition in Visual Cortex:
computation and mathematical theory

                                     For 10+ years...

                                     I did not manage to understand how
                                     the model works...


                                     We need a theory -- not only a model!




Why do hierarchical architectures work?




          If features do not matter... what does?




 
What is the main difficulty of object recognition?
Geometric transformations or intraclass variability?

    •      We have “factored out” viewpoint, scale, translation: is it
           now still “difficult” for a classifier to categorize dogs vs
           horses?




                                                                                        Leibo, 2011
Are transformations the main difficulty
for biological object recognition?




                                                                  Leibo, 2011
6 main ideas, steps in the theory
1. The computational goal of the ventral stream is to learn
   transformations (affine in R^2) during development and to be
   invariant to them (McCulloch+Pitts 1945, Hoffman 1966, Jack
   Gallant, 1993,...)
2. A memory-based module, similar to simple-complex cells, can
   learn from a video of an image transforming (e.g. translating) to
   provide a signature for any single image of any object that is
   invariant under the transformation (Invariance Lemma)
3. Since the group Aff(2,R) can be factorized as a semidirect
   product of translations and GL(2, R), the storage requirements of
   the memory-based module can be reduced by orders of
   magnitude (Factorization Theorem). I argue that this is the
   evolutionary reason for the hierarchical architecture of the ventral
   stream.
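
A minimal sketch of the memory-based module in idea 2, under strong simplifying assumptions that are not in the slides: images are 1-D vectors, the transformation is a cyclic shift, and the pooling step is a max. The names `templatebook` and `signature` are hypothetical helpers. The module stores every shifted frame of a few templates, and the signature of a new image is one pooled dot product per template.

```python
import numpy as np

def templatebook(template):
    """Stack every cyclic shift of a 1-D template (the stored 'video' frames)."""
    n = template.size
    return np.stack([np.roll(template, s) for s in range(n)])

def signature(image, books, pool=np.max):
    """One pooled value per template: pool over shifts s of <image, shift_s(template)>."""
    return np.array([pool(book @ image) for book in books])

rng = np.random.default_rng(0)
n = 32
templates = rng.standard_normal((5, n))   # 5 arbitrary stored templates (assumption)
books = [templatebook(t) for t in templates]

image = rng.standard_normal(n)            # a new, never-seen "object"
shifted = np.roll(image, 7)               # the same object, translated

sig_a = signature(image, books)
sig_b = signature(shifted, books)
print(np.allclose(sig_a, sig_b))
```

Shifting the image only permutes the set of dot products with the stored frames, so any pooling function that ignores order (max, sum, sum of squares) gives the same signature. The templates never need to resemble the new object.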


6 main ideas, steps in the theory
4. Evolution selects translations by using small apertures in the
   first layer (the Stratification Theorem). The size of receptive fields in
   the sequence of ventral stream areas automatically selects
   specific transformations (translations, scaling, shear and
   rotations).
5. If a Hebb-like rule is active at synapses, then the storage
   requirements can be reduced further: the tuning of cells in each
   layer -- during development -- will mimic the spectrum (SVD) of
   stored templates (Linking Theorem).
6. At the top layers, invariances are learned for “places” and for class-
   specific transformations such as changes of expression of a face,
   rotation in depth of a face, or different poses of a body ===>
   class-specific patches
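
Idea 5 can be illustrated with Oja's rule, a standard normalised Hebbian rule. The slides say only "Hebb-like", so the choice of Oja's rule is an assumption, as are the toy templatebook (five frames of a translating bump), the learning rate, and the use of the update averaged over the stored frames. Under those assumptions the weight vector converges to the leading singular vector of the stored templates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16

# Hypothetical templatebook: five frames of a smooth bump translating by one pixel.
bump = np.exp(-0.5 * ((np.arange(n) - n / 2) / 1.5) ** 2)
frames = np.stack([np.roll(bump, s) for s in range(-2, 3)])
C = frames.T @ frames / len(frames)   # uncentered second-moment matrix of the frames

w = rng.standard_normal(n)
w /= np.linalg.norm(w)
eta = 0.1                             # learning rate (assumption)
for _ in range(2000):
    y = C @ w                         # average of (w . x) x over the stored frames
    w += eta * (y - (w @ y) * w)      # Hebbian growth minus Oja's normalising decay

top = np.linalg.svd(frames)[2][0]     # leading right singular vector of the templatebook
alignment = abs(w @ top)              # 1.0 means the tuning matches the top SVD component
print(round(alignment, 4))
```

In this toy setting the learned tuning `w` is determined by the spectrum of what was stored rather than by any single frame.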



Stratification




SVD (of templatebook) for (x) translations from natural images
  




                                                        Jim Mutch
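
A small sketch of why the SVD of a translation templatebook yields sinusoid-like components. It assumes cyclic 1-D shifts and uses a random vector as a stand-in for a natural-image row, so it does not reproduce the figure. With all cyclic shifts stored, the templatebook is a circulant matrix, whose singular values are the magnitudes of the patch's discrete Fourier transform.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
patch = rng.standard_normal(n)                          # stand-in for an image row (assumption)
book = np.stack([np.roll(patch, s) for s in range(n)])  # all cyclic x-translations

U, S, Vt = np.linalg.svd(book)

# Singular values of a circulant matrix equal the sorted DFT magnitudes of its first row.
spectrum = np.sort(np.abs(np.fft.fft(patch)))[::-1]
print(np.allclose(S, spectrum))

# The leading right singular vector is a single-frequency sinusoid:
# its power sits in one frequency bin and the mirror bin.
power = np.abs(np.fft.fft(Vt[0])) ** 2
concentration = np.sort(power)[-2:].sum() / power.sum()
print(round(concentration, 6))
```

The singular vectors therefore depend on the transformation (translation), while the patch content only sets which frequencies carry the most weight.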
SVD (of templatebook) for (y) translations
                              noise




                                                              Jim Mutch
These are predicted features orthogonal to trajectories of Lie group:
               for translation (x, y), expansion, rotation




                                                             V1




                                                           V2/V4?




   This is from Hoffman, 1966 cited by Jack Gallant
Implications

 • Image priors, shape features and natural image priors are “irrelevant”

 • The type of transformations that are learned from visual experience
   depends on the size (measured in terms of wavelength) and thus on
   the area (layer in the models) -- assuming that the aperture size
   increases with layers

 • The mix of transformations learned determines the properties of the
   receptive fields -- oriented bars in V1+V2, radial and spiral patterns
   in V4, up to class-specific tuning in AIT (e.g. face-tuned cells)
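
The dependence on aperture size can be made concrete with a sketch that is not taken from the slides. Under an affine map x -> A x + b, the displacement inside an aperture is (A - I) x + b. A single translation explains its mean, and the leftover part grows in proportion to the aperture radius. The function name, the sample map and the radii below are all illustrative assumptions.

```python
import numpy as np

def non_translation_residual(A, b, center, radius, n_samples=2000, seed=0):
    """Largest displacement inside a disc that a single translation fails to explain."""
    rng = np.random.default_rng(seed)
    # Sample points uniformly inside the disc-shaped aperture.
    angles = rng.uniform(0, 2 * np.pi, n_samples)
    radii = radius * np.sqrt(rng.uniform(0, 1, n_samples))
    pts = center + np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
    disp = pts @ A.T + b - pts            # displacement field of x -> A x + b
    translation = disp.mean(axis=0)       # best single translation for this aperture
    return np.linalg.norm(disp - translation, axis=1).max()

theta, scale = 0.05, 1.02                 # a small rotation plus a small expansion
A = scale * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
b = np.array([3.0, -1.0])
center = np.array([10.0, -5.0])

residuals = [non_translation_residual(A, b, center, r) for r in (1.0, 4.0, 16.0)]
print(residuals)
```

Through a small aperture the map looks almost like a pure translation. Rotation and scaling only become visible as the aperture grows, which is consistent with the claim that larger receptive fields in later areas see a different mix of transformations.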



Some predictions


 • Invariance to small transformations in early areas (e.g. translations in V1) may underlie the
   stability of visual perception (suggested by Stu Geman)
 • Each cell's tuning properties are shaped by visual experience of image transformations during
   developmental and adult plasticity
 • Simple cells are likely to be the same population as complex cells, arising from different
   convergence of the Hebbian learning rule. The inputs to “complex” cells are dendritic
   branches with simple-cell properties
 • Class-specific transformations are learned and represented at the top of the ventral stream
   hierarchy; thus class-specific modules -- such as faces, places and possibly body areas -- should
   exist in IT
 • The type of transformations that are learned from visual experience depends on the size of the
   receptive fields and thus on the area (layer in the models) -- assuming that the size increases
   with layers
 • The mix of transformations learned in each area influences the tuning properties of the cells --
   oriented bars in V1+V2, radial and spiral patterns in V4, up to class-specific tuning in AIT (e.g.
   face-tuned cells).




Collaborators in recent work

J. Leibo, J. Mutch, L. Isik, S. Ullman, S. Smale, L. Rosasco, H. Jhuang, C. Tan, N. Edelman,
E. Meyers, B. Desimone, O. Lewis, S. Voinea

Also: M. Riesenhuber, T. Serre, S. Chikkerur, A. Wibisono, J. Bouvrie, M. Kouh, J. DiCarlo,
E. Miller, C. Cadieu, A. Oliva, C. Koch, A. Caponnetto, D. Walther, U. Knoblich, T. Masquelier,
S. Bileschi, L. Wolf, E. Connor, D. Ferster, I. Lampl, G. Kreiman, N. Logothetis




Jim Mutch
Monday, August 29, 2011

Contenu connexe

En vedette

05 structured prediction and energy minimization part 2
05 structured prediction and energy minimization part 205 structured prediction and energy minimization part 2
05 structured prediction and energy minimization part 2zukun
 
Describing People: A Poselet-based approach to attribute classification
Describing People: A Poselet-based approach to attribute classificationDescribing People: A Poselet-based approach to attribute classification
Describing People: A Poselet-based approach to attribute classificationzukun
 
Cvpr2007 object category recognition p2 - part based models
Cvpr2007 object category recognition   p2 - part based modelsCvpr2007 object category recognition   p2 - part based models
Cvpr2007 object category recognition p2 - part based modelszukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Fcv appli science_fergus
Fcv appli science_fergusFcv appli science_fergus
Fcv appli science_ferguszukun
 
Fcv taxo zisserman
Fcv taxo zissermanFcv taxo zisserman
Fcv taxo zissermanzukun
 
ECCV2010 tutorial: statisitcal and structural recognition of human actions pa...
ECCV2010 tutorial: statisitcal and structural recognition of human actions pa...ECCV2010 tutorial: statisitcal and structural recognition of human actions pa...
ECCV2010 tutorial: statisitcal and structural recognition of human actions pa...zukun
 
Fcv scene hebert
Fcv scene hebertFcv scene hebert
Fcv scene hebertzukun
 
CS221: HMM and Particle Filters
CS221: HMM and Particle FiltersCS221: HMM and Particle Filters
CS221: HMM and Particle Filterszukun
 
ECCV2010: feature learning for image classification, part 3
ECCV2010: feature learning for image classification, part 3ECCV2010: feature learning for image classification, part 3
ECCV2010: feature learning for image classification, part 3zukun
 
Mit6870 orsu lecture12
Mit6870 orsu lecture12Mit6870 orsu lecture12
Mit6870 orsu lecture12zukun
 
3D visualization with VTVK and MayaVi2
3D visualization with VTVK and MayaVi23D visualization with VTVK and MayaVi2
3D visualization with VTVK and MayaVi2zukun
 
Formation sds
Formation sdsFormation sds
Formation sdssolarlog
 
Principal component analysis and matrix factorizations for learning (part 1) ...
Principal component analysis and matrix factorizations for learning (part 1) ...Principal component analysis and matrix factorizations for learning (part 1) ...
Principal component analysis and matrix factorizations for learning (part 1) ...zukun
 
Histogram of oriented gradients for human detection
Histogram of oriented gradients for human detectionHistogram of oriented gradients for human detection
Histogram of oriented gradients for human detectionzukun
 
05 history of cv a machine learning (theory) perspective on computer vision
05  history of cv a machine learning (theory) perspective on computer vision05  history of cv a machine learning (theory) perspective on computer vision
05 history of cv a machine learning (theory) perspective on computer visionzukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
Matching with Invariant Features
Matching with Invariant FeaturesMatching with Invariant Features
Matching with Invariant Featureszukun
 
Cvpr2007 object category recognition p1 - bag of words models
Cvpr2007 object category recognition   p1 - bag of words modelsCvpr2007 object category recognition   p1 - bag of words models
Cvpr2007 object category recognition p1 - bag of words modelszukun
 

En vedette (20)

05 structured prediction and energy minimization part 2
05 structured prediction and energy minimization part 205 structured prediction and energy minimization part 2
05 structured prediction and energy minimization part 2
 
Describing People: A Poselet-based approach to attribute classification
Describing People: A Poselet-based approach to attribute classificationDescribing People: A Poselet-based approach to attribute classification
Describing People: A Poselet-based approach to attribute classification
 
Cvpr2007 object category recognition p2 - part based models
Cvpr2007 object category recognition   p2 - part based modelsCvpr2007 object category recognition   p2 - part based models
Cvpr2007 object category recognition p2 - part based models
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Fcv appli science_fergus
Fcv appli science_fergusFcv appli science_fergus
Fcv appli science_fergus
 
Fcv taxo zisserman
Fcv taxo zissermanFcv taxo zisserman
Fcv taxo zisserman
 
ECCV2010 tutorial: statisitcal and structural recognition of human actions pa...
ECCV2010 tutorial: statisitcal and structural recognition of human actions pa...ECCV2010 tutorial: statisitcal and structural recognition of human actions pa...
ECCV2010 tutorial: statisitcal and structural recognition of human actions pa...
 
Fcv scene hebert
Fcv scene hebertFcv scene hebert
Fcv scene hebert
 
CS221: HMM and Particle Filters
CS221: HMM and Particle FiltersCS221: HMM and Particle Filters
CS221: HMM and Particle Filters
 
ECCV2010: feature learning for image classification, part 3
ECCV2010: feature learning for image classification, part 3ECCV2010: feature learning for image classification, part 3
ECCV2010: feature learning for image classification, part 3
 
Mit6870 orsu lecture12
Mit6870 orsu lecture12Mit6870 orsu lecture12
Mit6870 orsu lecture12
 
3D visualization with VTVK and MayaVi2
3D visualization with VTVK and MayaVi23D visualization with VTVK and MayaVi2
3D visualization with VTVK and MayaVi2
 
Formation sds
Formation sdsFormation sds
Formation sds
 
Principal component analysis and matrix factorizations for learning (part 1) ...
Principal component analysis and matrix factorizations for learning (part 1) ...Principal component analysis and matrix factorizations for learning (part 1) ...
Principal component analysis and matrix factorizations for learning (part 1) ...
 
Histogram of oriented gradients for human detection
Histogram of oriented gradients for human detectionHistogram of oriented gradients for human detection
Histogram of oriented gradients for human detection
 
05 history of cv a machine learning (theory) perspective on computer vision
05  history of cv a machine learning (theory) perspective on computer vision05  history of cv a machine learning (theory) perspective on computer vision
05 history of cv a machine learning (theory) perspective on computer vision
 
Portfolio Ppm
Portfolio PpmPortfolio Ppm
Portfolio Ppm
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
Matching with Invariant Features
Matching with Invariant FeaturesMatching with Invariant Features
Matching with Invariant Features
 
Cvpr2007 object category recognition p1 - bag of words models
Cvpr2007 object category recognition   p1 - bag of words modelsCvpr2007 object category recognition   p1 - bag of words models
Cvpr2007 object category recognition p1 - bag of words models
 

Similaire à Fcv bio cv_poggio

Sciences cognitives et applications
Sciences cognitives et applicationsSciences cognitives et applications
Sciences cognitives et applicationselena.pasquinelli
 
Evidence for a Dynamo in the Main Group Pallasite Parent Body
Evidence for a Dynamo in the Main Group Pallasite Parent BodyEvidence for a Dynamo in the Main Group Pallasite Parent Body
Evidence for a Dynamo in the Main Group Pallasite Parent BodyCarlos Bella
 
Media and Bodies - Reconciving technological determinism
Media and Bodies -  Reconciving technological determinismMedia and Bodies -  Reconciving technological determinism
Media and Bodies - Reconciving technological determinismFrancesco Parisi
 
Destruction of lovejoy_comet
Destruction of lovejoy_cometDestruction of lovejoy_comet
Destruction of lovejoy_cometSérgio Sacani
 
Pedagogical patterns for learning programming by mistakes (presentation) (1)
Pedagogical patterns for learning programming by mistakes (presentation) (1)Pedagogical patterns for learning programming by mistakes (presentation) (1)
Pedagogical patterns for learning programming by mistakes (presentation) (1)Ljubomir Jerinic
 
Ultrasonic Histology
Ultrasonic HistologyUltrasonic Histology
Ultrasonic HistologyDebdoot Sheet
 
tpcv-current-03-18-0..
tpcv-current-03-18-0..tpcv-current-03-18-0..
tpcv-current-03-18-0..butest
 
Can Generative Adversarial Networks Model Imaging Physics?
Can Generative Adversarial Networks Model Imaging Physics?Can Generative Adversarial Networks Model Imaging Physics?
Can Generative Adversarial Networks Model Imaging Physics?Debdoot Sheet
 
From neuronal information flow to connectomics
From neuronal information flow to connectomicsFrom neuronal information flow to connectomics
From neuronal information flow to connectomicsmasanori shimono
 
Artificial Intelligence A Clarification of Misconceptions, Myths and Desired ...
Artificial Intelligence A Clarification of Misconceptions, Myths and Desired ...Artificial Intelligence A Clarification of Misconceptions, Myths and Desired ...
Artificial Intelligence A Clarification of Misconceptions, Myths and Desired ...Christiane Moisés
 
Science 2011-fumagalli-1245-9
Science 2011-fumagalli-1245-9Science 2011-fumagalli-1245-9
Science 2011-fumagalli-1245-9Sérgio Sacani
 
Topography of northern_hemisphere_of_mercury_from_messenger_altimeter
Topography of northern_hemisphere_of_mercury_from_messenger_altimeterTopography of northern_hemisphere_of_mercury_from_messenger_altimeter
Topography of northern_hemisphere_of_mercury_from_messenger_altimeterSérgio Sacani
 
G:\Mahdi\Research Methods In Entrepreneurship
G:\Mahdi\Research Methods In EntrepreneurshipG:\Mahdi\Research Methods In Entrepreneurship
G:\Mahdi\Research Methods In Entrepreneurshipmahdi kazemi
 
DROP - Lorenzo Cordella Thesis Project
DROP - Lorenzo Cordella Thesis Project  DROP - Lorenzo Cordella Thesis Project
DROP - Lorenzo Cordella Thesis Project Lorenzo Cordella
 
Toward a theory of chaos
Toward a theory of chaosToward a theory of chaos
Toward a theory of chaosSergio Zaina
 
Erhan_Oztop_CV2011-short
Erhan_Oztop_CV2011-shortErhan_Oztop_CV2011-short
Erhan_Oztop_CV2011-shortmuratkrty
 
Machine learning for Science and Society
Machine learning for Science and SocietyMachine learning for Science and Society
Machine learning for Science and SocietyDansk BiblioteksCenter
 
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdfpierstanislaopaolucc1
 

Similaire à Fcv bio cv_poggio (20)

Sciences cognitives et applications
Sciences cognitives et applicationsSciences cognitives et applications
Sciences cognitives et applications
 
Evidence for a Dynamo in the Main Group Pallasite Parent Body
Evidence for a Dynamo in the Main Group Pallasite Parent BodyEvidence for a Dynamo in the Main Group Pallasite Parent Body
Evidence for a Dynamo in the Main Group Pallasite Parent Body
 
Dli milano rl_parton_sep
Dli milano rl_parton_sepDli milano rl_parton_sep
Dli milano rl_parton_sep
 
Media and Bodies - Reconciving technological determinism
Media and Bodies -  Reconciving technological determinismMedia and Bodies -  Reconciving technological determinism
Media and Bodies - Reconciving technological determinism
 
Destruction of lovejoy_comet
Destruction of lovejoy_cometDestruction of lovejoy_comet
Destruction of lovejoy_comet
 
Pedagogical patterns for learning programming by mistakes (presentation) (1)
Pedagogical patterns for learning programming by mistakes (presentation) (1)Pedagogical patterns for learning programming by mistakes (presentation) (1)
Pedagogical patterns for learning programming by mistakes (presentation) (1)
 
Ultrasonic Histology
Ultrasonic HistologyUltrasonic Histology
Ultrasonic Histology
 
tpcv-current-03-18-0..
tpcv-current-03-18-0..tpcv-current-03-18-0..
tpcv-current-03-18-0..
 
Can Generative Adversarial Networks Model Imaging Physics?
Can Generative Adversarial Networks Model Imaging Physics?Can Generative Adversarial Networks Model Imaging Physics?
Can Generative Adversarial Networks Model Imaging Physics?
 
From neuronal information flow to connectomics
From neuronal information flow to connectomicsFrom neuronal information flow to connectomics
From neuronal information flow to connectomics
 
Artificial Intelligence A Clarification of Misconceptions, Myths and Desired ...
Artificial Intelligence A Clarification of Misconceptions, Myths and Desired ...Artificial Intelligence A Clarification of Misconceptions, Myths and Desired ...
Artificial Intelligence A Clarification of Misconceptions, Myths and Desired ...
 
Science 2011-fumagalli-1245-9
Science 2011-fumagalli-1245-9Science 2011-fumagalli-1245-9
Science 2011-fumagalli-1245-9
 
Empathy for the Avatar
Empathy for the AvatarEmpathy for the Avatar
Empathy for the Avatar
 
Topography of northern_hemisphere_of_mercury_from_messenger_altimeter
Topography of northern_hemisphere_of_mercury_from_messenger_altimeterTopography of northern_hemisphere_of_mercury_from_messenger_altimeter
Topography of northern_hemisphere_of_mercury_from_messenger_altimeter
 
G:\Mahdi\Research Methods In Entrepreneurship
G:\Mahdi\Research Methods In EntrepreneurshipG:\Mahdi\Research Methods In Entrepreneurship
G:\Mahdi\Research Methods In Entrepreneurship
 
DROP - Lorenzo Cordella Thesis Project
DROP - Lorenzo Cordella Thesis Project  DROP - Lorenzo Cordella Thesis Project
DROP - Lorenzo Cordella Thesis Project
 
Toward a theory of chaos
Toward a theory of chaosToward a theory of chaos
Toward a theory of chaos
 
Erhan_Oztop_CV2011-short
Erhan_Oztop_CV2011-shortErhan_Oztop_CV2011-short
Erhan_Oztop_CV2011-short
 
Machine learning for Science and Society
Machine learning for Science and SocietyMachine learning for Science and Society
Machine learning for Science and Society
 
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
2023-1113e-INFN-Seminari-Paolucci-BioInspiredSpikingLearningSleepCycles.pdf
 

Fcv bio cv_poggio

  • 1. The computational magic of the ventral stream. Tomaso Poggio, McGovern Institute, I2, CBCL, BCS, CSAIL, MIT (with Jim Mutch, Joel Leibo, Lorenzo Rosasco). A dream: I have a theory of what the ventral stream does and how; it can explain why evolution chose a hierarchical architecture and how computations determine properties of cells specific for each visual area. Very preliminary, provocative, probably completely wrong theory, but if true... what else do we want from computational neuroscience? Monday, August 29, 2011
  • 2. Machine Learning + Vision @CBCL Poggio, T. and F. Girosi. Networks for Approximation and Learning, Proceedings of the IEEE, 1990 (also Science, 1990) LEARNING THEORY + Mathematics Poggio, T. and S. Smale. , Notices American Mathematical Society (AMS), 2003 ALGORITHMS Poggio, T., R. Rifkin, S. Mukherjee and P. Niyogi. General Conditions for Predictivity in Learning Theory, Nature, 2004 Beymer, D. and T. Poggio. , Science, 272, 1905-1909, 1996 Brunelli, R. and T. Poggio. Face Recognition: Engineering Features Versus Templates, IEEE PAMI, 1993 ENGINEERING APPLICATIONS Ezzat, T., G. Geiger and T. Poggio. “Trainable Videorealistic Speech Animation,” ACM SIGGRAPH 2002 Freedman, D.J., M. Riesenhuber, T. Poggio and E.K. Miller. , Science, 291, 312-316, 2001. Riesenhuber, M. and T. Poggio. COMPUTATIONAL NEUROSCIENCE: models+experiments Science Neuroscience, 2, 1019-1025, 1999. Serre, T., A. Oliva and T. Poggio. , Nature , (PNAS), Vol. 104, No. 15, 6424-6429, 2007. Poggio, T. and E. Bizzi. , Nature, Vol. 431, 768-774, 2004.
  • 3. Vision: what is where. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. David Marr. Foreword by Shimon Ullman, Afterword by Tomaso Poggio. David Marr's posthumously published Vision (1982) influenced a generation of brain and cognitive scientists, inspiring many to enter the field. In Vision, Marr describes a general framework for understanding visual perception and touches on broader questions about how the brain and its functions can be studied and understood. Researchers from a range of brain and cognitive sciences have long valued Marr's creativity, intellectual power, and ability to integrate insights and data from neuroscience, psychology, and computation. This MIT Press edition makes Marr's influential work available to a new generation of students and scientists. In Marr's framework, the process of vision constructs a set of representations, starting from a description of the input image and culminating with a description of three-dimensional objects in the surrounding environment. A central theme, and one that has had far-reaching influence in both neuroscience and cognitive science, is the notion of different levels of analysis: in Marr's framework, the computational level, the algorithmic level, and the hardware implementation level. Now, thirty years later, the main problems that occupied Marr remain fundamental open problems in the study of perception. Vision provides inspiration for the continui
  • 4. The ventral stream (figure credits: Movshon et al.; Desimone & Ungerleider 1989)
  • 5. Recognition in the Ventral Stream: classical, “standard” model (of immediate recognition). *Modified from (Gross, 1998). [software available online with CNS (for GPUs)] Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007
  • 6. Model “works”: it accounts for bits of physiology + psychophysics. Hierarchical feedforward models are consistent with or predict neural data. V1: Simple and complex cells tuning (Schiller et al 1976; Hubel & Wiesel 1965; De Valois et al 1982); MAX-like operation in subset of complex cells (Lampl et al 2004). V2: Subunits and their tuning (Anzai, Peng, Van Essen 2007). V4: Tuning for two-bar stimuli (Reynolds Chelazzi & Desimone 1999); MAX-like operation (Gawne et al 2002); Two-spot interaction (Freiwald et al 2005); Tuning for boundary conformation (Pasupathy & Connor 2001, Cadieu, Kouh, Connor et al., 2007); Tuning for Cartesian and non-Cartesian gratings (Gallant et al 1996). IT: Tuning and invariance properties (Logothetis et al 1995, paperclip objects); Differential role of IT and PFC in categorization (Freedman et al 2001, 2002, 2003); Read out results (Hung Kreiman Poggio & DiCarlo 2005); Pseudo-average effect in IT (Zoccolan Cox & DiCarlo 2005; Zoccolan Kouh Poggio & DiCarlo 2007). Human: Rapid categorization (Serre Oliva Poggio 2007); Face processing (fMRI + psychophysics) (Riesenhuber et al 2004; Jiang et al 2006).
  • 7. Model “works”: it performs well at computational level. Models of the ventral stream in cortex perform well compared to engineered computer vision systems (in 2006) on several databases. Bileschi, Wolf, Serre, Poggio, 2007
  • 10. Recognition in Visual Cortex: computation and mathematical theory. For 10+ years... I did not manage to understand how the model works.... we need a theory -- not only a model!
  • 11. Why do hierarchical architectures work? If features do not matter....what does?
  • 13. What is the main difficulty of object recognition? Geometric transformations or intraclass variability? • We have “factored out” viewpoint, scale, translation: is it now still “difficult” for a classifier to categorize dogs vs horses? Leibo, 2011
  • 14. Are transformations the main difficulty for biological object recognition? Leibo, 2011
  • 15. 6 main ideas, steps in the theory 1. The computational goal of the ventral stream is to learn transformations (affine in R^2) during development and be invariant to them (McCulloch+Pitts 1945, Hoffman 1966, Jack Gallant, 1993,...) 2. A memory-based module, similar to simple-complex cells, can learn from a video of an image transforming (e.g. translating) to provide a signature to any single image of any object that is invariant under the transformation (Invariance Lemma) 3. Since the group Aff(2,R) can be factorized as a semidirect product of translations and GL(2, R), storage requirements of the memory-based module can be reduced by orders of magnitude (Factorization Theorem). I argue that this is the evolutionary reason for hierarchical architecture in the ventral stream.
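Step 2 above can be illustrated with a toy sketch. It is not the construction from the talk: it assumes 1-D patterns, cyclic translations only, integer pixel values (so equality is exact), and max pooling as the "complex cell" stage. The helper names `shift`, `templatebook` and `signature` are hypothetical.

```python
import random

def shift(v, k):
    """Cyclically translate a 1-D pattern by k positions."""
    k %= len(v)
    return v[-k:] + v[:-k] if k else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def templatebook(template):
    """All translated views of one template: the stored frames of the 'video'."""
    return [shift(template, k) for k in range(len(template))]

def signature(image, books):
    """One number per template: simple-cell-like dot products with every
    stored frame, pooled by a complex-cell-like max (an assumed pooling)."""
    return [max(dot(image, frame) for frame in book) for book in books]

random.seed(0)
n = 16
# Hypothetical templates and test image with small integer "pixel" values.
templates = [[random.randint(-5, 5) for _ in range(n)] for _ in range(4)]
books = [templatebook(t) for t in templates]

image = [random.randint(-5, 5) for _ in range(n)]
moved = shift(image, 5)

# Shifting the image only permutes the dot products inside each templatebook,
# so the pooled values are unchanged.
print(signature(image, books) == signature(moved, books))  # True
```

The signature is invariant for an image that was never seen during "development": only the templates had to be observed under translation, which is the point of the memory-based module in this toy setting.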
  • 16. 6 main ideas, steps in the theory 4. Evolution selects translations by using small apertures in the first layer (the Stratification Theorem). Size of receptive fields in the sequence of ventral stream areas automatically selects specific transformations (translations, scaling, shear and rotations). 5. If a Hebb-like rule is active at synapses then the storage requirements can be reduced further: tuning of cells in each layer -- during development -- will mimic the spectrum (SVD) of stored templates (Linking Theorem). 6. At the top layers invariances are learned for “places” and for class-specific transformations such as changes of expression of a face or rotation in depth of a face or different poses of a body ===> class-specific patches
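Step 5 can be made concrete with one standard normalized Hebbian rule, Oja's rule, whose weight vector tends toward a top principal direction of its inputs. The slides do not say which Hebb-like rule is meant, so Oja's rule is an assumption here, as are the toy 1-D template, the learning rate and the function names. Power iteration is used only to check the result.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shift(v, k):
    k %= len(v)
    return v[-k:] + v[:-k] if k else list(v)

n = 8
# Hypothetical template: strong component at frequency 1, weak one at frequency 2.
template = [math.cos(2 * math.pi * i / n) + 0.3 * math.cos(4 * math.pi * i / n)
            for i in range(n)]
frames = [shift(template, k) for k in range(n)]  # stored translated views

def oja(frames, eta=0.01, epochs=300):
    """Oja's rule: w += eta * y * (x - y * w), with y = w . x."""
    w = [0.5] + [0.0] * (len(frames[0]) - 1)
    for _ in range(epochs):
        for x in frames:
            y = dot(w, x)
            w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

def rayleigh(w, frames):
    """w^T C w / w^T w, with C the mean outer product of the frames."""
    return sum(dot(w, x) ** 2 for x in frames) / len(frames) / dot(w, w)

def top_eigenvalue(frames, iters=200):
    """Power iteration on C, used only as a reference value."""
    v = [1.0] + [0.0] * (len(frames[0]) - 1)
    for _ in range(iters):
        cv = [sum(dot(v, x) * x[i] for x in frames) / len(frames)
              for i in range(len(v))]
        norm = math.sqrt(dot(cv, cv))
        v = [c / norm for c in cv]
    return rayleigh(v, frames)

w = oja(frames)
# The learned weights should capture (almost) the largest variance direction.
print(round(rayleigh(w, frames), 3), round(top_eigenvalue(frames), 3))
```

In this toy case the top eigenspace is two-dimensional (a cosine and a sine of the same frequency), so the learned vector is some sinusoid of that frequency with an arbitrary phase rather than one unique "cell tuning".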
  • 18. SVD (of templatebook) for (x) translations from natural images. Jim Mutch
  • 19. SVD (of templatebook) for (y) translations. noise. Jim Mutch
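A rough way to see what the SVD of a translation templatebook looks like, under a strong simplification: if the stored frames are all cyclic shifts of a 1-D pattern, the matrix T^T T is circulant and symmetric, so sampled cosines are eigenvectors of it (that is, right singular vectors of T). The random pattern below stands in for an image row and is not data from the talk; real patches translating through a finite aperture are not exactly cyclic, so this is only an idealization.

```python
import math
import random

def shift(v, k):
    k %= len(v)
    return v[-k:] + v[:-k] if k else list(v)

def gram(book):
    """G = T^T T for a templatebook T whose rows are the stored frames."""
    n = len(book[0])
    return [[sum(row[i] * row[j] for row in book) for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

random.seed(1)
n = 12
template = [random.uniform(-1.0, 1.0) for _ in range(n)]  # hypothetical pattern
book = [shift(template, k) for k in range(n)]
g = gram(book)

def residual(freq):
    """Largest entry of G v - lambda v for v = cos(2*pi*freq*i/n)."""
    v = [math.cos(2 * math.pi * freq * i / n) for i in range(n)]
    gv = matvec(g, v)
    lam = sum(a * b for a, b in zip(gv, v)) / sum(b * b for b in v)
    return max(abs(a - lam * b) for a, b in zip(gv, v))

# Residuals at floating-point noise level mean every sampled cosine is an
# eigenvector of G, whatever the template was.
print(max(residual(f) for f in range(n // 2 + 1)))
```

Because the cosine and sine of one frequency share an eigenvalue, a library SVD may return phase-shifted mixtures of the two, so plotted singular vectors need not look like pure cosines.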
  • 20. These are predicted features orthogonal to trajectories of the Lie group: for translation (x, y), expansion, rotation. V1, V2/V4? This is from Hoffman, 1966, cited by Jack Gallant
  • 21. Implications • Image priors, shape features, natural image priors are “irrelevant” • The types of transformations that are learned from visual experience depend on the size (measured in terms of wavelength) and thus on the area (layer in the models) -- assuming that the aperture size increases with layers • The mix of transformations learned determines the properties of the receptive fields -- oriented bars in V1+V2, radial and spiral patterns in V4 up to class specific tuning in AIT (e.g. face tuned cells)
  • 22. Some predictions • Invariance to small transformations in early areas (e.g. translations in V1) may underlie stability of visual perception (suggested by Stu Geman) • Each cell's tuning properties are shaped by visual experience of image transformations during developmental and adult plasticity • Simple cells are likely to be the same population as complex cells, arising from different convergence of the Hebbian learning rule. The inputs to “complex” cells are dendritic branches with simple cell properties • Class-specific transformations are learned and represented at the top of the ventral stream hierarchy; thus class-specific modules -- such as faces, places and possibly body areas -- should exist in IT • The types of transformations that are learned from visual experience depend on the size of the receptive fields and thus on the area (layer in the models) -- assuming that the size increases with layers • The mix of transformations learned in each area influences the tuning properties of the cells -- oriented bars in V1+V2, radial and spiral patterns in V4 up to class specific tuning in AIT (e.g. face tuned cells).
  • 23. Collaborators in recent work: J. Leibo, J. Mutch, L. Isik, S. Ullman, S. Smale, L. Rosasco, H. Jhuang, C. Tan, N. Edelman, E. Meyers, B. Desimone, O. Lewis, S. Voinea. Also: M. Riesenhuber, T. Serre, S. Chikkerur, A. Wibisono, J. Bouvrie, M. Kouh, J. DiCarlo, E. Miller, C. Cadieu, A. Oliva, C. Koch, A. Caponnetto, D. Walther, U. Knoblich, T. Masquelier, S. Bileschi, L. Wolf, E. Connor, D. Ferster, I. Lampl, S. Chikkerur, G. Kreiman, N. Logothetis. Jim Mutch