Fcv learn fergus


  1. 1. The Role of Learning in Vision
     Schedule:
     • 3.30pm: Rob Fergus
     • 3.40pm: Andrew Ng
     • 3.50pm: Kai Yu
     • 4.00pm: Yann LeCun
     • 4.10pm: Alan Yuille
     • 4.20pm: Deva Ramanan
     • 4.30pm: Erik Learned-Miller
     • 4.40pm: Erik Sudderth
     • 4.50pm: Spotlights - Qiang Ji, M-H Yang
     • 4.55pm: Discussion
     • 5.30pm: End
     Topic labels on the slide: Feature / Deep Learning; Compositional Models; Learning Representations; Overview; Low-level Representations; Learning on the fly
  2. 2. An Overview of Hierarchical Feature Learning and Relations to Other Models
     Rob Fergus, Dept. of Computer Science, Courant Institute, New York University
  3. 3. Motivation
     • Multitude of hand-designed features currently in use
       • SIFT, HOG, LBP, MSER, Color-SIFT, ...
     • Maybe some way of learning the features?
     • Also, these features capture only low-level edge gradients
     References on the slide: Felzenszwalb, Girshick, McAllester and Ramanan, PAMI 2007; Yan & Huang (winner of the PASCAL 2010 classification competition)
  4. 4. Beyond Edges?
     • Mid-level cues
       • "Tokens" from Vision by D. Marr: continuation, parallelism, junctions, corners
     • High-level object parts
     • Difficult to hand-engineer → What about learning them?
  5. 5. Deep/Feature Learning Goal
     • Build hierarchy of feature extractors (≥ 1 layers)
     • All the way from pixels → classifier
     • Homogeneous structure per layer
     • Unsupervised training
     Diagram: Image/Video Pixels → Layer 1 → Layer 2 → Layer 3 → Simple Classifier
     • Numerous approaches:
       • Restricted Boltzmann Machines (Hinton, Ng, Bengio, ...)
       • Sparse coding (Yu, Fergus, LeCun)
       • Auto-encoders (LeCun, Bengio)
       • ICA variants (Ng, Cottrell)
       • & many more
  6. 6. Single Layer Architecture
     Input: Image Pixels / Features → Filter → Normalize → Pool → Output: Features / Classifier
     • Details in the boxes matter (especially in a hierarchy)
     • Links to neuroscience
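The filter, normalize, pool sequence on this slide can be sketched in a few lines. This is a minimal 1-D illustration and not any particular model from the talk: the function names, the absolute-value non-linearity, the L2 normalization, and the non-overlapping max pooling are all assumptions chosen for brevity.

```python
def filter_1d(signal, kernel):
    """Valid-mode correlation of a 1-D signal with a kernel."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]

def rectify_and_normalize(responses, eps=1e-8):
    """Absolute-value non-linearity, then division by the L2 norm."""
    rect = [abs(r) for r in responses]
    norm = sum(r * r for r in rect) ** 0.5
    return [r / (norm + eps) for r in rect]

def max_pool(responses, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(responses[i:i + size])
            for i in range(0, len(responses) - size + 1, size)]

def single_layer(signal, kernels, pool_size=2):
    """Filter -> normalize -> pool, producing one feature map per kernel."""
    return [max_pool(rectify_and_normalize(filter_1d(signal, k)), pool_size)
            for k in kernels]

# Hypothetical input and filters: an edge detector and a local average.
features = single_layer([0, 0, 1, 1, 0, 0, 1, 1, 0], kernels=[[-1, 1], [1, 1]])
```

Because the output is again a list of feature maps, the same function could in principle be applied to its own output, which is the sense in which the layer structure is homogeneous.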
  7. 7. Example Feature Learning Architectures
     Input: Pixels / Features
     • Filter: filter with dictionary (patch/tiled/convolutional) + non-linearity
     • Normalize: normalization between feature responses
       • Local Contrast Normalization (Subtractive / Divisive)
       • (Group) Sparsity
       • Max / Softmax
     • Pool: Spatial/Feature (Sum or Max)
     Output: Features
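As one concrete reading of "Local Contrast Normalization (Subtractive / Divisive)", the sketch below removes a local mean and then divides by a local standard deviation over a sliding 1-D window. The window radius, the `eps` floor, and the function name are assumptions; published variants typically use weighted (for example Gaussian) windows over 2-D feature maps.

```python
def local_contrast_normalize(x, radius=1, eps=1e-4):
    """Subtractive then divisive normalization over a sliding 1-D window."""
    n = len(x)

    def window(values, i):
        # Neighbourhood of position i, clipped at the signal borders.
        return values[max(0, i - radius):min(n, i + radius + 1)]

    # Subtractive step: remove the local mean at every position.
    centered = []
    for i in range(n):
        w = window(x, i)
        centered.append(x[i] - sum(w) / len(w))

    # Divisive step: divide by the local root-mean-square of the centered
    # values, floored by eps so flat regions do not blow up.
    out = []
    for i in range(n):
        w = window(centered, i)
        std = (sum(v * v for v in w) / len(w)) ** 0.5
        out.append(centered[i] / max(std, eps))
    return out
```

A flat input maps to all zeros, and rescaling a non-flat input leaves the output essentially unchanged, which is the contrast invariance this step is meant to provide.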
  8. 8. SIFT Descriptor
     Image Pixels → Apply Gabor filters → Spatial pool (Sum) → Normalize to unit length → Feature Vector
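The slide casts SIFT as filter, sum-pool, normalize. The sketch below follows that outline in simplified form and is not the full SIFT algorithm: central-difference gradients with hard orientation binning stand in for the oriented filters, and the cell and bin counts are assumed defaults. Gaussian weighting, interpolation between bins, and clipping are omitted.

```python
import math

def sift_like_descriptor(patch, cells=2, bins=4):
    """Orientation-binned gradient magnitudes, sum-pooled per spatial cell,
    then scaled to unit length. `patch` is a list of equal-length rows."""
    h, w = len(patch), len(patch[0])
    hist = [0.0] * (cells * cells * bins)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Filter stage: central differences stand in for oriented filters.
            gx = patch[y][x + 1] - patch[y][x - 1]
            gy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            # Pool stage: sum magnitudes inside each spatial cell.
            cy, cx = y * cells // h, x * cells // w
            hist[(cy * cells + cx) * bins + b] += mag
    # Normalize stage: scale the descriptor to unit length.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

# Hypothetical 4x4 patch containing a vertical edge.
descriptor = sift_like_descriptor([[0, 0, 1, 1] for _ in range(4)])
```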
  9. 9. Spatial Pyramid Matching
     Lazebnik, Schmid, Ponce [CVPR 2006]
     Boxes on the slide: SIFT Features; Filter with Visual Words; Multi-scale spatial pool (Sum); Max; Classifier
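A rough sketch of the spatial pyramid stages, assuming that "Max" refers to hard assignment of each descriptor to its best-matching visual word. The vocabulary, the coordinate convention (positions in [0, 1)), and the function names are illustrative; the per-level weights and the histogram-intersection kernel of the published method are left out.

```python
def nearest_word(descriptor, vocabulary):
    """Hard assignment: index of the closest visual word (squared distance)."""
    dists = [sum((d - w) ** 2 for d, w in zip(descriptor, word))
             for word in vocabulary]
    return dists.index(min(dists))

def spatial_pyramid(features, vocabulary, levels=2):
    """features: list of (x, y, descriptor) with x, y in [0, 1).
    Returns concatenated per-cell word counts for grids 1x1, 2x2, ..."""
    k = len(vocabulary)
    pyramid = []
    for level in range(levels):
        g = 2 ** level                      # grid is g x g at this level
        hist = [0] * (g * g * k)
        for x, y, desc in features:
            cell = int(y * g) * g + int(x * g)
            # Sum pooling: count word occurrences inside each cell.
            hist[cell * k + nearest_word(desc, vocabulary)] += 1
        pyramid.extend(hist)
    return pyramid

# Hypothetical two-word vocabulary and three 1-D descriptors.
vocab = [[0.0], [1.0]]
feats = [(0.1, 0.1, [0.1]), (0.9, 0.9, [0.8]), (0.6, 0.2, [0.9])]
pyramid = spatial_pyramid(feats, vocab)
```

The resulting vector would then be passed to the classifier stage.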
  10. 10. Role of Normalization
     • Lots of different mechanisms (max, sparsity, LCN, etc.)
     • All induce local competition between features to explain input
       • "Explaining away"
       • Just like top-down models
       • But more local mechanism
     Example: Convolutional Sparse Coding
     Figure labels: Filters; Convolution; |.|_1 (repeated four times)
     Zeiler et al. [CVPR'10/ICCV'11], Kavukcuoglu et al. [NIPS'10], Yang et al. [CVPR'10]
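The "explaining away" effect can be seen in a tiny, non-convolutional sparse coding problem. The sketch below uses ISTA (iterative soft-thresholding), a standard solver for L1-penalized least squares; it is not claimed to be the solver used in the cited papers. The two-atom dictionary, `lam`, and `step` are made-up values, and the step size is only valid for this dictionary.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of t * |v|)."""
    return max(v - t, 0.0) - max(-v - t, 0.0)

def sparse_code(x, atoms, lam=0.1, step=0.5, iters=300):
    """ISTA for min_z 0.5 * ||x - sum_k z[k] * atoms[k]||^2 + lam * ||z||_1.
    step must not exceed 1 / (largest eigenvalue of the atoms' Gram matrix);
    0.5 satisfies that for the example dictionary below."""
    z = [0.0] * len(atoms)
    for _ in range(iters):
        recon = [sum(z[k] * atoms[k][i] for k in range(len(atoms)))
                 for i in range(len(x))]
        resid = [r - xi for r, xi in zip(recon, x)]
        grad = [sum(a * r for a, r in zip(atom, resid)) for atom in atoms]
        z = [soft_threshold(zk - step * gk, step * lam)
             for zk, gk in zip(z, grad)]
    return z

# Two correlated unit-norm atoms; the input equals the first atom.
atoms = [[1.0, 0.0], [0.8, 0.6]]
x = [1.0, 0.0]
linear = [sum(a * b for a, b in zip(atom, x)) for atom in atoms]
codes = sparse_code(x, atoms)
```

Plain filtering (`linear`) responds strongly to both atoms, while the sparse code keeps only the first: the first atom explains the input, so the second is suppressed.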
  11. 11. Role of Pooling
     • Spatial pooling
       • Invariance to small transformations
     • Pooling across feature groups
       • Gives AND/OR type behavior
       • Compositional models of Zhu, Yuille
       • Larger receptive fields
     • Pooling with latent variables (& springs)
       • Pictorial structures models
     References on the slide: Chen, Zhu, Lin, Yuille, Zhang [NIPS 2007]; Zeiler, Taylor, Fergus [ICCV 2011]; Felzenszwalb, Girshick, McAllester, Ramanan [PAMI 2009]
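Two of the pooling roles listed here are easy to illustrate. In this small sketch, with illustrative function names and toy inputs, spatial max pooling gives the same output for an input shifted by one position, provided the shift stays inside a pooling window, and a max across a group of feature maps behaves like an OR over the group.

```python
def max_pool(responses, size):
    """Non-overlapping spatial max pooling over a 1-D feature map."""
    return [max(responses[i:i + size])
            for i in range(0, len(responses) - size + 1, size)]

def pool_across_features(feature_maps):
    """Max over a group of feature maps at each position (OR-like)."""
    return [max(values) for values in zip(*feature_maps)]

original = [0, 0, 1, 0, 0, 0, 0, 0]
shifted = [0, 0, 0, 1, 0, 0, 0, 0]   # same pattern moved by one position

pooled_original = max_pool(original, 4)
pooled_shifted = max_pool(shifted, 4)
either_feature = pool_across_features([[1, 0, 0], [0, 0, 1]])
```

The invariance is only local: a shift that crosses a window boundary changes the pooled output.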
  12. 13. Object Detection with Discriminatively Trained Part-Based Models
     Felzenszwalb, Girshick, McAllester, Ramanan [PAMI 2009]
     Boxes on the slide: HOG Pyramid; Apply object part filters; Pool part responses (latent variables & springs); Non-max Suppression (Spatial); Score
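Pooling part responses with "latent variables & springs" can be read as follows: each part may shift away from its anchor, which is the latent variable, at a quadratic cost, which is the spring. Below is a 1-D sketch under that reading. The function names, `deform_cost`, and `max_shift` are assumptions, and the real model works on 2-D HOG pyramids with learned deformation parameters.

```python
def part_based_score(root_scores, part_scores, anchors,
                     deform_cost=0.5, max_shift=2):
    """For each root position: root response plus, per part, the best part
    response near its anchor minus a quadratic displacement penalty.
    A part with no valid placement contributes -inf."""
    totals = []
    for p in range(len(root_scores)):
        total = root_scores[p]
        for scores, anchor in zip(part_scores, anchors):
            best = float("-inf")
            for d in range(-max_shift, max_shift + 1):
                q = p + anchor + d          # candidate part location
                if 0 <= q < len(scores):
                    best = max(best, scores[q] - deform_cost * d * d)
            total += best
        totals.append(total)
    return totals

def non_max_suppression(scores, threshold):
    """Keep positions at or above the threshold that are local maxima."""
    keep = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("-inf")
        if s >= threshold and s > left and s >= right:
            keep.append(i)
    return keep

# Hypothetical filter responses: one root filter, one part anchored at +1.
totals = part_based_score([0, 1, 0, 0, 0], [[0, 0, 0, 2, 0]], anchors=[1])
detections = non_max_suppression(totals, threshold=1)
```

In this toy input the part fires one position away from its anchor, so the best root position pays a small spring penalty and still scores highest.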