CVPR12 Tutorial on Deep Learning

Sparse Coding

Kai Yu
yukai@baidu.com
Department of Multimedia, Baidu
Relentless research on visual recognition

Caltech 101    80 Million Tiny Images    PASCAL VOC    ImageNet
The pipeline of machine visual perception

Low-level sensing → Pre-processing → Feature extraction → Feature selection → Inference (prediction, recognition)

Most machine-learning effort goes into the inference stage, yet the feature extraction and selection stages are:
• Most critical for accuracy
• Responsible for most of the computation at test time
• Most time-consuming in the development cycle
• Often hand-crafted in practice
Computer vision features

SIFT    Spin image    HoG    RIFT    GLOH

Slide credit: Andrew Ng
Learning features from data

Low-level sensing → Pre-processing → Feature extraction → Feature selection → Inference (prediction, recognition; machine learning)

Feature Learning: instead of designing features, let's design feature learners.
Learning features from data via sparse coding

Low-level sensing → Pre-processing → Feature extraction → Feature selection → Inference (prediction, recognition)

Sparse coding offers an effective building block to learn useful features.
Outline

1. Sparse coding for image classification
2. Understanding sparse coding
3. Hierarchical sparse coding
4. Other topics: e.g. structured model, scale-up, discriminative training
5. Summary
“BoW representation + SPM” Paradigm - I

Bag-of-visual-words representation (BoW) based on VQ coding

Figure credit: Fei-Fei Li
“BoW representation + SPM” Paradigm - II

Spatial pyramid matching: pooling at different scales and locations

Figure credit: Svetlana Lazebnik
Image Classification using “BoW + SPM”

Dense SIFT → VQ Coding → Spatial Pooling → Classifier
The Architecture of “Coding + Pooling”

Coding → Pooling → Coding → Pooling

• e.g., convolutional neural net, HMAX, BoW, …
“BoW+SPM” has two coding+pooling layers

Local Gradients → Pooling (e.g., SIFT, HOG) → VQ Coding → Average Pooling (obtain histogram) → SVM

The SIFT feature itself follows a coding+pooling operation.
Develop better coding methods

Better Coding → Better Pooling → Better Coding → Better Pooling → Better Classifier

- Coding: a nonlinear mapping of the data into another feature space
- Better coding methods: sparse coding, RBMs, auto-encoders
What is sparse coding

Sparse coding (Olshausen & Field, 1996). Originally developed to explain early visual processing in the brain (edge detection).

Training: given a set of random patches x, learn a dictionary of bases [Φ1, Φ2, …].

Coding: for a data vector x, solve the LASSO to find the sparse coefficient vector a.
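The objective behind both steps (shown as an equation image on the slide, and written here in the standard L1-regularized form) is:

    min over {φk}, {ai} of  Σi [ (1/2) ||xi − Σk ai,k φk||² + λ ||ai||1 ],   with ||φk|| ≤ 1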
Sparse coding: training time

Input: images x1, x2, …, xm (each in Rd)
Learn: a dictionary of bases φ1, φ2, …, φk (also in Rd)

Alternating optimization (a sketch follows below):
1. Fix the dictionary φ1, φ2, …, φk, optimize the activations a (a standard LASSO problem)
2. Fix the activations a, optimize the dictionary φ1, φ2, …, φk (a convex QP problem)
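A minimal Python sketch of this alternating scheme, for illustration only: it uses scikit-learn's Lasso for the coding step and a plain least-squares update with column renormalization in place of the norm-constrained QP for the dictionary step; function and variable names are hypothetical.

    import numpy as np
    from sklearn.linear_model import Lasso

    def train_sparse_coding(X, k=128, lam=0.1, n_iter=20, seed=0):
        """X: (m, d) matrix of patches. Returns dictionary Phi (k, d) and codes A (m, k)."""
        rng = np.random.RandomState(seed)
        m, d = X.shape
        Phi = rng.randn(k, d)
        Phi /= np.linalg.norm(Phi, axis=1, keepdims=True)      # unit-norm bases
        coder = Lasso(alpha=lam, fit_intercept=False, max_iter=1000)
        for _ in range(n_iter):
            # Step 1: fix the dictionary, solve a LASSO for the activations of every patch
            coder.fit(Phi.T, X.T)                              # fits x_i ≈ Phi^T a_i for all i
            A = coder.coef_                                    # (m, k) sparse codes
            # Step 2: fix the activations, update the dictionary by least squares,
            # then renormalize each basis (a simplification of the constrained QP)
            Phi = np.linalg.lstsq(A, X, rcond=None)[0]
            Phi /= np.maximum(np.linalg.norm(Phi, axis=1, keepdims=True), 1e-8)
        return Phi, A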
Sparse coding: testing time

Input: a novel image patch x (in Rd) and the previously learned φi's
Output: representation [ai,1, ai,2, …, ai,k] of image patch xi

xi ≈ 0.8 * [basis patch] + 0.3 * [basis patch] + 0.5 * [basis patch]

Represent xi as: ai = [0, 0, …, 0, 0.8, 0, …, 0, 0.3, 0, …, 0, 0.5, …]
Sparse coding illustration

[Figure: natural image patches and the learned bases (φ1, …, φ64), which look like “edges”]

Test example:
x ≈ 0.8 * φ36 + 0.3 * φ42 + 0.5 * φ63
[a1, …, a64] = [0, 0, …, 0, 0.8, 0, …, 0, 0.3, 0, …, 0, 0.5, 0]   (feature representation)

Compact & easily interpretable
Slide credit: Andrew Ng
Self-taught Learning   [Raina, Lee, Battle, Packer & Ng, ICML 07]

Motorcycles    Not motorcycles

Testing: what is this?

… Unlabeled images

Slide credit: Andrew Ng
Classification Result on Caltech 101

9K images, 101 classes

64%    SIFT VQ + Nonlinear SVM
~50%   Pixel Sparse Coding + Linear SVM
Sparse Coding on SIFT – the ScSPM algorithm   [Yang, Yu, Gong & Huang, CVPR09]

Local Gradients → Pooling (e.g., SIFT, HOG) → Sparse Coding → Max Pooling → Linear Classifier
Sparse Coding on SIFT – the ScSPM algorithm   [Yang, Yu, Gong & Huang, CVPR09]

Caltech-101

64%    SIFT VQ + Nonlinear SVM
73%    SIFT Sparse Coding + Linear SVM (ScSPM)
Summary: Accuracies on Caltech 101

Dense SIFT → VQ Coding → Spatial Pooling → Nonlinear SVM:           64%
Dense SIFT → Sparse Coding → Spatial Max Pooling → Linear SVM:      73%
Pixels → Sparse Coding → Spatial Max Pooling → Linear SVM:          ~50%

Key message:
- Sparse coding is a better building block
- Deep models are preferred
Outline

1. Sparse coding for image classification
2. Understanding sparse coding
3. Hierarchical sparse coding
4. Other topics: e.g. structured model, scale-up, discriminative training
5. Summary
Outline

1. Sparse coding for image classification
2. Understanding sparse coding
   – Connections to RBMs, autoencoders, …
   – Sparse activations vs. sparse models, …
   – Sparsity vs. locality
   – Local sparse coding methods
3. Hierarchical sparse coding
4. Other topics: e.g. structured model, scale-up, discriminative training
5. Summary
Classical sparse coding

- a is sparse
- a is often of higher dimension than x
- the activation a = f(x) is a nonlinear, implicit function of x
- the reconstruction x' = g(a) is linear and explicit

[Diagram: x → f(x) encoding → a;  a → g(a) decoding → x']
RBM & autoencoders

- also involve an activation and a reconstruction
- but have an explicit f(x)
- do not necessarily enforce sparsity on a
- but if sparsity is imposed on a, results often improve
  [e.g. sparse RBM, Lee et al. NIPS08]

[Diagram: x → f(x) encoding → a;  a → g(a) decoding → x']
Sparse coding: A broader view

Any feature mapping from x to a, i.e. a = f(x), where
- a is sparse (and often of higher dimension than x)
- f(x) is nonlinear
- there is a reconstruction x' = g(a), such that x' ≈ x

[Diagram: x → f(x) → a;  a → g(a) → x']

Therefore, sparse RBMs, sparse auto-encoders, and even VQ can be viewed as forms of sparse coding.
Outline

1. Sparse coding for image classification
2. Understanding sparse coding
   – Connections to RBMs, autoencoders, …
   – Sparse activations vs. sparse models, …
   – Sparsity vs. locality
   – Local sparse coding methods
3. Hierarchical sparse coding
4. Other topics: e.g. structured model, scale-up, discriminative training
5. Summary
Sparse activations vs. sparse models

For a general function learning problem a = f(x):

1. Sparse model: f(x)'s parameters are sparse
   - example: LASSO f(x) = <w, x>, where w is sparse
   - the goal is feature selection: all data points select a common subset of features
   - a hot topic in machine learning

2. Sparse activations: f(x)'s outputs are sparse
   - example: sparse coding a = f(x), where a is sparse
   - the goal is feature learning: different data points activate different feature subsets

(A small code sketch contrasting the two follows below.)
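A minimal illustrative sketch of this distinction, assuming scikit-learn is available; the toy data, dictionary, and parameter values are made up for illustration. The first model has one sparse weight vector shared by all inputs (feature selection); the second produces a different sparse code per input (feature learning).

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.RandomState(0)
    X = rng.randn(200, 6)                        # 200 data points, 6 features
    y = 0.2 * X[:, 1] + 0.1 * X[:, 3] + 0.01 * rng.randn(200)

    # 1. Sparse model: one sparse weight vector w shared by all data points
    model = Lasso(alpha=0.05, fit_intercept=False).fit(X, y)
    print("sparse w:", model.coef_)              # most entries are (near) zero

    # 2. Sparse activations: a dictionary Phi, each x gets its own sparse code a
    Phi = rng.randn(12, 6)                       # 12 toy bases of dimension 6
    Phi /= np.linalg.norm(Phi, axis=1, keepdims=True)
    coder = Lasso(alpha=0.05, fit_intercept=False)
    coder.fit(Phi.T, X[:5].T)                    # codes for the first 5 points
    print("sparse codes (one row per point):")
    print(coder.coef_)                           # different points activate different bases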
Example of sparse models

f(x) = <w, x>, where w = [0, 0.2, 0, 0.1, 0, 0]

[Figure: every data point x1, x2, …, xm is represented through the same two coordinates]

• Because the 2nd and 4th elements of w are non-zero, these are the two selected features in x
• Globally-aligned sparse representation
Example of sparse activations (sparse coding)

[Figure: data points x1, x2, x3, …, xm and their sparse codes a1, a2, a3, …, am, each with a different set of non-zero dimensions]

• Different x's have different dimensions activated
• Locally-shared sparse representation: similar x's tend to have similar non-zero dimensions
Example of sparse activations (sparse coding)

[Figure: data points lying on a manifold, with neighboring points sharing activated dimensions]

• Another example: preserving manifold structure
• More informative in highlighting richer data structures, i.e. clusters, manifolds
Outline

1. Sparse coding for image classification
2. Understanding sparse coding
   – Connections to RBMs, autoencoders, …
   – Sparse activations vs. sparse models, …
   – Sparsity vs. locality
   – Local sparse coding methods
3. Hierarchical sparse coding
4. Other topics: e.g. structured model, scale-up, discriminative training
5. Summary
Sparsity vs. Locality

• Intuition: similar data should get similar activated features

• Local sparse coding:
  • data in the same neighborhood tend to have shared activated features;
  • data in different neighborhoods tend to have different features activated.

[Figure: sparse coding vs. local sparse coding]
Sparse coding is not always local: example

Case 1: independent subspaces
• Each basis is a “direction”
• Sparsity: each datum is a linear combination of only several bases.

Case 2: data manifold (or clusters)
• Each basis is an “anchor point”
• Sparsity: each datum is a linear combination of neighboring anchors.
• Sparsity is caused by locality.
Two approaches to local sparse coding

Approach 1: Coding via local anchor points
Approach 2: Coding via local subspaces
Classical sparse coding is empirically local

When it works best for classification, the codes are often found to be local.

It is preferable to let similar data have similar non-zero dimensions in their codes.
MNIST Experiment: Classification using SC

Try different values of the sparsity penalty lambda.

• 60K training images, 10K for test
• Let k = 512
• Linear SVM on the sparse codes

(A sketch of this setup follows below.)
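A rough sketch of such an experiment with scikit-learn, for illustration only: the tutorial's own dictionary-learning code is not shown, and the coder, alpha values, and preprocessing here are assumptions (subsampling the training set is advisable for a quick run).

    import numpy as np
    from sklearn.datasets import fetch_openml
    from sklearn.decomposition import MiniBatchDictionaryLearning
    from sklearn.svm import LinearSVC

    # Load MNIST: 60K train / 10K test, pixels scaled to [0, 1]
    X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
    X = X / 255.0
    X_tr, y_tr, X_te, y_te = X[:60000], y[:60000], X[60000:], y[60000:]

    # Learn a k=512 dictionary and encode images with an L1 (LASSO) coder
    dico = MiniBatchDictionaryLearning(
        n_components=512, alpha=0.05,                  # alpha plays the role of lambda
        transform_algorithm="lasso_lars", transform_alpha=0.05,
        batch_size=256, random_state=0)
    A_tr = dico.fit(X_tr).transform(X_tr)              # sparse codes for training images
    A_te = dico.transform(X_te)

    # Linear SVM on the sparse codes
    clf = LinearSVC(C=1.0).fit(A_tr, y_tr)
    print("test accuracy:", clf.score(A_te, y_te))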
MNIST Experiment: Lambda = 0.0005

Each basis is like a part or direction.
MNIST Experiment: Lambda = 0.005

Again, each basis is like a part or direction.
MNIST Experiment: Lambda = 0.05

Now, each basis is more like a digit!
MNIST Experiment: Lambda = 0.5

Like VQ now!
Geometric view of sparse coding

Error: 4.54%        Error: 3.75%        Error: 2.64%

• When sparse coding achieves the best classification accuracy, the learned bases are like digits – each basis has a clear local class association.
Distribution of coefficients (MNIST)

Neighboring bases tend to get nonzero coefficients.
Distribution of coefficients (SIFT, Caltech101)

A similar observation here!
Outline

1. Sparse coding for image classification
2. Understanding sparse coding
   – Connections to RBMs, autoencoders, …
   – Sparse activations vs. sparse models, …
   – Sparsity vs. locality
   – Local sparse coding methods
3. Hierarchical sparse coding
4. Other topics: e.g. structured model, scale-up, discriminative training
5. Summary
Why develop local sparse coding methods

Since locality is a preferred property in sparse coding, let's explicitly ensure locality.

The new algorithms can be theoretically well justified.

The new algorithms have computational advantages over classical sparse coding.
Two approaches to local sparse coding

Approach 1: Coding via local anchor points – Local coordinate coding
- Nonlinear learning using local coordinate coding. Kai Yu, Tong Zhang, and Yihong Gong. NIPS 2009.
- Locality-constrained linear coding for image classification. Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang. CVPR 2010.

Approach 2: Coding via local subspaces – Super-vector coding
- Image classification using super-vector coding of local image descriptors. Xi Zhou, Kai Yu, Tong Zhang, and Thomas Huang. ECCV 2010.
- Large-scale image classification: fast feature extraction and SVM training. Yuanqing Lin, Fengjun Lv, Shenghuo Zhu, Ming Yang, Timothee Cour, Kai Yu, Liangliang Cao, Thomas Huang. CVPR 2011.
A function approximation framework to understand coding

• Assumption: image patches x follow a nonlinear manifold, and f(x) is smooth on the manifold.

• Coding: a nonlinear mapping x → a; typically, a is high-dimensional & sparse.

• Nonlinear learning: f(x) = <w, a>
Local sparse coding

Approach 1: Local coordinate coding
Function Interpolation based on LCC   [Yu, Zhang, Gong, NIPS 09]

[Figure: locally linear interpolation over data points and bases]
Local Coordinate Coding (LCC): connecting coding to nonlinear function learning

If f(x) is (alpha, beta)-Lipschitz smooth, the function approximation error is bounded by a coding-error term plus a locality term (the form of the bound is sketched below).

The key message: a good coding scheme should
1. have a small coding error,
2. and also be sufficiently local.
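The slide's bound is an equation image that is not reproduced here; from the NIPS'09 LCC paper it has roughly the following form, stated from memory as an illustration of the decomposition rather than an exact transcription (the v's are the dictionary bases / anchor points and γv(x) are the coding coefficients):

    | f(x) − Σv γv(x) f(v) |  ≤  alpha * || x − Σv γv(x) v ||        (coding error)
                               + beta  * Σv |γv(x)| * || v − x ||²    (locality term)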
Local Coordinate Coding (LCC)   [Yu, Zhang & Gong, NIPS 09; Wang, Yang, Yu, Lv, Huang, CVPR 10]

• Dictionary learning: k-means (or hierarchical k-means)

• Coding for x, to obtain its sparse representation a:
  Step 1 – ensure locality: find the K nearest bases
  Step 2 – ensure a low coding error: fit x with those bases only (a small constrained least-squares problem; a sketch follows below)
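A minimal numpy sketch of this two-step coding, following the approximated locality-constrained solution from the CVPR'10 paper as I recall it (a least-squares fit over the K nearest bases with a sum-to-one constraint); the names and the regularization value are illustrative assumptions.

    import numpy as np

    def lcc_encode(x, B, K=5, reg=1e-4):
        """x: (d,) descriptor.  B: (k, d) dictionary (e.g. from k-means).
        Returns a (k,) code that is non-zero only on the K nearest bases."""
        # Step 1 - ensure locality: pick the K nearest bases
        d2 = np.sum((B - x) ** 2, axis=1)
        idx = np.argsort(d2)[:K]
        # Step 2 - ensure low coding error: fit x with those bases,
        # with the coefficients constrained to sum to one
        Z = B[idx] - x                                       # shift selected bases by x
        C = Z @ Z.T + reg * np.trace(Z @ Z.T) * np.eye(K)    # regularized local covariance
        w = np.linalg.solve(C, np.ones(K))
        w = w / w.sum()                                      # enforce the sum-to-one constraint
        code = np.zeros(B.shape[0])
        code[idx] = w
        return code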
Local sparse coding

Approach 2: Super-vector coding
Function approximation via super-vector coding: piecewise local linear (first-order)
[Zhou, Yu, Zhang, and Huang, ECCV 10]

[Figure: local tangent approximation over data points and cluster centers]
Super-vector coding: Justification

If f(x) is beta-Lipschitz smooth (under the condition shown as an equation on the slide), the function approximation error is bounded by a term driven by the quantization error, with the local tangent providing the first-order correction.
Super-Vector Coding (SVC)   [Zhou, Yu, Zhang, and Huang, ECCV 10]

• Dictionary learning: k-means (or hierarchical k-means)

• Coding for x, to obtain its sparse representation a:
  Step 1 – find the nearest basis of x, obtain its VQ coding, e.g. [0, 0, 1, 0, …]
  Step 2 – form the super-vector code, e.g. [0, 0, 1, 0, …, 0, 0, (x − m3), 0, …]
           (zero-order part + local tangent part; a sketch follows below)
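A small numpy sketch of this construction, for illustration: the ECCV'10 paper also scales the zero-order indicator by a constant, which is folded into a hypothetical parameter s here.

    import numpy as np

    def supervector_encode(x, centers, s=1.0):
        """x: (d,) descriptor.  centers: (k, d) k-means centers m_1..m_k.
        Returns a k*(1+d) vector: per center, [s * indicator, indicator * (x - m)]."""
        k, d = centers.shape
        j = np.argmin(np.sum((centers - x) ** 2, axis=1))    # Step 1: nearest center (VQ)
        code = np.zeros(k * (1 + d))
        block = j * (1 + d)
        code[block] = s                                      # zero-order part
        code[block + 1: block + 1 + d] = x - centers[j]      # local tangent part
        return code

    # Usage sketch: pool (e.g. average) the super-vectors of all descriptors in an image,
    # then feed the pooled vector to a linear SVM.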
Results on ImageNet Challenge Dataset

ImageNet Challenge: 1.4 million images, 1000 classes

40%    VQ + Intersection Kernel
62%    LCC + Linear SVM
65%    SVC + Linear SVM
Summary: local sparse coding

Approach 1: Local coordinate coding        Approach 2: Super-vector coding

- Sparsity achieved by explicitly ensuring locality
- Sound theoretical justifications
- Much simpler to implement and compute
- Strong empirical success
Outline

1. Sparse coding for image classification
2. Understanding sparse coding
3. Hierarchical sparse coding
4. Other topics: e.g. structured model, scale-up, discriminative training
5. Summary
Hierarchical sparse coding
[Yu, Lin, & Lafferty, CVPR 11; Matthew D. Zeiler, Graham W. Taylor, and Rob Fergus, ICCV 11]

Learning from unlabeled data

Sparse Coding → Pooling → Sparse Coding → Pooling
A two-layer sparse coding formulation
[Yu, Lin, & Lafferty, CVPR 11]

    min over W, α of   L(W, α) + λ ||W||1 + γ ||α||1,    subject to α ≥ 0

    L(W, α) = (1/n) Σ_{i=1..n} [ (1/2) ||xi − B wi||² + (λ/2) wiᵀ Ω(α) wi ]

where W = (w1 w2 ⋯ wn) ∈ R^{p×n} are the first-layer codes, α ∈ R^q are the second-layer codes, and Ω(α) ≡ ( Σ_{k=1..q} αk φk )^{-1}.
MNIST Results – classification
[Yu, Lin, & Lafferty, CVPR 11]

HSC vs. CNN: HSC provides even better performance than CNN –
more amazingly, HSC learns its features in an unsupervised manner!
MNIST results – learned dictionary
[Yu, Lin, & Lafferty, CVPR 11]

A hidden unit in the second layer is connected to a group of units in the first layer, giving invariance to translation, rotation, and deformation.
Caltech101 results – classification
[Yu, Lin, & Lafferty, CVPR 11]

The learned descriptor performs slightly better than SIFT + SC.
Adaptive Deconvolutional Networks for Mid and High Level Feature Learning
[Matthew D. Zeiler, Graham W. Taylor, and Rob Fergus, ICCV 2011]

• Hierarchical convolutional sparse coding.
• Trained with respect to the image from all layers (L1–L4).
• Pooling both spatially and amongst features.
• Learns invariant mid-level features.

[Figure: Image → L1 feature maps → L2 feature maps → L3 feature maps → L4 feature maps, with feature / feature-group selection between layers]
Outline

1. Sparse coding for image classification
2. Understanding sparse coding
3. Hierarchical sparse coding
4. Other topics: e.g. structured model, scale-up, discriminative training
5. Summary
Other topics of sparse coding

Structured sparse coding, for example:
– Group sparse coding [Bengio et al, NIPS 09]
– Learning hierarchical dictionaries [Jenatton, Mairal et al, 2010]

Scaling up sparse coding, for example:
– Feature-sign algorithm [Lee et al, NIPS 07]
– Feed-forward approximation [Gregor & LeCun, ICML 10]
– Online dictionary learning [Mairal et al, ICML 2009]

Discriminative training, for example:
– Backprop algorithms [Bradley & Bagnell, NIPS 08; Yang et al, CVPR 10]
– Supervised dictionary training [Mairal et al, NIPS 08]
Summary of Sparse Coding

 Sparse coding is an effective way to do (unsupervised) feature learning.

 A building block for deep models.

 Sparse coding and its local variants (LCC, SVC) have pushed the boundary of accuracies on Caltech101, PASCAL VOC, ImageNet, …

 Challenge: discriminative training is not straightforward.
Contenu connexe

Similaire à P02 sparse coding cvpr2012 deep learning methods for vision

P04 restricted boltzmann machines cvpr2012 deep learning methods for vision
P04 restricted boltzmann machines cvpr2012 deep learning methods for visionP04 restricted boltzmann machines cvpr2012 deep learning methods for vision
P04 restricted boltzmann machines cvpr2012 deep learning methods for visionzukun
 
HUMAN FACE IDENTIFICATION
HUMAN FACE IDENTIFICATION HUMAN FACE IDENTIFICATION
HUMAN FACE IDENTIFICATION bhupesh lahare
 
Oscon miller 2011
Oscon miller 2011Oscon miller 2011
Oscon miller 2011Mike Miller
 
Object classification in far field and low resolution videos
Object classification in far field and low resolution videosObject classification in far field and low resolution videos
Object classification in far field and low resolution videosInsaf Setitra
 
MoDisco EclipseCon2010
MoDisco EclipseCon2010MoDisco EclipseCon2010
MoDisco EclipseCon2010fmadiot
 
A comprehensive formal verification solution for ARM based SOC design
A comprehensive formal verification solution for ARM based SOC design A comprehensive formal verification solution for ARM based SOC design
A comprehensive formal verification solution for ARM based SOC design chiportal
 
2008 brokerage 04 smart vision system [compatibility mode]
2008 brokerage 04 smart vision system [compatibility mode]2008 brokerage 04 smart vision system [compatibility mode]
2008 brokerage 04 smart vision system [compatibility mode]imec.archive
 
2008 brokerage 04 smart vision system [compatibility mode]
2008 brokerage 04 smart vision system [compatibility mode]2008 brokerage 04 smart vision system [compatibility mode]
2008 brokerage 04 smart vision system [compatibility mode]imec.archive
 
Reverse engineering
Reverse engineeringReverse engineering
Reverse engineeringSaswat Padhi
 
MongoDB for Java Developers with Spring Data
MongoDB for Java Developers with Spring DataMongoDB for Java Developers with Spring Data
MongoDB for Java Developers with Spring DataChris Richardson
 
Workshop OSGI PPT
Workshop OSGI PPTWorkshop OSGI PPT
Workshop OSGI PPTSummer Lu
 
Extreme Spatio-Temporal Data Analysis
Extreme Spatio-Temporal Data AnalysisExtreme Spatio-Temporal Data Analysis
Extreme Spatio-Temporal Data AnalysisJoel Saltz
 
Information from pixels
Information from pixelsInformation from pixels
Information from pixelsDave Snowdon
 
MongoDB for Java Devs with Spring Data - MongoPhilly 2011
MongoDB for Java Devs with Spring Data - MongoPhilly 2011MongoDB for Java Devs with Spring Data - MongoPhilly 2011
MongoDB for Java Devs with Spring Data - MongoPhilly 2011MongoDB
 
BlackHat EU 2012 - Zhenhua Liu - Breeding Sandworms: How To Fuzz Your Way Out...
BlackHat EU 2012 - Zhenhua Liu - Breeding Sandworms: How To Fuzz Your Way Out...BlackHat EU 2012 - Zhenhua Liu - Breeding Sandworms: How To Fuzz Your Way Out...
BlackHat EU 2012 - Zhenhua Liu - Breeding Sandworms: How To Fuzz Your Way Out...MindShare_kk
 
Upgrading to System Verilog for FPGA Designs, Srinivasan Venkataramanan, CVC
Upgrading to System Verilog for FPGA Designs, Srinivasan Venkataramanan, CVCUpgrading to System Verilog for FPGA Designs, Srinivasan Venkataramanan, CVC
Upgrading to System Verilog for FPGA Designs, Srinivasan Venkataramanan, CVCFPGA Central
 

Similaire à P02 sparse coding cvpr2012 deep learning methods for vision (20)

Vb
VbVb
Vb
 
P04 restricted boltzmann machines cvpr2012 deep learning methods for vision
P04 restricted boltzmann machines cvpr2012 deep learning methods for visionP04 restricted boltzmann machines cvpr2012 deep learning methods for vision
P04 restricted boltzmann machines cvpr2012 deep learning methods for vision
 
HUMAN FACE IDENTIFICATION
HUMAN FACE IDENTIFICATION HUMAN FACE IDENTIFICATION
HUMAN FACE IDENTIFICATION
 
Oscon miller 2011
Oscon miller 2011Oscon miller 2011
Oscon miller 2011
 
Object classification in far field and low resolution videos
Object classification in far field and low resolution videosObject classification in far field and low resolution videos
Object classification in far field and low resolution videos
 
Av Recognition
Av RecognitionAv Recognition
Av Recognition
 
MoDisco EclipseCon2010
MoDisco EclipseCon2010MoDisco EclipseCon2010
MoDisco EclipseCon2010
 
A comprehensive formal verification solution for ARM based SOC design
A comprehensive formal verification solution for ARM based SOC design A comprehensive formal verification solution for ARM based SOC design
A comprehensive formal verification solution for ARM based SOC design
 
2008 brokerage 04 smart vision system [compatibility mode]
2008 brokerage 04 smart vision system [compatibility mode]2008 brokerage 04 smart vision system [compatibility mode]
2008 brokerage 04 smart vision system [compatibility mode]
 
2008 brokerage 04 smart vision system [compatibility mode]
2008 brokerage 04 smart vision system [compatibility mode]2008 brokerage 04 smart vision system [compatibility mode]
2008 brokerage 04 smart vision system [compatibility mode]
 
Ocr color
Ocr colorOcr color
Ocr color
 
Reverse engineering
Reverse engineeringReverse engineering
Reverse engineering
 
MongoDB for Java Developers with Spring Data
MongoDB for Java Developers with Spring DataMongoDB for Java Developers with Spring Data
MongoDB for Java Developers with Spring Data
 
Workshop OSGI PPT
Workshop OSGI PPTWorkshop OSGI PPT
Workshop OSGI PPT
 
Extreme Spatio-Temporal Data Analysis
Extreme Spatio-Temporal Data AnalysisExtreme Spatio-Temporal Data Analysis
Extreme Spatio-Temporal Data Analysis
 
Information from pixels
Information from pixelsInformation from pixels
Information from pixels
 
MongoDB for Java Devs with Spring Data - MongoPhilly 2011
MongoDB for Java Devs with Spring Data - MongoPhilly 2011MongoDB for Java Devs with Spring Data - MongoPhilly 2011
MongoDB for Java Devs with Spring Data - MongoPhilly 2011
 
BlackHat EU 2012 - Zhenhua Liu - Breeding Sandworms: How To Fuzz Your Way Out...
BlackHat EU 2012 - Zhenhua Liu - Breeding Sandworms: How To Fuzz Your Way Out...BlackHat EU 2012 - Zhenhua Liu - Breeding Sandworms: How To Fuzz Your Way Out...
BlackHat EU 2012 - Zhenhua Liu - Breeding Sandworms: How To Fuzz Your Way Out...
 
Upgrading to System Verilog for FPGA Designs, Srinivasan Venkataramanan, CVC
Upgrading to System Verilog for FPGA Designs, Srinivasan Venkataramanan, CVCUpgrading to System Verilog for FPGA Designs, Srinivasan Venkataramanan, CVC
Upgrading to System Verilog for FPGA Designs, Srinivasan Venkataramanan, CVC
 
Design and Concepts of Android Graphics
Design and Concepts of Android GraphicsDesign and Concepts of Android Graphics
Design and Concepts of Android Graphics
 

Plus de zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statisticszukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibrationzukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionzukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities zukun
 

Plus de zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

Dernier

ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)cama23
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 

Dernier (20)

ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 

P02 sparse coding cvpr2012 deep learning methods for vision

  • 1. CVPR12 Tutorial on Deep Learning Sparse Coding Kai Yu yukai@baidu.com Department of Multimedia, Baidu
  • 2. Relentless research on visual recognition Caltech 101 80 Million Tiny Images ImageNet PASCAL VOC 08/22/12 2
  • 3. The pipeline of machine visual perception Most Efforts in Machine Learning Inference: Low-level Pre- Feature Feature prediction, sensing processing extract. selection recognition • Most critical for accuracy • Account for most of the computation for testing • Most time-consuming in development cycle • Often hand-craft in practice 08/22/12 3
  • 4. Computer vision features SIFT Spin image HoG RIFT GLOH Slide Credit: Andrew Ng
  • 5. Learning features from data. Pipeline: Low-level sensing → Pre-processing → Feature extraction → Feature selection → Inference (prediction, recognition), with machine learning applied to the inference step. Feature Learning: instead of designing features, let’s design feature learners. 08/22/12 5
  • 6. Learning features from data via sparse coding. Pipeline: Low-level sensing → Pre-processing → Feature extraction → Feature selection → Inference (prediction, recognition). Sparse coding offers an effective building block to learn useful features. 08/22/12 6
  • 7. Outline 1. Sparse coding for image classification 2. Understanding sparse coding 3. Hierarchical sparse coding 4. Other topics: e.g. structured model, scale-up, discriminative training 5. Summary 08/22/12 7
  • 8. “BoW representation + SPM” Paradigm - I Bag-of-visual-words representation (BoW) based on VQ coding 08/22/12 Figure credit: Fei-Fei Li 8
  • 9. “BoW representation + SPM” Paradigm - II Spatial pyramid matching: pooling in different scales and locations 08/22/12 Figure credit: Svetlana Lazebnik 9
  • 10. Image Classification using “BoW + SPM”: Dense SIFT → VQ Coding → Spatial Pooling → Classifier. 08/22/12 10
  • 11. The Architecture of “Coding + Pooling”: Coding → Pooling → Coding → Pooling → … • e.g., convolutional neural nets, HMAX, BoW, … 11
  • 12. “BoW + SPM” has two coding+pooling layers: Local Gradients → Pooling (e.g., SIFT, HOG) → VQ Coding → Average Pooling (obtain histogram) → SVM. The SIFT feature itself follows a coding+pooling operation. (See the code sketch below.) 12
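  As a concrete illustration of the second coding+pooling layer named above, here is a minimal sketch of VQ coding followed by average pooling into a histogram. The function name, the NumPy implementation, and the k-means codebook are illustrative assumptions, not part of the tutorial:

    import numpy as np

    def bow_histogram(descriptors, codebook):
        # descriptors: (n, d) local features (e.g. SIFT); codebook: (k, d) visual words
        d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
        words = d2.argmin(axis=1)                      # VQ coding: hard-assign each descriptor
        hist = np.bincount(words, minlength=len(codebook)).astype(float)
        return hist / max(len(descriptors), 1)         # average pooling: normalized histogram

    # usage sketch: hist = bow_histogram(sift_descriptors, kmeans_codebook)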
  • 13. Develop better coding methods: Better Coding → Better Pooling → Better Coding → Better Pooling → Better Classifier. - Coding: a nonlinear mapping of data into another feature space - Better coding methods: sparse coding, RBMs, auto-encoders 13
  • 14. What is sparse coding? Sparse coding (Olshausen & Field, 1996) was originally developed to explain early visual processing in the brain (edge detection). Training: given a set of random patches x, learn a dictionary of bases [Φ1, Φ2, …]. Coding: for a data vector x, solve a LASSO problem to find the sparse coefficient vector a. (The standard objective is written out below.) 08/22/12 14
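  For reference, the training objective implied by the slide can be written in the standard dictionary-learning form (a textbook formulation consistent with the description above, with λ controlling sparsity):

    \min_{\{\phi_j\},\, \{a_i\}} \; \sum_{i=1}^{m} \left( \frac{1}{2}\Big\| x_i - \sum_{j=1}^{k} a_{i,j}\,\phi_j \Big\|_2^2 \; + \; \lambda\,\| a_i \|_1 \right)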
  • 15. Sparse coding: training time. Input: images x1, x2, …, xm (each in R^d). Learn: dictionary of bases φ1, φ2, …, φk (also in R^d). Alternating optimization: 1. Fix the dictionary φ1, φ2, …, φk, optimize a (a standard LASSO problem); 2. Fix the activations a, optimize the dictionary φ1, φ2, …, φk (a convex QP problem).
  • 16. Sparse coding: testing time. Input: a novel image patch x (in R^d) and the previously learned φi’s. Output: representation [ai,1, ai,2, …, ai,k] of image patch xi. The patch is approximated as xi ≈ 0.8 · φ(·) + 0.3 · φ(·) + 0.5 · φ(·), a weighted sum of three learned basis patches, so xi is represented as ai = [0, 0, …, 0, 0.8, 0, …, 0, 0.3, 0, …, 0, 0.5, …]. (A code sketch of training and coding follows below.)
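  A minimal sketch of the training and coding steps, assuming scikit-learn is available; the dictionary size, penalty values, and the random data stand-ins are illustrative, not the tutorial’s settings:

    import numpy as np
    from sklearn.decomposition import DictionaryLearning
    from sklearn.linear_model import Lasso

    # Training: m patches x1..xm in R^d (random stand-ins here)
    X = np.random.randn(1000, 64)
    dico = DictionaryLearning(n_components=128, alpha=0.1, max_iter=50)
    dico.fit(X)                               # alternates sparse codes <-> dictionary

    # Testing: for a novel patch x, solve the LASSO against the learned bases
    x = np.random.randn(64)
    lasso = Lasso(alpha=0.1, fit_intercept=False)
    lasso.fit(dico.components_.T, x)          # columns = learned bases phi_1..phi_k
    a = lasso.coef_                           # sparse coefficient vector a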
  • 17. Sparse coding illustration. From natural images, the learned bases (φ1, …, φ64) look like “edges”. Test example: x ≈ 0.8 · φ36 + 0.3 · φ42 + 0.5 · φ63, so the feature representation is [a1, …, a64] = [0, 0, …, 0, 0.8, 0, …, 0, 0.3, 0, …, 0, 0.5, 0]: compact & easily interpretable. Slide credit: Andrew Ng
  • 18. Self-taught Learning [Raina, Lee, Battle, Packer & Ng, ICML 07]. Motorcycles / Not motorcycles. Testing: What is this? … Unlabeled images. Slide credit: Andrew Ng
  • 19. Classification Result on Caltech 101 (9K images, 101 classes): 64% with SIFT VQ + Nonlinear SVM; ~50% with Pixel Sparse Coding + Linear SVM. 08/22/12 19
  • 20. Sparse Coding on SIFT – the ScSPM algorithm [Yang, Yu, Gong & Huang, CVPR09]: Local Gradients → Pooling (e.g., SIFT, HOG) → Sparse Coding → Max Pooling → Linear Classifier. (See the sketch of the sparse-coding + max-pooling step below.) 20
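  A minimal sketch of the middle two stages (sparse coding of local descriptors, then max pooling), assuming scikit-learn; the penalty value and function name are illustrative, and the spatial-pyramid part of ScSPM is omitted:

    import numpy as np
    from sklearn.decomposition import SparseCoder

    def sc_max_pool(descriptors, dictionary):
        # descriptors: (n_patches, d) SIFT-like vectors; dictionary: (k, d) learned bases
        coder = SparseCoder(dictionary=dictionary,
                            transform_algorithm='lasso_lars',
                            transform_alpha=0.15)
        codes = coder.transform(descriptors)          # (n_patches, k) sparse codes
        return np.abs(codes).max(axis=0)              # max pooling over the patches

    # a linear SVM is then trained on the pooled (and, in ScSPM, spatially pyramided) features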
  • 21. Sparse Coding on SIFT – the ScSPM algorithm [Yang, Yu, Gong & Huang, CVPR09]. Caltech-101: 64% with SIFT VQ + Nonlinear SVM; 73% with SIFT Sparse Coding + Linear SVM (ScSPM). 08/22/12 21
  • 22. Summary: Accuracies on Caltech 101. Sparse Coding (on pixels) → Spatial Max Pooling → Linear SVM: 50%. Dense SIFT → VQ Coding → Spatial Pooling → Nonlinear SVM: 64%. Dense SIFT → Sparse Coding → Spatial Max Pooling → Linear SVM: 73%. Key message: sparse coding is a better building block, and deep models are preferred. 08/22/12 22
  • 23. Outline 1. Sparse coding for image classification 2. Understanding sparse coding 3. Hierarchical sparse coding 4. Other topics: e.g. structured model, scale-up, discriminative training 5. Summary 08/22/12 23
  • 24. Outline 1. Sparse coding for image classification 2. Understanding sparse coding – Connections to RBMs, autoencoders, … – Sparse activations vs. sparse models, … – Sparsity vs. locality – local sparse coding methods 3. Hierarchical sparse coding 4. Other topics: e.g. structured model, scale-up, discriminative training 5. Summary 08/22/12 24
  • 25. Classical sparse coding: - a is sparse - a is often of higher dimension than x - the activation a = f(x) is a nonlinear, implicit function of x - the reconstruction x’ = g(a) is linear & explicit. (Diagram: x → f(x) → a for encoding; a → g(a) → x’ for decoding.) 08/22/12 25
  • 26. RBMs & autoencoders: - also involve activation and reconstruction - but have an explicit f(x) - do not necessarily enforce sparsity on a - but if sparsity is imposed on a, they often give improved results [e.g. sparse RBM, Lee et al. NIPS08]. (Diagram: x → f(x) → a for encoding; a → g(a) → x’ for decoding.) 08/22/12 26
  • 27. Sparse coding: a broader view. Any feature mapping from x to a, i.e. a = f(x), where - a is sparse (and often of higher dimension than x) - f(x) is nonlinear - there is a reconstruction x’ = g(a) such that x’ ≈ x. (Diagram: x → f(x) → a; a → g(a) → x’.) Therefore, sparse RBMs, sparse auto-encoders, and even VQ can be viewed as forms of sparse coding. 08/22/12 27
  • 28. Outline 1. Sparse coding for image classification 2. Understanding sparse coding – Connections to RBMs, autoencoders, … – Sparse activations vs. sparse models, … – Sparsity vs. locality – local sparse coding methods 3. Hierarchical sparse coding 4. Other topics: e.g. structured model, scale-up, discriminative training 5. Summary 08/22/12 28
  • 29. Sparse activations vs. sparse models. For a general function learning problem a = f(x): 1. sparse model: f(x)’s parameters are sparse - example: LASSO f(x) = <w, x>, where w is sparse - the goal is feature selection: all data points select a common subset of features - a hot topic in machine learning 2. sparse activations: f(x)’s outputs are sparse - example: sparse coding a = f(x), where a is sparse - the goal is feature learning: different data points activate different feature subsets. (A small code illustration of the contrast follows below.) 08/22/12 29
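  A small, hypothetical code illustration of the contrast, assuming scikit-learn; the data, dictionary, and penalty values are random stand-ins for illustration only:

    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.decomposition import SparseCoder

    X = np.random.randn(200, 64)
    y = np.random.randn(200)

    # Sparse model: a single sparse weight vector w, shared by all data points
    w = Lasso(alpha=0.1).fit(X, y).coef_              # mostly zeros; same features selected for every x

    # Sparse activations: every x gets its own sparse code a = f(x)
    D = np.random.randn(128, 64)
    D /= np.linalg.norm(D, axis=1, keepdims=True)     # unit-norm bases
    A = SparseCoder(dictionary=D,
                    transform_algorithm='lasso_lars',
                    transform_alpha=0.5).transform(X) # (200, 128); each row is sparse in its own way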
  • 30. Example of sparse models: f(x) = <w, x>, where w = [0, 0.2, 0, 0.1, 0, 0]. • Because the 2nd and 4th elements of w are non-zero, these are the two features selected for every input x. • Globally-aligned sparse representation. 08/22/12 30
  • 31. Example of sparse activations (sparse coding): each data point x1, x2, …, xm gets its own sparse code a1, a2, …, am. • Different x’s have different dimensions activated. • Locally-shared sparse representation: similar x’s tend to have similar non-zero dimensions. 08/22/12 31
  • 32. Example of sparse activations (sparse coding): • Another example: preserving manifold structure. • More informative in highlighting richer data structures, e.g. clusters and manifolds. 08/22/12 32
  • 33. Outline 1. Sparse coding for image classification 2. Understanding sparse coding – Connections to RBMs, autoencoders, … – Sparse activations vs. sparse models, … – Sparsity vs. locality – Local sparse coding methods 3. Hierarchical sparse coding 4. Other topics: e.g. structured model, scale-up, discriminative training 5. Summary 08/22/12 33
  • 34. Sparsity vs. Locality. • Intuition: similar data should get similar activated features. • Local sparse coding: data in the same neighborhood tend to have shared activated features; data in different neighborhoods tend to have different features activated. 08/22/12 34
  • 35. Sparse coding is not always local: example. Case 1 (independent subspaces): each basis is a “direction”; sparsity means each datum is a linear combination of only several bases. Case 2 (data manifold or clusters): each basis is an “anchor point”; sparsity means each datum is a linear combination of neighboring anchors; here sparsity is caused by locality. 08/22/12 35
  • 36. Two approaches to local sparse coding. Approach 1: coding via local anchor points. Approach 2: coding via local subspaces. 08/22/12 36
  • 37. Classical sparse coding is empirically local. • When it works best for classification, the codes are often found to be local. • It’s preferred to let similar data have similar non-zero dimensions in their codes. 08/22/12 37
  • 38. MNIST Experiment: Classification using SC. Try different values of the sparsity penalty (lambda). • 60K training, 10K test • Let k = 512 • Linear SVM on sparse codes 08/22/12 38
  • 39. MNIST Experiment: Lambda = 0.0005. Each basis is like a part or direction. 08/22/12 39
  • 40. MNIST Experiment: Lambda = 0.005. Again, each basis is like a part or direction. 08/22/12 40
  • 41. MNIST Experiment: Lambda = 0.05. Now, each basis is more like a digit! 08/22/12 41
  • 42. MNIST Experiment: Lambda = 0.5. Like VQ now! 08/22/12 42
  • 43. Geometric view of sparse coding. Error: 4.54%, Error: 3.75%, Error: 2.64%. • When sparse coding achieves the best classification accuracy, the learned bases are like digits: each basis has a clear local class association. 08/22/12 43
  • 44. Distribution of coefficients (MNIST). Neighboring bases tend to get nonzero coefficients. 08/22/12 44
  • 45. Distribution of coefficients (SIFT, Caltech101). Similar observation here! 08/22/12 45
  • 46. Outline 1. Sparse coding for image classification 2. Understanding sparse coding – Connections to RBMs, autoencoders, … – Sparse activations vs. sparse models, … – Sparsity vs. locality – Local sparse coding methods 3. Hierarchical sparse coding 4. Other topics: e.g. structured model, scale-up, discriminative training 5. Summary 08/22/12 46
  • 47. Why develop local sparse coding methods? • Since locality is a preferred property in sparse coding, let’s explicitly ensure the locality. • The new algorithms can be well justified theoretically. • The new algorithms have computational advantages over classical sparse coding. 08/22/12 47
  • 48. Two approaches to local sparse coding. Approach 1: Local coordinate coding (coding via local anchor points). Nonlinear learning using local coordinate coding, Kai Yu, Tong Zhang, and Yihong Gong. In NIPS 2009. Learning locality-constrained linear coding for image classification, Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang. In CVPR 2010. Approach 2: Super-vector coding (coding via local subspaces). Image Classification using Super-Vector Coding of Local Image Descriptors, Xi Zhou, Kai Yu, Tong Zhang, and Thomas Huang. In ECCV 2010. Large-scale Image Classification: Fast Feature Extraction and SVM Training, Yuanqing Lin, Fengjun Lv, Shenghuo Zhu, Ming Yang, Timothee Cour, Kai Yu, LiangLiang Cao, Thomas Huang. In CVPR 2011. 08/22/12 48
  • 49. A function approximation framework to understand coding. • Assumption: image patches x follow a nonlinear manifold, and f(x) is smooth on the manifold. • Coding: nonlinear mapping x → a; typically a is high-dimensional & sparse. • Nonlinear learning: f(x) = <w, a>. 08/22/12 49
  • 50. Local sparse coding Approach 1 Local coordinate coding 08/22/12 50
  • 51. Function interpolation based on LCC. Yu, Zhang & Gong, NIPS 09. (Figure: data points, bases, and a locally linear interpolation between bases.) 08/22/12 51
  • 52. Local Coordinate Coding (LCC): connects coding to nonlinear function learning. If f(x) is (alpha, beta)-Lipschitz smooth, the function approximation error is bounded by a coding error term plus a locality term. The key message: a good coding scheme should 1. have a small coding error, and 2. also be sufficiently local. (A reconstruction of the bound is given below.) 08/22/12 52
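  The bound itself did not survive the transcript; as best it can be reconstructed from the LCC paper (Yu, Zhang & Gong, NIPS 2009), it has roughly the following form, where the a_v are coding coefficients on anchor points v and γ(x) = Σ_v a_v v is the coded approximation of x. Treat this as a paraphrase of the result rather than the slide’s exact statement:

    \Big| f(x) - \sum_{v} a_v\, f(v) \Big| \;\le\;
        \underbrace{\alpha\,\big\| x - \gamma(x) \big\|}_{\text{coding error}}
      \;+\;
        \underbrace{\beta \sum_{v} |a_v|\,\big\| v - \gamma(x) \big\|^{2}}_{\text{locality term}}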
  • 53. Local Coordinate Coding (LCC). Yu, Zhang & Gong, NIPS 09; Wang, Yang, Yu, Lv, Huang, CVPR 10. • Dictionary learning: k-means (or hierarchical k-means). • Coding for x, to obtain its sparse representation a: Step 1 – ensure locality: find the K nearest bases; Step 2 – ensure low coding error: solve the reconstruction problem restricted to those bases. (See the code sketch below.) 08/22/12 53
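  A sketch of the two coding steps, following the analytical locality-constrained (LLC) solution from the CVPR’10 paper as I understand it; the function name, the choice K=5, and the regularization constant are illustrative assumptions:

    import numpy as np
    from sklearn.cluster import KMeans

    def lcc_encode(x, bases, K=5):
        # x: (d,) descriptor; bases: (k, d) dictionary from k-means
        d2 = ((bases - x) ** 2).sum(axis=1)
        idx = np.argsort(d2)[:K]                      # Step 1: locality -- K nearest bases
        B = bases[idx]                                # (K, d) local sub-dictionary
        G = (B - x) @ (B - x).T                       # Step 2: low coding error (local Gram matrix)
        G += 1e-8 * np.trace(G) * np.eye(K)           # small regularizer for numerical stability
        c = np.linalg.solve(G, np.ones(K))
        c /= c.sum()                                  # sum-to-one constraint on the local codes
        code = np.zeros(len(bases))
        code[idx] = c                                 # zero everywhere except the K nearest bases
        return code

    # dictionary learning sketch: bases = KMeans(n_clusters=512).fit(X).cluster_centers_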
  • 54. Local sparse coding Approach 2 Super-vector coding 08/22/12 54
  • 55. Piecewise locally linear (first-order) function approximation via super-vector coding: Zhou, Yu, Zhang, and Huang, ECCV 10. (Figure: data points, cluster centers, and local tangents.)
  • 56. Super-vector coding: justification. If f(x) is beta-Lipschitz smooth, the function approximation error is bounded in terms of the quantization error (with a local tangent correction). 08/22/12 56
  • 57. Super-Vector Coding (SVC). Zhou, Yu, Zhang, and Huang, ECCV 10. • Dictionary learning: k-means (or hierarchical k-means). • Coding for x, to obtain its sparse representation a: Step 1 – find the nearest basis of x and obtain its VQ coding, e.g. [0, 0, 1, 0, …]; Step 2 – form the super-vector coding, e.g. [0, 0, 1, 0, …, 0, 0, (x − m3), 0, …], i.e. a zero-order part plus a local tangent part. (See the code sketch below.) 08/22/12 57
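  A minimal sketch of the super-vector construction described above; the scaling constant s for the zero-order part and the function name are illustrative assumptions:

    import numpy as np

    def supervector_encode(x, centers, s=1.0):
        # x: (d,) descriptor; centers: (k, d) k-means centers m_1..m_k
        k, d = centers.shape
        j = int(((centers - x) ** 2).sum(axis=1).argmin())            # Step 1: VQ -- nearest center
        code = np.zeros(k * (d + 1))
        code[j * (d + 1)] = s                                         # zero-order (VQ indicator) part
        code[j * (d + 1) + 1 : (j + 1) * (d + 1)] = x - centers[j]    # local tangent part (x - m_j)
        return code                                                   # sparse: only one block is nonzero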
  • 58. Results on the ImageNet Challenge Dataset (1.4 million images, 1000 classes): 40% with VQ + Intersection Kernel; 62% with LCC + Linear SVM; 65% with SVC + Linear SVM. 08/22/12 58
  • 59. Summary: local sparse coding. Approach 1: Local coordinate coding. Approach 2: Super-vector coding. - Sparsity achieved by explicitly ensuring locality - Sound theoretical justifications - Much simpler to implement and compute - Strong empirical success 08/22/12 59
  • 60. Outline 1. Sparse coding for image classification 2. Understanding sparse coding 3. Hierarchical sparse coding 4. Other topics: e.g. structured model, scale-up, discriminative training 5. Summary 08/22/12 60
  • 61. Hierarchical sparse coding. Yu, Lin, & Lafferty, CVPR 11; Matthew D. Zeiler, Graham W. Taylor, and Rob Fergus, ICCV 11. Learning from unlabeled data: Sparse Coding → Pooling → Sparse Coding → Pooling. (A rough two-layer sketch follows below.) 61
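  A rough, hypothetical sketch of stacking two sparse-coding + pooling layers in this spirit, assuming scikit-learn; this is not the exact CVPR’11 or ICCV’11 algorithm, and the dictionary sizes, penalty, and pooling scheme are illustrative:

    import numpy as np
    from sklearn.decomposition import DictionaryLearning

    def sc_pool_layer(inputs, n_atoms, pool_size):
        # inputs: (n, d) patches or lower-layer features
        dl = DictionaryLearning(n_components=n_atoms, alpha=0.5, max_iter=20).fit(inputs)
        codes = dl.transform(inputs)                                  # (n, n_atoms) sparse codes
        n = (len(codes) // pool_size) * pool_size
        pooled = codes[:n].reshape(-1, pool_size, n_atoms).max(axis=1)  # max-pool groups of codes
        return pooled                                                 # fed to the next layer

    # layer1 = sc_pool_layer(image_patches, n_atoms=256, pool_size=4)
    # layer2 = sc_pool_layer(layer1, n_atoms=512, pool_size=4)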
  • 62. A two-layer sparse coding formulation. Yu, Lin, & Lafferty, CVPR 11.
  min_{W, α} L(W, α) + λ ||W||_1 + γ ||α||_1, subject to α ≥ 0, where
  L(W, α) = (1/n) Σ_{i=1}^{n} [ (1/2) ||x_i − B w_i||_2^2 + (λ/2) w_i^T Ω(α) w_i ]
          = (1/2n) ||X − B W||_F^2 + (λ/2) tr( S(W) Ω(α) ),
  with W = (w_1 w_2 ··· w_n) ∈ R^{p×n}, α ∈ R^q, Ω(α) ≡ ( Σ_{k=1}^{q} α_k φ_k )^{-1}, and S(W) ≡ (1/n) Σ_{i=1}^{n} w_i w_i^T. 08/22/12 62
  • 63. MNIST results – classification. Yu, Lin, & Lafferty, CVPR 11. HSC vs. CNN: HSC provides even better performance than CNN; more amazingly, HSC learns its features in an unsupervised manner! 63
  • 64. MNIST results -- learned dictionary Yu, Lin, & Lafferty, CVPR 11 A hidden unit in the second layer is connected to a unit group in the 1st layer: invariance to translation, rotation, and deformation 64
  • 65. Caltech101 results - classification Yu, Lin, & Lafferty, CVPR 11 Learned descriptor: performs slightly better than SIFT + SC 65
  • 66. Adaptive Deconvolutional Networks for Mid and High Level Feature Learning. Matthew D. Zeiler, Graham W. Taylor, and Rob Fergus, ICCV 2011. • Hierarchical convolutional sparse coding. • Trained with respect to the image from all layers (L1-L4). • Pooling both spatially and amongst features. • Learns invariant mid-level features. (Figure: Image → L1 Features → L1 Feature Maps → Select L2 Feature Groups → L2 Feature Maps → Select L3 Feature Groups → L3 Feature Maps → Select L4 Features → L4 Feature Maps.)
  • 67. Outline 1. Sparse coding for image classification 2. Understanding sparse coding 3. Hierarchical sparse coding 4. Other topics: e.g. structured model, scale-up, discriminative training 5. Summary 08/22/12 67
  • 68. Other topics of sparse coding. • Structured sparse coding, for example: – Group sparse coding [Bengio et al, NIPS 09] – Learning hierarchical dictionaries [Jenatton, Mairal et al, 2010] • Scaling up sparse coding, for example: – Feature-sign algorithm [Lee et al, NIPS 07] – Feed-forward approximation [Gregor & LeCun, ICML 10] – Online dictionary learning [Mairal et al, ICML 2009] • Discriminative training, for example: – Backprop algorithms [Bradley & Bagnell, NIPS 08; Yang et al, CVPR 10] – Supervised dictionary training [Mairal et al, NIPS 08] 08/22/12 68
  • 69. Summary of Sparse Coding. • Sparse coding is an effective way to do (unsupervised) feature learning. • It is a building block for deep models. • Sparse coding and its local variants (LCC, SVC) have pushed the boundary of accuracies on Caltech101, PASCAL VOC, ImageNet, … • Challenge: discriminative training is not straightforward.

Editor's notes

  1. Let’s further check what’s happening when the best classification performance is achieved.
  2. Hi, I’m here to talk about my poster “Adaptive Deconvolutional Networks for Mid and High Level Feature Learning”. This is an extension to previous work on Deconvolutional Networks released at CVPR last year. Deconv Nets are a hierarchical convolutional sparse coding model used to learn features from images in an unsupervised fashion. We added two important extensions: - the first novel extension is to train the model with respect to the input image from any given layer - other hierarchical models build on the layer below - this allows stacking the model while still maintaining strong reconstructions - the second extension is a pooling operation that acts both spatially and amongst features - we store the switch locations within each pooling region in order to preserve sharp reconstructions through multiple layers - pooling over features provides more competition amongst features and therefore forces groupings to occur