Computer vision: models, learning and inference
Chapter 9: Classification models



    Please send errata to s.prince@cs.ucl.ac.uk
Structure
•   Logistic regression
•   Bayesian logistic regression
•   Non-linear logistic regression
•   Kernelization and Gaussian process classification
•   Incremental fitting, boosting and trees
•   Multi-class classification
•   Random classification trees
•   Non-probabilistic classification
•   Applications

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   2
Models for machine vision




  Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   3
Example application:
     Gender Classification




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   4
Type 1: Model Pr(w|x) - Discriminative
How to model Pr(w|x)?
  – Choose an appropriate form for Pr(w)
  – Make the parameters a function of x
  – The function takes parameters θ that define its shape

Learning algorithm: learn the parameters θ from training data x, w
Inference algorithm: just evaluate Pr(w|x)




            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   5
Logistic Regression
Consider the two-class problem.
  • Choose a Bernoulli distribution over the world state:  Pr(w|x) = Bern_w[λ]
  • Make the parameter λ a function of x


Model the activation with a linear function:  a = φ0 + φᵀx


This creates a number between -∞ and ∞. It is mapped to the range [0, 1] with the logistic sigmoid:  λ = sig[a] = 1 / (1 + exp[-a])


           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince          6
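Since the equations on these slides are images that did not survive extraction, here is a minimal NumPy sketch of the model just described; the function name and variables are illustrative, not from the book.

import numpy as np

def logistic_regression_prob(x, phi0, phi):
    """Pr(w=1|x) for two-class logistic regression.

    a = phi0 + phi^T x is the linear activation; the logistic sigmoid
    maps it from (-inf, inf) to (0, 1), giving the Bernoulli parameter.
    """
    a = phi0 + np.dot(phi, x)          # linear activation
    lam = 1.0 / (1.0 + np.exp(-a))     # logistic sigmoid
    return lam                         # Pr(w=1|x); Pr(w=0|x) = 1 - lam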
Two parameters

Learning by standard methods (ML, MAP, Bayesian)
Inference: Just evaluate Pr(w|x)
        Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   7
Neater Notation


To make the notation easier to handle, we
   • Attach a 1 to the start of every data vector


   • Attach the offset φ0 to the start of the gradient vector φ


New model:



          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   8
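Written out (a reconstruction of the missing equations in what I believe is the book's notation, so verify the symbols):

\[
\mathbf{x} \leftarrow \begin{bmatrix} 1 \\ \mathbf{x} \end{bmatrix},
\qquad
\boldsymbol{\phi} \leftarrow \begin{bmatrix} \phi_0 \\ \boldsymbol{\phi} \end{bmatrix},
\qquad\text{so that}\qquad
Pr(w=1\,|\,\mathbf{x}) = \mathrm{sig}\!\left[\boldsymbol{\phi}^{T}\mathbf{x}\right]
 = \frac{1}{1+\exp\!\left[-\boldsymbol{\phi}^{T}\mathbf{x}\right]}.
\]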
Logistic regression




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   9
Maximum Likelihood



Take Logarithm



Take derivative:


           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   10
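The equations on this slide are images that did not extract; the steps are approximately the following (a reconstruction, so check it against the text):

\[
Pr(\mathbf{w}\,|\,\mathbf{X}) = \prod_{i=1}^{I} \lambda_i^{w_i}\,(1-\lambda_i)^{1-w_i},
\qquad \lambda_i = \mathrm{sig}\!\left[\boldsymbol{\phi}^{T}\mathbf{x}_i\right]
\]
\[
L = \sum_{i=1}^{I}\Big[ w_i \log \lambda_i + (1-w_i)\log(1-\lambda_i) \Big],
\qquad
\frac{\partial L}{\partial \boldsymbol{\phi}} = \sum_{i=1}^{I} \big(w_i - \lambda_i\big)\,\mathbf{x}_i .
\]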
Derivatives



Unfortunately, there is no closed-form solution: we cannot get an expression for φ in terms of x and w

We have to use a general-purpose technique:

       “iterative non-linear optimization”


            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   11
Optimization
Goal: find the parameters that minimize a cost function,  θ̂ = argmin_θ f[θ]




How can we find the minimum?
                                                                                   Cost function or
                                                                                   Objective function
Basic idea:
   • Start with an initial estimate
   • Take a series of small steps toward the minimum
   • Make sure that each step decreases the cost
   • When we can’t improve any further, we must be at a minimum
        Computer vision: models, learning and inference. ©2011 Simon J.D. Prince                  12
Local Minima




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   13
Convexity




If a function is convex, then it has only a single minimum.
We can tell whether a function is convex by looking at its second derivatives.
         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   14
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   15
Gradient Based Optimization

• Choose a search direction s based on the local properties
  of the function

• Perform an intensive search along the chosen direction.
  This is called line search



• Then set



         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   16
Gradient Descent

Consider standing on a hillside

Look at gradient where you are
standing

Find the steepest direction
downhill

Walk in that direction for some
distance (line search)


           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   17
Finite differences
What if we can’t compute the gradient?

Compute finite difference approximation:




where ej is the unit vector in the jth direction

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   18
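As a concrete illustration of the forward-difference formula described above (a minimal sketch; the function name and default step size are illustrative):

import numpy as np

def finite_difference_gradient(f, theta, eps=1e-6):
    """Approximate the gradient of a scalar cost f at theta.

    Each component is estimated as (f(theta + eps*e_j) - f(theta)) / eps,
    where e_j is the unit vector in the j-th direction.
    """
    grad = np.zeros_like(theta, dtype=float)
    for j in range(theta.size):
        e_j = np.zeros_like(theta, dtype=float)
        e_j[j] = 1.0
        grad[j] = (f(theta + eps * e_j) - f(theta)) / eps
    return grad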
Steepest Descent Problems
                                                                  Close up




  Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   19
Second Derivatives




In higher dimensions, the second derivatives determine how far we should move in each of the different directions: they change the best direction in which to move.
              Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   20
Newton’s Method
Approximate function with Taylor expansion



Take derivative



Re-arrange



 Adding line search


           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   21
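Filling in the missing update equations with the standard Newton step (generic notation; the book's symbols may differ slightly):

\[
f[\boldsymbol{\theta}] \approx f[\boldsymbol{\theta}^{[t]}]
 + (\boldsymbol{\theta}-\boldsymbol{\theta}^{[t]})^{T}\mathbf{g}
 + \tfrac{1}{2}(\boldsymbol{\theta}-\boldsymbol{\theta}^{[t]})^{T}\mathbf{H}\,(\boldsymbol{\theta}-\boldsymbol{\theta}^{[t]})
\]
\[
\boldsymbol{\theta}^{[t+1]} = \boldsymbol{\theta}^{[t]} - \mathbf{H}^{-1}\mathbf{g},
\qquad\text{or, with line search,}\qquad
\boldsymbol{\theta}^{[t+1]} = \boldsymbol{\theta}^{[t]} - \lambda\,\mathbf{H}^{-1}\mathbf{g},
\]

where g and H are the gradient and Hessian of f at θ^[t], and λ here is the step length chosen by the line search (not the Bernoulli parameter).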
Newton’s Method




                                              Matrix of second derivatives is
                                              called the Hessian.

                                              Expensive to compute via
                                              finite differences.

                                              If positive definite then convex
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   22
Newton vs. Steepest Descent




   Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   23
Line Search




  Gradually narrow down range

Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   24
Optimization for Logistic Regression




 Derivatives of log likelihood:




                                                                                  Positive definite!
       Computer vision: models, learning and inference. ©2011 Simon J.D. Prince              25
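A minimal sketch of these updates in NumPy (not the book's implementation; the function name and the small ridge term added for numerical stability are my own illustrative choices):

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def fit_logistic_newton(X, w, n_iter=20):
    """Fit two-class logistic regression by Newton's method.

    X : (I, D) data matrix with a leading column of ones (neater notation).
    w : (I,) labels in {0, 1}.
    Minimizes the negative log likelihood; its Hessian is positive
    (semi-)definite, so the problem is convex.
    """
    phi = np.zeros(X.shape[1])
    for _ in range(n_iter):
        lam = sigmoid(X @ phi)                 # predicted Pr(w=1|x) per point
        grad = X.T @ (lam - w)                 # gradient of neg. log likelihood
        S = lam * (1.0 - lam)                  # sigmoid derivatives
        H = (X * S[:, None]).T @ X             # Hessian (positive semi-definite)
        phi -= np.linalg.solve(H + 1e-8 * np.eye(X.shape[1]), grad)  # Newton step
    return phi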
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   26
Maximum likelihood fits




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   27
Structure
•   Logistic regression
•   Bayesian logistic regression
•   Non-linear logistic regression
•   Kernelization and Gaussian process classification
•   Incremental fitting, boosting and trees
•   Multi-class classification
•   Random classification trees
•   Non-probabilistic classification
•   Applications

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   28
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   29
Bayesian Logistic Regression
Likelihood:



Prior (not conjugate):



Apply Bayes’ rule:



                                                          (no closed form solution for posterior)
              Computer vision: models, learning and inference. ©2011 Simon J.D. Prince          30
Laplace Approximation




Approximate the posterior distribution with a normal
   • Set the mean to the MAP estimate
   • Set the covariance to match the curvature of the posterior at the MAP estimate
          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   31
Laplace Approximation
Find the MAP solution by optimizing the log posterior probability




 Approximate with normal



 where




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   32
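The missing equations can be summarised as follows (a reconstruction in the spirit of the book; verify the exact form):

\[
q(\boldsymbol{\phi}) = \mathrm{Norm}_{\boldsymbol{\phi}}\!\left[\boldsymbol{\mu},\boldsymbol{\Sigma}\right],
\qquad
\boldsymbol{\mu} = \hat{\boldsymbol{\phi}} = \operatorname*{argmax}_{\boldsymbol{\phi}}\; Pr(\boldsymbol{\phi}\,|\,\mathbf{X},\mathbf{w}),
\qquad
\boldsymbol{\Sigma} = \left(-\left.\frac{\partial^{2}\log Pr(\boldsymbol{\phi}\,|\,\mathbf{X},\mathbf{w})}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}^{T}}\right|_{\hat{\boldsymbol{\phi}}}\right)^{-1}.
\]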
Laplace Approximation




Prior                          Actual posterior                                    Approximated



        Computer vision: models, learning and inference. ©2011 Simon J.D. Prince              33
Inference



Can re-express in terms of activation



Using transformation properties of normal distributions




           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   34
Approximation of Integral




 Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   35
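The integral over the activation has no closed form; the approximation illustrated here replaces the logistic sigmoid by a probit-like function, giving a standard closed-form expression (quoted from memory, so verify the constant):

\[
Pr(w^{*}=1\,|\,\mathbf{x}^{*}) = \int \mathrm{sig}[a]\;\mathrm{Norm}_{a}\!\left[\mu_{a},\sigma_{a}^{2}\right]\,da
\;\approx\; \mathrm{sig}\!\left[\frac{\mu_{a}}{\sqrt{1+\pi\sigma_{a}^{2}/8}}\right],
\qquad
\mu_{a}=\boldsymbol{\mu}^{T}\mathbf{x}^{*},\quad
\sigma_{a}^{2}=\mathbf{x}^{*T}\boldsymbol{\Sigma}\,\mathbf{x}^{*}.
\]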
Bayesian Solution




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   36
Structure
•   Logistic regression
•   Bayesian logistic regression
•   Non-linear logistic regression
•   Kernelization and Gaussian process classification
•   Incremental fitting, boosting and trees
•   Multi-class classification
•   Random classification trees
•   Non-probabilistic classification
•   Applications

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   37
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   38
Non-linear logistic regression
Same idea as for regression.
• Apply non-linear transformation


• Build model as usual




         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   39
Non-linear logistic regression
Example transformations:




Fit by non-linear optimization (including the transformation parameters):




           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   40
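A sketch of two of the kinds of transformation mentioned for this purpose (arc tangent and radial basis functions); the exact parameterization here is an assumption for illustration. After computing z, the usual linear logistic regression model is applied to z instead of x, i.e. Pr(w=1|x) = sig[φᵀz].

import numpy as np

def arctan_features(x, alphas):
    """z_k = arctan(alpha_k^T x): one example of a non-linear transformation.

    alphas : (K, D) projection directions (illustrative; these transformation
    parameters can themselves be fitted by optimization).
    Prepends a 1 so the transformed vector fits the usual model.
    """
    z = np.arctan(alphas @ x)
    return np.concatenate(([1.0], z))

def rbf_features(x, centers, lam=1.0):
    """z_k = exp(-||x - c_k||^2 / lam^2): radial basis function transformation."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    z = np.exp(-d2 / lam ** 2)
    return np.concatenate(([1.0], z))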
Non-linear logistic regression in 1D




  Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   41
Non-linear logistic regression in 2D




  Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   42
Structure
•   Logistic regression
•   Bayesian logistic regression
•   Non-linear logistic regression
•   Kernelization and Gaussian process classification
•   Incremental fitting, boosting and trees
•   Multi-class classification
•   Random classification trees
•   Non-probabilistic classification
•   Applications

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   43
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   44
Dual Logistic Regression
KEY IDEA:

The gradient vector φ is just a vector in the data space

Can represent as a weighted
sum of the data points



Now solve for Ψ: one dual parameter per training example.
             Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   45
Maximum Likelihood
Likelihood



Derivatives




These depend only on inner products!
         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   46
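A minimal sketch of kernelized (dual) logistic regression fitted by maximum likelihood with plain gradient ascent; the RBF kernel, learning rate, and function names are illustrative assumptions rather than the book's exact procedure:

import numpy as np

def rbf_kernel(X1, X2, lam=1.0):
    """k(x_i, x_j) = exp(-||x_i - x_j||^2 / lam^2) -- one possible kernel."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-d2 / lam**2)

def fit_kernel_logistic(K, w, n_iter=200, lr=0.1):
    """ML fitting of dual (kernel) logistic regression by gradient ascent.

    K : (I, I) kernel (Gram) matrix of the training data.
    w : (I,) labels in {0, 1}.
    The activation for point i is a_i = sum_j psi_j k(x_j, x_i): the data
    enter only through inner products / kernel evaluations.
    """
    psi = np.zeros(K.shape[0])                  # one dual parameter per example
    for _ in range(n_iter):
        lam = 1.0 / (1.0 + np.exp(-(K @ psi)))  # predicted Pr(w=1|x_i)
        psi += lr * (K @ (w - lam))             # gradient of log likelihood w.r.t. psi
    return psi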
Kernel Logistic Regression




 Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   47
ML vs. Bayesian




Bayesian case is known as Gaussian process classification
         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   48
Relevance vector classification
Apply sparse prior to dual variables:




As before, write the prior as a marginalization over hidden variables:




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   49
Relevance vector classification
Apply sparse prior to dual variables:




Gives likelihood:




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   50
Relevance vector classification


Use Laplace approximation result:




giving:




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   51
Relevance vector classification
Previous result:




Second approximation:




To solve, alternately update the hidden variables h and the mean and variance of the Laplace approximation.

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   52
Relevance vector classification
Results:

Most hidden variables increase to large values.

This means the prior over the corresponding dual variable is very tight around zero.

The final solution depends on only a very small number of examples, so it is efficient.
    Computer vision: models, learning and inference. ©2011 Simon J.D. Prince     53
Structure
•   Logistic regression
•   Bayesian logistic regression
•   Non-linear logistic regression
•   Kernelization and Gaussian process classification
•   Incremental fitting, boosting and trees
•   Multi-class classification
•   Random classification trees
•   Non-probabilistic classification
•   Applications

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   54
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   55
Incremental Fitting
Previously wrote:


Now write:




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   56
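The two expressions on this slide did not survive extraction; they are approximately the following (a reconstruction, so verify against the text):

\[
\text{Previously:}\quad a_i = \boldsymbol{\phi}^{T}\mathbf{z}_i
\qquad\qquad
\text{Now:}\quad a_i = \phi_0 + \sum_{k=1}^{K}\phi_k\, f\!\left[\mathbf{x}_i,\,\boldsymbol{\xi}_k\right],
\]

where each f[·, ξk] is a non-linear function with its own parameters ξk.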
Incremental Fitting
KEY IDEA:       Greedily add terms one at a time.
STAGE 1:        Fit φ0, φ1, ξ1



STAGE 2:        Fit φ0, φ2, ξ2



STAGE K:        Fit φ0, φK, ξK



            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   57
Incremental Fitting




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   58
Derivative
It is worth considering the form of the derivative in the
context of the incremental fitting procedure




                       Actual label                                   Predicted Label

Points contribute more to the derivative if they are still misclassified: the later classifiers become increasingly specialized to the difficult examples.

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince     59
Boosting
Incremental fitting with step functions




Each step function is called a “weak classifier”

We can’t take the derivative with respect to the step-function parameters, so we have to use exhaustive search


          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   60
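A deliberately simplified sketch of one boosting round with axis-aligned step functions as weak classifiers; the exhaustive search over dimension and threshold and the inner gradient-ascent fit are illustrative choices, not the book's exact logitboost-style algorithm.

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def log_likelihood(a, w):
    lam = sigmoid(a)
    return np.sum(w * np.log(lam + 1e-12) + (1 - w) * np.log(1 - lam + 1e-12))

def boosting_round(X, w, a_prev, n_thresholds=20, lr=0.5, n_steps=100):
    """One round of incremental fitting with axis-aligned step functions.

    Weak classifier: f(x) = 1 if x_d > tau else 0.  We cannot differentiate
    with respect to (d, tau), so we search them exhaustively; for each
    candidate we fit the offset phi_0 and weight phi_k by gradient ascent,
    keeping the previously accumulated activation a_prev fixed.
    """
    best = (-np.inf, None)
    for d in range(X.shape[1]):
        for tau in np.linspace(X[:, d].min(), X[:, d].max(), n_thresholds):
            f = (X[:, d] > tau).astype(float)
            phi0, phik = 0.0, 0.0
            for _ in range(n_steps):
                r = w - sigmoid(a_prev + phi0 + phik * f)   # residual (w_i - sig[a_i])
                phi0 += lr * np.mean(r)
                phik += lr * np.mean(r * f)
            ll = log_likelihood(a_prev + phi0 + phik * f, w)
            if ll > best[0]:
                best = (ll, (d, tau, phi0, phik))
    d, tau, phi0, phik = best[1]
    return (d, tau, phi0, phik), a_prev + phi0 + phik * (X[:, d] > tau)

Rounds are chained by feeding the returned activation back in, starting from a = np.zeros(len(w)).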
Boosting




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   61
Branching Logistic Regression
A different way to make non-linear classifiers


New activation




The term sig[ωᵀx] is a gating function (see the sketch after this slide).

•   It returns a number between 0 and 1
•   If it is 0, we get one logistic regression model
•   If it is 1, we get a different logistic regression model
           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   62
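The new activation on this slide is missing from the extraction; in the book's notation it has roughly this form, where the gating term sig[ωᵀx] is my reading of the surrounding text and should be checked:

\[
a = \big(1-\mathrm{sig}[\boldsymbol{\omega}^{T}\mathbf{x}]\big)\,\boldsymbol{\phi}_{0}^{T}\mathbf{x}
 \;+\; \mathrm{sig}[\boldsymbol{\omega}^{T}\mathbf{x}]\,\boldsymbol{\phi}_{1}^{T}\mathbf{x}.
\]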
Branching Logistic Regression




   Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   63
Logistic Classification Trees




  Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   64
Structure
•   Logistic regression
•   Bayesian logistic regression
•   Non-linear logistic regression
•   Kernelization and Gaussian process classification
•   Incremental fitting, boosting and trees
•   Multi-class classification
•   Random classification trees
•   Non-probabilistic classification
•   Applications

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   65
Multiclass Logistic Regression
For multiclass recognition, choose distribution over w and
then make the parameters of this a function of x.


Softmax function maps real activations {ak} to numbers
between zero and one that sum to one



Parameters are vectors {φk}



         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   66
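The softmax equation itself did not survive extraction; it is the standard definition:

\[
Pr(w=k\,|\,\mathbf{x}) = \lambda_k = \mathrm{softmax}_{k}\!\left[a_1,\dots,a_K\right]
 = \frac{\exp[a_k]}{\sum_{j=1}^{K}\exp[a_j]},
\qquad a_k = \boldsymbol{\phi}_k^{T}\mathbf{x}.
\]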
Multiclass Logistic Regression
The softmax function maps activations, which can take any value, to the parameters of a categorical distribution, which lie between 0 and 1




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   67
Multiclass Logistic Regression
To learn the model, maximize the log likelihood



There is no closed-form solution; learn with non-linear optimization




where



          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   68
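A minimal sketch of the gradient used in that non-linear optimization (a standard result; array shapes and names are illustrative):

import numpy as np

def softmax(A):
    """Row-wise softmax of activations A (I, K)."""
    A = A - A.max(axis=1, keepdims=True)      # for numerical stability
    E = np.exp(A)
    return E / E.sum(axis=1, keepdims=True)

def multiclass_grad(Phi, X, w):
    """Gradient of the negative log likelihood for multiclass logistic regression.

    Phi : (D, K) one parameter vector phi_k per class (in columns).
    X   : (I, D) data; w : (I,) integer class labels in {0, ..., K-1}.
    d(-L)/d phi_k = sum_i (lambda_ik - [w_i == k]) x_i.
    """
    lam = softmax(X @ Phi)                    # (I, K) predicted class probabilities
    Y = np.zeros_like(lam)
    Y[np.arange(len(w)), w] = 1.0             # one-hot encoding of the labels
    return X.T @ (lam - Y)                    # (D, K) gradient, one column per class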
Structure
•   Logistic regression
•   Bayesian logistic regression
•   Non-linear logistic regression
•   Kernelization and Gaussian process classification
•   Incremental fitting, boosting and trees
•   Multi-class classification
•   Random classification trees
•   Non-probabilistic classification
•   Applications

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   69
Random classification tree
Key idea:
   • Binary tree
   • Randomly chosen function at each split
   • Choose threshold t to maximize log probability




For given threshold, can compute parameters in closed form




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   70
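A minimal sketch of choosing one split in this way; the random linear split function, the candidate-threshold grid, and the function name are illustrative assumptions:

import numpy as np

def best_random_split(X, w, n_classes, n_candidates=10, rng=np.random):
    """Choose a random linear function and the threshold that maximizes
    the log probability of the labels under per-branch categorical models.

    For a fixed split, the categorical parameters of each branch are just the
    class frequencies (closed form), so only the threshold needs searching.
    """
    g = rng.randn(X.shape[1])                  # randomly chosen split function g^T x
    proj = X @ g

    def branch_loglik(labels):
        if len(labels) == 0:
            return 0.0
        counts = np.bincount(labels, minlength=n_classes)
        probs = counts / counts.sum()          # closed-form ML categorical parameters
        return np.sum(counts * np.log(probs + 1e-12))

    best_tau, best_ll = None, -np.inf
    for tau in np.quantile(proj, np.linspace(0.1, 0.9, n_candidates)):
        ll = branch_loglik(w[proj > tau]) + branch_loglik(w[proj <= tau])
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    return g, best_tau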
Random classification tree
Related models:

Fern:
   • A tree where all of the functions at a level are the same
   • Threshold may be same or different
   • Very efficient to implement

Forest
   • Collection of trees
   • Average results to get more robust answer
   • Similar to a ‘Bayesian’ approach: an average of models with different parameters

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   71
Structure
•   Logistic regression
•   Bayesian logistic regression
•   Non-linear logistic regression
•   Kernelization and Gaussian process classification
•   Incremental fitting, boosting and trees
•   Multi-class classification
•   Random classification trees
•   Non-probabilistic classification
•   Applications

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   72
Non-probabilistic classifiers
Most people use non-probabilistic classification methods such as neural networks, AdaBoost, and support vector machines. This is largely for historical reasons.

Probabilistic approaches:
   • No serious disadvantages
   • Naturally produce estimates of uncertainty
   • Easily extensible to multi-class case
   • Easily related to each other




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   73
Non-probabilistic classifiers
Multi-layer perceptron (neural network)
    • Non-linear logistic regression with sigmoid functions
    • Learning is known as backpropagation
    • The transformed variable z is the hidden layer
AdaBoost
   • Very closely related to logitboost
   • Performance is very similar
Support vector machines
    • Similar to relevance vector classification, but the objective function is convex
    • No estimates of certainty
    • Not easily extended to the multi-class case
    • Produces solutions that are less sparse
    • More restrictions on the kernel function


             Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   74
Structure
•   Logistic regression
•   Bayesian logistic regression
•   Non-linear logistic regression
•   Kernelization and Gaussian process classification
•   Incremental fitting, boosting and trees
•   Multi-class classification
•   Random classification trees
•   Non-probabilistic classification
•   Applications

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   75
Gender Classification



Incremental logistic regression



300 arctan basis functions

Results: 87.5% correct (humans: 95%)

           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   76
Fast Face Detection
  (Viola and Jones 2001)




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   77
Computing Haar Features




 Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   78
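The figures for this slide are not reproduced here. As background, Haar-like features are computed efficiently with an integral image, so that the sum over any rectangle needs only four lookups; the following generic sketch is not tied to the slide's specific figure:

import numpy as np

def integral_image(img):
    """Cumulative sum over rows and columns: ii[y, x] = sum of img[:y+1, :x+1]."""
    return img.cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, r0, c0, r1, c1):
    """Sum of pixels in the rectangle [r0, r1] x [c0, c1] (inclusive),
    computed from the integral image with four lookups."""
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]
    if c0 > 0:
        total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

# A two-rectangle Haar feature is then just the difference of two box sums.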
Pedestrian Detection




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   79
Semantic segmentation




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   80
Recovering surface layout




 Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   81
Recovering body pose




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   82
