4BA10/CS7008 Tutorial – SVM
Darren Caulfield
2 March 2009

Support vector machines
        http://en.wikipedia.org/wiki/Support_vector_machine
A support vector machine (SVM) is a type of classifier that became popular in the
early 1990s. A classifier takes a feature vector (a vector of numbers) and assigns a
class (a label) to the vector. The number of elements in the feature vector corresponds
to its dimensionality. When a classifier is “trained” to learn the class associated with
different feature vectors (as with SVMs), we have supervised classification.

Maximum-margin hyperplane




During the training stage, SVMs find the maximum-margin hyperplane between two
classes. This is the line (in two dimensions), plane (in three dimensions) or
hyperplane (in higher dimensions) that maximises the distance to the nearest data
point. Such hyperplanes generally lead to classifiers with good generalisation ability.
They are less likely to overfit the training data, i.e. the classifier should do
approximately as well, in terms of classification accuracy, with unseen data (the “test
set”) as it does with the “training set”. Cross-validation is another technique used to
reduce the chances of overfitting.

The vectors (data points) that are closest to the hyperplane (circled in the above
image) are called the support vectors. The other points do not influence the position
of this decision boundary.
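The idea can be checked numerically. The sketch below is only an illustration: the six 2-D points and the separating line x1 + x2 - 4 = 0 are hypothetical values picked by hand (nothing is learned here). It measures the distance from each point to the line, takes the smallest distance as the margin, and reports the points that lie exactly at that distance as the support vectors.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

# Hypothetical toy data: (feature vector, class label).
points = [((1.0, 1.0), -1), ((0.0, 2.0), -1), ((0.0, 0.0), -1),
          ((3.0, 3.0), +1), ((4.0, 2.0), +1), ((5.0, 5.0), +1)]

# Hand-picked separating line x1 + x2 - 4 = 0 (an assumption, not a trained model).
w, b = (1.0, 1.0), -4.0

distances = [abs(signed_distance(w, b, x)) for x, _ in points]
margin = min(distances)

# The support vectors are the points that sit exactly on the margin.
support_vectors = [x for (x, _), d in zip(points, distances)
                   if math.isclose(d, margin)]

print("margin:", margin)
print("support vectors:", support_vectors)
```

Moving (0, 0) or (5, 5) further away from the line leaves the margin and the support vectors unchanged, which is what "the other points do not influence the decision boundary" means in practice.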

Kernel trick
It is unlikely that a dataset can be well separated by a simple line, plane or hyperplane
in its original feature space. (That would be an example of a linear classifier.) Instead,
the SVM transforms the data into a higher-dimensional feature space and finds the
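maximum-margin hyperplane in that space. This is called the “kernel trick”. It only
requires the specification of a function, the kernel, that returns the inner product (a
measure of similarity) between any two points in the higher-dimensional space, so the
transformed coordinates never have to be computed explicitly.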

The most popular kernels are listed below, with the parameter names that are used by
both LIBSVM and OpenCV. Custom kernels can significantly improve classification
accuracy, however. For example, we could define a string kernel for DNA sequences.

Linear: no mapping is done, linear discrimination (or regression) is done in the
       original feature space. It is the fastest option.
       d(x,y) = x•y == (x,y)
Poly: polynomial kernel:
       d(x,y) = (gamma*(x•y)+coef0)^degree
RBF: radial-basis-function kernel; a good choice in most cases:
       d(x,y) = exp(-gamma*|x-y|^2)
Sigmoid: sigmoid function is used as a kernel:
       d(x,y) = tanh(gamma*(x•y)+coef0)
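
The four kernels listed above can be written out directly. This is a minimal sketch in plain Python, not the LIBSVM or OpenCV code; the function names and the default parameter values are assumptions chosen for illustration, while gamma, coef0 and degree are the parameter names used in the formulas.

```python
import math

def dot(x, y):
    """Dot product x•y of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    return dot(x, y)

def poly_kernel(x, y, gamma=1.0, coef0=0.0, degree=3):
    return (gamma * dot(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    # |x-y|^2 is the squared Euclidean distance between x and y.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)

# Hypothetical example vectors.
x, y = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, y))
print(poly_kernel(x, y, gamma=1.0, coef0=1.0, degree=2))
print(rbf_kernel(x, y, gamma=0.5))
print(sigmoid_kernel(x, y, gamma=0.01))
```

Note that the RBF kernel of a point with itself is always 1 and decays towards 0 as the two points move apart, with gamma controlling how quickly.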


Soft margin SVM
Even with the kernel trick, some datasets are not perfectly separable, either because
the features do not discriminate between the classes well enough or because some
data points have been mis-labelled. “Soft margin” SVMs find hyperplanes that split
the data as cleanly as possible, while allowing some examples to remain on the wrong
side of the hyperplane.
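
As a rough illustration of the trade-off, the sketch below evaluates the standard soft-margin objective for a linear C-SVC: half the squared norm of w plus C times the sum of the hinge losses max(0, 1 - y*(w•x + b)). The data, w and b are hypothetical values chosen by hand, and nothing is optimised; the point is only to show how a point on the wrong side adds a penalty that grows with C (the "cost" parameter).

```python
def soft_margin_objective(w, b, data, C=1.0):
    """0.5*|w|^2 + C * sum of hinge losses max(0, 1 - y*(w.x + b))."""
    regulariser = 0.5 * sum(wi * wi for wi in w)
    total_slack = 0.0
    for x, y in data:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        # Zero for points on the correct side of the margin, positive otherwise.
        total_slack += max(0.0, 1.0 - y * score)
    return regulariser + C * total_slack

# Hypothetical data; the last point is deliberately mis-labelled.
data = [((1.0, 1.0), -1), ((3.0, 3.0), +1), ((4.0, 4.0), -1)]
w, b = (0.5, 0.5), -2.0  # hand-picked hyperplane, not a trained model

print(soft_margin_objective(w, b, data, C=1.0))
print(soft_margin_objective(w, b, data, C=10.0))
```

A small C tolerates the mis-labelled point cheaply, while a large C makes it dominate the objective and pushes the optimiser towards fitting it.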

OpenCV implementation
The Machine Learning library in OpenCV 1.0 implements several types of classifier,
including SVMs. However, very little SVM sample code is available to date. The
documentation can be found here:
       http://opencvlibrary.svn.sourceforge.net/viewvc/opencvlibrary/trunk/opencv/d
       oc/ref/opencvref_ml.htm
The functionality closely mirrors that of the more mature LIBSVM (see below).

Other classifiers to be found in OpenCV include: Bayes Classifier, k Nearest
Neighbours, Decision Trees, Boosting, Random Trees, Expectation-Maximization and
Neural Networks.

Evaluation
Classifiers often have their accuracy evaluated in terms of true positives and false
positives for a given threshold:




or by plotting the true positive rate against the false positive rate while varying some
threshold, which gives a receiver operating characteristic (ROC) curve.
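
The points of such a curve can be computed by sweeping the threshold over the classifier's output scores. The following is a minimal sketch; the scores and labels are made-up values, and the function name is hypothetical.

```python
def roc_points(scores, labels):
    """Return one (false positive rate, true positive rate) pair per threshold."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    points = []
    # Lowering the threshold step by step classifies more examples as positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y != 1)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical classifier scores and true labels (+1 / -1).
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, -1, 1, -1]

for fpr, tpr in roc_points(scores, labels):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```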




The importance of features
Much of the research literature is concerned with the accuracy of various classifiers,
often benchmarked against various standard datasets. It is important to realise that the
best way to “solve” a classification problem (or at least improve the accuracy) is to
find, extract or develop better features. With discriminative features a “basic”
approach, e.g. Naïve Bayes or k Nearest Neighbour, will usually do as well as an
advanced approach. No classifier will ever be accurate with weak features.


Tutorial tasks
Download and unzip LIBSVM and the other associated files:
       https://www.cs.tcd.ie/Darren.Caulfield/vision
Further information: Chih-Chung Chang and Chih-Jen Lin, “LIBSVM: a library for
support vector machines”, 2001. The software is available at
       http://www.csie.ntu.edu.tw/~cjlin/libsvm

svm-toy
Navigate to the “windows” folder and run “svm-toy.exe”. Load the data file
“fourclass_rescaled_for_app.txt”. (It is actually only a two-class dataset, adapted from
the LIBSVM dataset page.)

Here is the LIBSVM parameters guide (compare to the kernels listed above):
       -s svm_type : set type of SVM (default 0)
                  0 -- C-SVC
                  1 -- nu-SVC
                  2 -- one-class SVM
                  3 -- epsilon-SVR
                  4 -- nu-SVR
       -t kernel_type : set type of kernel function (default 2)
                  0 -- linear: u'*v
                  1 -- polynomial: (gamma*u'*v + coef0)^degree
                  2 -- radial basis function: exp(-gamma*|u-v|^2)
                  3 -- sigmoid: tanh(gamma*u'*v + coef0)
       -d degree : set degree in kernel function (default 3)
       -g gamma : set gamma in kernel function (default 1/k)
       -r coef0 : set coef0 in kernel function (default 0)
       -c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
       -n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
       -p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
       -m cachesize : set cache memory size in MB (default 100)
       -e epsilon : set tolerance of termination criterion (default 0.001)
       -h shrinking: whether to use the shrinking heuristics, 0 or 1 (default 1)
       -b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
       -wi weight: set the parameter C of class i to weight*C, for C-SVC (default 1)

       The k in the -g option means the number of attributes in the input data.

       The -v n option randomly splits the data into n parts and calculates the
       cross-validation accuracy (or mean squared error, for regression) on them.
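
The splitting step behind cross-validation can be sketched as follows. This is an illustration of the general idea only, not LIBSVM's actual implementation; the function name and the seed parameter are assumptions. The data is shuffled once and cut into n parts, and each part in turn is held out for testing while the remaining parts are used for training.

```python
import random

def k_fold_indices(n_samples, n_folds, seed=0):
    """Yield (training indices, held-out indices) for each of n_folds parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds parts of near-equal size.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, held_out in enumerate(folds):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, held_out

for train, held_out in k_fold_indices(n_samples=10, n_folds=5):
    print("train on", sorted(train), "test on", sorted(held_out))
```

The cross-validation accuracy is then the fraction of held-out examples classified correctly, accumulated over all n parts.
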
Click “Run” with the default parameters left unchanged and observe the classification
result.




Change the parameters (in the text box at the bottom right). In particular, try changing
the t, c, g, d and r values. Find parameters that leave the two classes well separated.


svm-train and svm-predict
Download and unzip the “a1a” dataset (training and test sets) and put the files in the
“windows” folder of LIBSVM. Open a command prompt in that folder.

       Usage: svm-train [options] training_set_file [model_file]

       Usage: svm-predict [options] test_file model_file output_file

Run the following commands. They train a classifier (on the training set) using an RBF
kernel (the default) and use it for prediction (classification) on the test set:

       svm-train.exe -c 10    a1a.txt          a1a.model

       svm-predict.exe        a1a.t            a1a.model     a1a.output

Change the -c parameter from 0.01 to 10000 (increasing it by a factor of 10 each time)
and study the effect.
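
Typing seven pairs of commands by hand is tedious, so one option is to generate them. The sketch below only prints the command lines for each value of C; the per-run model and output file names (for example "a1a_c0.01.model") are hypothetical, chosen so that the runs do not overwrite each other.

```python
def sweep_commands():
    """Build svm-train / svm-predict command lines for C = 0.01 ... 10000."""
    commands = []
    for exponent in range(-2, 5):
        c = 10.0 ** exponent
        # Hypothetical file names, one model and one output file per C value.
        model = f"a1a_c{c:g}.model"
        output = f"a1a_c{c:g}.output"
        commands.append(f"svm-train.exe -c {c:g} a1a.txt {model}")
        commands.append(f"svm-predict.exe a1a.t {model} {output}")
    return commands

for line in sweep_commands():
    print(line)
```

The printed lines can be pasted into the command prompt or saved to a batch file and run in one go.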

Change the -g (gamma) parameter.

This training set is unbalanced: there are 1210 examples from one class and 395
examples from the other. Try the “-w1 weight” and “-w-1 weight” options to adjust
the penalty for misclassification.
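
One common starting point, offered here as a heuristic rather than a rule, is to weight each class in inverse proportion to its size. The sketch below builds the corresponding options. It assumes that the 395-example class carries the label +1 and the 1210-example class the label -1; check the labels in the data file and swap the counts if that assumption is wrong.

```python
# Assumption: label +1 is the smaller class. Verify against the data file.
counts = {+1: 395, -1: 1210}

def weight_options(class_counts):
    """Weight each class by (size of largest class) / (size of this class)."""
    largest = max(class_counts.values())
    parts = []
    for label, n in sorted(class_counts.items(), reverse=True):
        parts.append(f"-w{label} {largest / n:.2f}")
    return " ".join(parts)

options = weight_options(counts)
print(f"svm-train.exe -c 10 {options} a1a.txt a1a.model")
```

With these weights, a mistake on the smaller class costs about three times as much as a mistake on the larger one.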




See the following page for some 3D results:
       http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/svmtoy3d/examples/



