SlideShare une entreprise Scribd logo
1  sur  22
Binary class and Multi
Class Strategies for
Machine learning
By Vaibhav Arora
Two popular Approaches to algorithm design
contd..
Approach 1 the benefit is that we have much better and scalable algorithms we may come up with
new, elegant, learning algorithms and contribute to basic research in machine learning.With
sufficient time at hand this approach should be followed.
Approach 2 Will often get your algorithm working more quickly[3].it has faster time to
market.This approach really gives us a timely feedback and helps us in utilizing our resources
better.
What can go wrong?
1) High bias(underfit)
2) high variance(overfit)
How to detect it?
There are a no of ways we can do it but i will mention two:-
firstly, If you have a 2d or 3d dataset then you can visualize
the decision boundary and try to figure out what is possibly
wrong with your implementation.
ie if it overly fits the training set but fails on the
validation set then there is an overfitting problem (High
variance).Or if it fails to fit both the training or validation
set then there may be a high bias problem (underfitting).
Learning Curves
Secondly, In case you have more than three dimensions we can’t
visualize them so we use learning curves instead to see what
problem we have.
Def:-
A learning curve is a plot of the training and validation error
as a function of the number of training examples.These are meant
to give us important ideas about the problems associated with
our model.
LEARNING CURVES Procedure
1) let Xtr denote the training examples and Ytr denote the training targets and
let the no of training examples be n.
2) Then do
for i=1:n
train the learning algorithm on the training subset ie (Xtr(1:i),Ytr(1:i))
pairs
calculate train error
train_error(i)=Training set error(on Xtr(1:i),Ytr(1:i))
calculate validation error
valid_error(i)=validation set error(on complete validation set)
note that if you are using regularization then set regularization parameter to 0
while calculating the train_error and valid_error.
3) plot them
Learning Curves (typical plots)
contd..
What we want ?
How can you correct it?
● To Fix high variance
o Get more training examples.
o Get a smaller set of features.
o Tweak algo parameters
● To Fix high Bias
o Get a larger set of features.
o use polynomial features
o Tweak algo parameters
Multi class classification strategies
whenever we have a multi class classification problem there are actually many
strategies that we can follow
1) Use a multi class classifier ie a NNet etc
1) Use an no of binary classifiers to solve the problem at hand
we can combine a no of binary classifiers using these strategies:-
a)one vs all
b)one vs one
c)ECOC
There are advantages and disadvantages of both approaches (you can experiment
with both and try to find out!!) but for the purpose of this presentation we are
only going to discuss the latter.
one vs all
In a one vs all strategy we would divide the data sets such that a hypothesis
separates one class label from all of the rest.
one vs one
In a one vs one strategy we would divide the data sets into pairs (like shown
below) and then do classification.Note that one hypothesis separates only two
classes irrespective of the other classes.
How to select the class of a test
example
Now that we have trained a number of classifiers in one vs one
configuration we may choose a class for a training example as
follows(note that this can be done for both the one vs one or
one vs all, for the examples i have considered there are three
hypothesis for both the cases):
a) Feed the training sample to all the three hypothesis
(classifiers)
b) Select the class(label) for which the hypothesis
function is maximum.
Error correcting output codes
In an ECOC strategy we try to build classifiers which target
classification between different class subsets ie divide the
main classification problem into sub-classification tasks
How do we do that?
1) first use a no of binary classifiers,we make a matrix as
shown(three classifiers C1,C2,C3 and three labels L1,L2 and
L3) ,here one contains positive class and zero contains
negative class.(it is also possible to exclude certain
classes in which case you can use 1,-1,and 0 ,0 indicating
the removed class)
contd..
2) Next when we have a new example we feed it to all the
classifiers and the resultant output can then be compared with
the labels (in terms of a distance measure say hamming[1]) and
the label with the minimum hamming distance is assigned to the
training example(if more than one rows are equidistant then
assign arbitrarily).
3)Eg suppose we have a training example t1 and when we feed it
to the classifiers we get output 001 now its difference with
label 1 is 2, with label 2 is 3 and with label 3 is 1 therefore
label 1 will be assigned to the training example.
contd..
4) something that we have to keep in mind :-
if n is the no of classifiers and m is the no of labels then
n>logm (base 2)
5) there are also a no of ways by which we select the coding
matrix(the classifier label matrix) one can be through the use
of generator matrix.other possible strategies include learning
from data(via genetic algorithms) .For more info check
references.
Personal experience
ok this is all based on personal experience so you may choose to agree or not
1) When working with svms ECOC one vs one seems to work better so try that first
when using svms.
2) With logistic regression try one vs all approach first.
3) When you have a large no of training examples choose to fit logistic
regression as best as possible instead of a support vector machines with
linear kernel because they may work well but you will have a lot of support
vectors (storage).
So with reasonable compromise on accuracy you can do with a lot less
storage.
4) When programming try to implement all these strategies by implementing the
ECOC coding matrix and then you can choose to select the label either by maximum
hypothesis or on the basis of hamming distance.
References
1) https://en.wikipedia.org/wiki/Hamming_distance
1) for detail on learning curves and classification strategies prof Andrew ng’s
machine learning course coursera.
1) Prof Andrew ng course slides at
http://cs229.stanford.edu/materials/ML-advice.pdf
1) for ecoc classification a nice paper ,”Error-Correcting Output Coding for
Text Classification”,by Adam Berger, School of Computer Science Carnegie
Mellon University.can be found at
www.cs.cmu.edu/~aberger/pdf/ecoc.pdf
Thank You..
You can contact us at info@paxcel.net
Your feedback is appreciated.

Contenu connexe

Tendances

daa-unit-3-greedy method
daa-unit-3-greedy methoddaa-unit-3-greedy method
daa-unit-3-greedy method
hodcsencet
 
Curse of dimensionality
Curse of dimensionalityCurse of dimensionality
Curse of dimensionality
Nikhil Sharma
 

Tendances (20)

Types of Machine Learning
Types of Machine LearningTypes of Machine Learning
Types of Machine Learning
 
Feature selection
Feature selectionFeature selection
Feature selection
 
L2. Evaluating Machine Learning Algorithms I
L2. Evaluating Machine Learning Algorithms IL2. Evaluating Machine Learning Algorithms I
L2. Evaluating Machine Learning Algorithms I
 
Back face detection
Back face detectionBack face detection
Back face detection
 
Machine Learning With Logistic Regression
Machine Learning  With Logistic RegressionMachine Learning  With Logistic Regression
Machine Learning With Logistic Regression
 
Applications of Mealy & Moore Machine
Applications of  Mealy  & Moore Machine Applications of  Mealy  & Moore Machine
Applications of Mealy & Moore Machine
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learning
 
daa-unit-3-greedy method
daa-unit-3-greedy methoddaa-unit-3-greedy method
daa-unit-3-greedy method
 
Methods of Optimization in Machine Learning
Methods of Optimization in Machine LearningMethods of Optimization in Machine Learning
Methods of Optimization in Machine Learning
 
Minmax Algorithm In Artificial Intelligence slides
Minmax Algorithm In Artificial Intelligence slidesMinmax Algorithm In Artificial Intelligence slides
Minmax Algorithm In Artificial Intelligence slides
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
 
Back propagation
Back propagationBack propagation
Back propagation
 
Ensemble methods
Ensemble methodsEnsemble methods
Ensemble methods
 
Naive Bayes
Naive BayesNaive Bayes
Naive Bayes
 
Ensemble learning Techniques
Ensemble learning TechniquesEnsemble learning Techniques
Ensemble learning Techniques
 
Curse of dimensionality
Curse of dimensionalityCurse of dimensionality
Curse of dimensionality
 
Presentation on supervised learning
Presentation on supervised learningPresentation on supervised learning
Presentation on supervised learning
 
Module 4: Model Selection and Evaluation
Module 4: Model Selection and EvaluationModule 4: Model Selection and Evaluation
Module 4: Model Selection and Evaluation
 
Machine Learning project presentation
Machine Learning project presentationMachine Learning project presentation
Machine Learning project presentation
 
Bias and variance trade off
Bias and variance trade offBias and variance trade off
Bias and variance trade off
 

En vedette

En vedette (6)

Is Unlabeled Data Suitable for Multiclass SVM-based Web Page Classification?
Is Unlabeled Data Suitable for Multiclass SVM-based Web Page Classification?Is Unlabeled Data Suitable for Multiclass SVM-based Web Page Classification?
Is Unlabeled Data Suitable for Multiclass SVM-based Web Page Classification?
 
Enhancing the performance of Naive Bayesian Classifier using Information Gain...
Enhancing the performance of Naive Bayesian Classifier using Information Gain...Enhancing the performance of Naive Bayesian Classifier using Information Gain...
Enhancing the performance of Naive Bayesian Classifier using Information Gain...
 
Legal Analytics Course - Class 7 - Binary Classification with Decision Tree L...
Legal Analytics Course - Class 7 - Binary Classification with Decision Tree L...Legal Analytics Course - Class 7 - Binary Classification with Decision Tree L...
Legal Analytics Course - Class 7 - Binary Classification with Decision Tree L...
 
Document Classification In PHP
Document Classification In PHPDocument Classification In PHP
Document Classification In PHP
 
Winnow vs perceptron
Winnow vs perceptronWinnow vs perceptron
Winnow vs perceptron
 
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
Law + Complexity & Prediction: Toward a Characterization of Legal Systems as ...
 

Similaire à Binary Class and Multi Class Strategies for Machine Learning

Machine learning (5)
Machine learning (5)Machine learning (5)
Machine learning (5)
NYversity
 
Data Science Interview Questions | Data Science Interview Questions And Answe...
Data Science Interview Questions | Data Science Interview Questions And Answe...Data Science Interview Questions | Data Science Interview Questions And Answe...
Data Science Interview Questions | Data Science Interview Questions And Answe...
Simplilearn
 

Similaire à Binary Class and Multi Class Strategies for Machine Learning (20)

INTRODUCTION TO BOOSTING.ppt
INTRODUCTION TO BOOSTING.pptINTRODUCTION TO BOOSTING.ppt
INTRODUCTION TO BOOSTING.ppt
 
Machine learning (5)
Machine learning (5)Machine learning (5)
Machine learning (5)
 
Essentials of machine learning algorithms
Essentials of machine learning algorithmsEssentials of machine learning algorithms
Essentials of machine learning algorithms
 
Epsrcws08 campbell kbm_01
Epsrcws08 campbell kbm_01Epsrcws08 campbell kbm_01
Epsrcws08 campbell kbm_01
 
Methodological Study Of Opinion Mining And Sentiment Analysis Techniques
Methodological Study Of Opinion Mining And Sentiment Analysis Techniques  Methodological Study Of Opinion Mining And Sentiment Analysis Techniques
Methodological Study Of Opinion Mining And Sentiment Analysis Techniques
 
Understanding Mahout classification documentation
Understanding Mahout  classification documentationUnderstanding Mahout  classification documentation
Understanding Mahout classification documentation
 
Methodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniquesMethodological study of opinion mining and sentiment analysis techniques
Methodological study of opinion mining and sentiment analysis techniques
 
Machine learning interview questions and answers
Machine learning interview questions and answersMachine learning interview questions and answers
Machine learning interview questions and answers
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning Algorithms
 
Top 50 ML Ques & Ans.pdf
Top 50 ML Ques & Ans.pdfTop 50 ML Ques & Ans.pdf
Top 50 ML Ques & Ans.pdf
 
Intro to supervised learning.pptx
Intro to supervised learning.pptxIntro to supervised learning.pptx
Intro to supervised learning.pptx
 
Adapted Branch-and-Bound Algorithm Using SVM With Model Selection
Adapted Branch-and-Bound Algorithm Using SVM With Model SelectionAdapted Branch-and-Bound Algorithm Using SVM With Model Selection
Adapted Branch-and-Bound Algorithm Using SVM With Model Selection
 
Coding and testing In Software Engineering
Coding and testing In Software EngineeringCoding and testing In Software Engineering
Coding and testing In Software Engineering
 
Data Science Interview Questions | Data Science Interview Questions And Answe...
Data Science Interview Questions | Data Science Interview Questions And Answe...Data Science Interview Questions | Data Science Interview Questions And Answe...
Data Science Interview Questions | Data Science Interview Questions And Answe...
 
Unit V.pdf
Unit V.pdfUnit V.pdf
Unit V.pdf
 
MACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOXMACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOX
 
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...
 
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...
 
detailed Presentation on supervised learning
 detailed Presentation on supervised learning detailed Presentation on supervised learning
detailed Presentation on supervised learning
 
Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)
 

Plus de Paxcel Technologies (11)

Async pattern
Async patternAsync pattern
Async pattern
 
Window phone 8 introduction
Window phone 8 introductionWindow phone 8 introduction
Window phone 8 introduction
 
Ssrs 2012(powerview) installation ans configuration
Ssrs 2012(powerview) installation ans configurationSsrs 2012(powerview) installation ans configuration
Ssrs 2012(powerview) installation ans configuration
 
Paxcel Mobile development Portfolio
Paxcel Mobile development PortfolioPaxcel Mobile development Portfolio
Paxcel Mobile development Portfolio
 
Sequence diagrams in UML
Sequence diagrams in UMLSequence diagrams in UML
Sequence diagrams in UML
 
Introduction to UML
Introduction to UMLIntroduction to UML
Introduction to UML
 
Risk Oriented Testing of Web-Based Applications
Risk Oriented Testing of Web-Based ApplicationsRisk Oriented Testing of Web-Based Applications
Risk Oriented Testing of Web-Based Applications
 
Knockout.js explained
Knockout.js explainedKnockout.js explained
Knockout.js explained
 
All about Contactless payments
All about Contactless paymentsAll about Contactless payments
All about Contactless payments
 
Html5 deciphered - designing concepts part 1
Html5 deciphered - designing concepts part 1Html5 deciphered - designing concepts part 1
Html5 deciphered - designing concepts part 1
 
Paxcel Snapshot
Paxcel SnapshotPaxcel Snapshot
Paxcel Snapshot
 

Dernier

Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 

Dernier (20)

Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 

Binary Class and Multi Class Strategies for Machine Learning

  • 1. Binary class and Multi Class Strategies for Machine learning By Vaibhav Arora
  • 2. Two popular Approaches to algorithm design
  • 3. contd.. Approach 1 the benefit is that we have much better and scalable algorithms we may come up with new, elegant, learning algorithms and contribute to basic research in machine learning.With sufficient time at hand this approach should be followed. Approach 2 Will often get your algorithm working more quickly[3].it has faster time to market.This approach really gives us a timely feedback and helps us in utilizing our resources better.
  • 4. What can go wrong? 1) High bias(underfit) 2) high variance(overfit)
  • 5. How to detect it? There are a no of ways we can do it but i will mention two:- firstly, If you have a 2d or 3d dataset then you can visualize the decision boundary and try to figure out what is possibly wrong with your implementation. ie if it overly fits the training set but fails on the validation set then there is an overfitting problem (High variance).Or if it fails to fit both the training or validation set then there may be a high bias problem (underfitting).
  • 6. Learning Curves Secondly, In case you have more than three dimensions we can’t visualize them so we use learning curves instead to see what problem we have. Def:- A learning curve is a plot of the training and validation error as a function of the number of training examples.These are meant to give us important ideas about the problems associated with our model.
  • 7. LEARNING CURVES Procedure 1) let Xtr denote the training examples and Ytr denote the training targets and let the no of training examples be n. 2) Then do for i=1:n train the learning algorithm on the training subset ie (Xtr(1:i),Ytr(1:i)) pairs calculate train error train_error(i)=Training set error(on Xtr(1:i),Ytr(1:i)) calculate validation error valid_error(i)=validation set error(on complete validation set) note that if you are using regularization then set regularization parameter to 0 while calculating the train_error and valid_error. 3) plot them
  • 11. How can you correct it? ● To Fix high variance o Get more training examples. o Get a smaller set of features. o Tweak algo parameters ● To Fix high Bias o Get a larger set of features. o use polynomial features o Tweak algo parameters
  • 12. Multi class classification strategies whenever we have a multi class classification problem there are actually many strategies that we can follow 1) Use a multi class classifier ie a NNet etc 1) Use an no of binary classifiers to solve the problem at hand we can combine a no of binary classifiers using these strategies:- a)one vs all b)one vs one c)ECOC There are advantages and disadvantages of both approaches (you can experiment with both and try to find out!!) but for the purpose of this presentation we are only going to discuss the latter.
  • 13. one vs all In a one vs all strategy we would divide the data sets such that a hypothesis separates one class label from all of the rest.
  • 14. one vs one In a one vs one strategy we would divide the data sets into pairs (like shown below) and then do classification.Note that one hypothesis separates only two classes irrespective of the other classes.
  • 15. How to select the class of a test example Now that we have trained a number of classifiers in one vs one configuration we may choose a class for a training example as follows(note that this can be done for both the one vs one or one vs all, for the examples i have considered there are three hypothesis for both the cases): a) Feed the training sample to all the three hypothesis (classifiers) b) Select the class(label) for which the hypothesis function is maximum.
  • 16. Error correcting output codes In an ECOC strategy we try to build classifiers which target classification between different class subsets ie divide the main classification problem into sub-classification tasks
  • 17. How do we do that? 1) first use a no of binary classifiers,we make a matrix as shown(three classifiers C1,C2,C3 and three labels L1,L2 and L3) ,here one contains positive class and zero contains negative class.(it is also possible to exclude certain classes in which case you can use 1,-1,and 0 ,0 indicating the removed class)
  • 18. contd.. 2) Next when we have a new example we feed it to all the classifiers and the resultant output can then be compared with the labels (in terms of a distance measure say hamming[1]) and the label with the minimum hamming distance is assigned to the training example(if more than one rows are equidistant then assign arbitrarily). 3)Eg suppose we have a training example t1 and when we feed it to the classifiers we get output 001 now its difference with label 1 is 2, with label 2 is 3 and with label 3 is 1 therefore label 1 will be assigned to the training example.
  • 19. contd.. 4) something that we have to keep in mind :- if n is the no of classifiers and m is the no of labels then n>logm (base 2) 5) there are also a no of ways by which we select the coding matrix(the classifier label matrix) one can be through the use of generator matrix.other possible strategies include learning from data(via genetic algorithms) .For more info check references.
  • 20. Personal experience ok this is all based on personal experience so you may choose to agree or not 1) When working with svms ECOC one vs one seems to work better so try that first when using svms. 2) With logistic regression try one vs all approach first. 3) When you have a large no of training examples choose to fit logistic regression as best as possible instead of a support vector machines with linear kernel because they may work well but you will have a lot of support vectors (storage). So with reasonable compromise on accuracy you can do with a lot less storage. 4) When programming try to implement all these strategies by implementing the ECOC coding matrix and then you can choose to select the label either by maximum hypothesis or on the basis of hamming distance.
  • 21. References 1) https://en.wikipedia.org/wiki/Hamming_distance 1) for detail on learning curves and classification strategies prof Andrew ng’s machine learning course coursera. 1) Prof Andrew ng course slides at http://cs229.stanford.edu/materials/ML-advice.pdf 1) for ecoc classification a nice paper ,”Error-Correcting Output Coding for Text Classification”,by Adam Berger, School of Computer Science Carnegie Mellon University.can be found at www.cs.cmu.edu/~aberger/pdf/ecoc.pdf
  • 22. Thank You.. You can contact us at info@paxcel.net Your feedback is appreciated.