SlideShare une entreprise Scribd logo
1  sur  41
Télécharger pour lire hors ligne
classification edition
Machine Learning in 5 Minutes
Brian Lange
hi, i’m a
data scientist
classification
algorithms
popular examples
-spam filters
-the Sorting Hat
things to know
- you need data labeled with the correct answers to
“train” these algorithms before they work
- feature = dimension = attribute of the data
- class = category = Harry Potter house
linear discriminants
“draw a line through it”
linear discriminants
“draw a line through it”
linear discriminants
“draw a line through it”
linear discriminants
“draw a line through it”
🎉
define what “shitty” means
6 wrong
define what “shitty” means
4 wrong
a map of shittiness
to find the least shitty line
shittiness
slope
intercept
probably don’t use these
linear discriminants:
logistic regression
“divide it with a log function”
logistic regression
“divide it with a log function”
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ gives you probabilities
+ the model is a formula
+ can “threshold” to make model more or less
conservative
💩💩💩💩💩💩💩💩💩💩💩
- only works with linear decision boundaries
SVMs (support vector machines)
“*advanced* draw a line through it”
- better definition of “shitty”
- lines can turn into non-linear
shapes if you transform your
data
💩
💩
“the kernel trick”
🎉
woooooooooooo
🎉🎉
SVMs (support vector machines)
“*advanced* draw a line through it”
SVMs (support vector machines)
“*advanced* draw a line through it”
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
works well on a lot of different shapes of data
thanks to the kernel trick
💩💩💩💩💩💩💩💩💩💩💩
not super easy to explain to people
can only kinda do probabilities
KNN (k-nearest neighbors)
“what do similar cases look like?”
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=1
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=2
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=1
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=2
KNN (k-nearest neighbors)
“what do similar cases look like?”
k=3
KNN (k-nearest neighbors)
“what do similar cases look like?”
KNN (k-nearest neighbors)
“what do similar cases look like?”
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ no training, adding new data is easy
+ you get to define “distance” 

💩💩💩💩💩💩💩💩💩💩💩
- can be outlier-sensitive
- you have to define “distance”
decision tree learners
make a flow chart of it
decision tree learners
make a flow chart of it
x < 3?
yes no
3
decision tree learners
make a flow chart of it
x < 3?
yes no
y < 4?
yes no
3
4
decision tree learners
make a flow chart of it
x < 3?
yes no
y < 4?
yes no
x < 5?
yes no
3 5
4
decision tree learners
make a flow chart of it
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
+ fit all kinds of arbitrary shapes
+ output is a clear set of
conditionals

💩💩💩💩💩💩💩💩💩💩💩
- extremely prone to overfitting
- have to rebuild when you get new
data
- no probability estimates
ensemble models
make a bunch of models and combine them
ensemble models
make a bunch of models and combine them
🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉🎉
- don’t overfit as much as their component parts
- Generally don’t require much parameter tweaking
- If data doesn’t change very often, you can make
them semi-online by just adding new trees
- Can provide probabilities
💩💩💩💩💩💩💩💩💩💩💩
- Slower than their component parts (though if
those are fast, it doesn’t matter)

Contenu connexe

Tendances

Decision Tree - C4.5&CART
Decision Tree - C4.5&CARTDecision Tree - C4.5&CART
Decision Tree - C4.5&CART
Xueping Peng
 

Tendances (20)

Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
 
Decision Tree - C4.5&CART
Decision Tree - C4.5&CARTDecision Tree - C4.5&CART
Decision Tree - C4.5&CART
 
Knn Algorithm presentation
Knn Algorithm presentationKnn Algorithm presentation
Knn Algorithm presentation
 
Supervised Machine Learning in R
Supervised  Machine Learning  in RSupervised  Machine Learning  in R
Supervised Machine Learning in R
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
Feature enginnering and selection
Feature enginnering and selectionFeature enginnering and selection
Feature enginnering and selection
 
Data science
Data scienceData science
Data science
 
NLP
NLPNLP
NLP
 
Support vector machine
Support vector machineSupport vector machine
Support vector machine
 
Design and Analysis of Algorithms
Design and Analysis of AlgorithmsDesign and Analysis of Algorithms
Design and Analysis of Algorithms
 
Decision Tree Learning
Decision Tree LearningDecision Tree Learning
Decision Tree Learning
 
K nearest neighbor
K nearest neighborK nearest neighbor
K nearest neighbor
 
2.3 bayesian classification
2.3 bayesian classification2.3 bayesian classification
2.3 bayesian classification
 
Genetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial IntelligenceGenetic Algorithms - Artificial Intelligence
Genetic Algorithms - Artificial Intelligence
 
Big Data - Analytics with R
Big Data - Analytics with RBig Data - Analytics with R
Big Data - Analytics with R
 
Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)
 
Rnn and lstm
Rnn and lstmRnn and lstm
Rnn and lstm
 
Handling Missing Values for Machine Learning.pptx
Handling Missing Values for Machine Learning.pptxHandling Missing Values for Machine Learning.pptx
Handling Missing Values for Machine Learning.pptx
 

Similaire à Machine Learning in 5 Minutes— Classification

Efficient Simplification: The (im)possibilities
Efficient Simplification: The (im)possibilitiesEfficient Simplification: The (im)possibilities
Efficient Simplification: The (im)possibilities
Neeldhara Misra
 
07-Classification.pptx
07-Classification.pptx07-Classification.pptx
07-Classification.pptx
Shree Shree
 

Similaire à Machine Learning in 5 Minutes— Classification (12)

It's Not Magic - Explaining classification algorithms
It's Not Magic - Explaining classification algorithmsIt's Not Magic - Explaining classification algorithms
It's Not Magic - Explaining classification algorithms
 
Mathematical reasoning
Mathematical reasoningMathematical reasoning
Mathematical reasoning
 
A Scala Corrections Library
A Scala Corrections LibraryA Scala Corrections Library
A Scala Corrections Library
 
Barga Data Science lecture 7
Barga Data Science lecture 7Barga Data Science lecture 7
Barga Data Science lecture 7
 
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine Learning
 
Efficient Simplification: The (im)possibilities
Efficient Simplification: The (im)possibilitiesEfficient Simplification: The (im)possibilities
Efficient Simplification: The (im)possibilities
 
Geneticalgorithms 100403002207-phpapp02
Geneticalgorithms 100403002207-phpapp02Geneticalgorithms 100403002207-phpapp02
Geneticalgorithms 100403002207-phpapp02
 
Declarative Syntax Definition - Pretty Printing
Declarative Syntax Definition - Pretty PrintingDeclarative Syntax Definition - Pretty Printing
Declarative Syntax Definition - Pretty Printing
 
07-Classification.pptx
07-Classification.pptx07-Classification.pptx
07-Classification.pptx
 
An Introduction To Python - Variables, Math
An Introduction To Python - Variables, MathAn Introduction To Python - Variables, Math
An Introduction To Python - Variables, Math
 
DutchMLSchool. Logistic Regression, Deepnets, Time Series
DutchMLSchool. Logistic Regression, Deepnets, Time SeriesDutchMLSchool. Logistic Regression, Deepnets, Time Series
DutchMLSchool. Logistic Regression, Deepnets, Time Series
 
What's "For Free" on Craigslist?
What's "For Free" on Craigslist? What's "For Free" on Craigslist?
What's "For Free" on Craigslist?
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Dernier (20)

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 

Machine Learning in 5 Minutes— Classification