SlideShare une entreprise Scribd logo
1  sur  22
Télécharger pour lire hors ligne
Naïve Bayes for the
Superbowl
John Liu
Nashville Machine Learning Meetup
January 27th, 2015
NashML Goals
Create a hub for like-minded people to come together,
share knowledge and collaborate on interesting domains.
Experienced
MLers
ML
Hackers
Topic
Presentations
Collaborative
Practice
Common Platform
Platform
! IPython Notebook (Project Jupyter)
!   Java, Scala, Python (others?)
! Scikit-learn
!   PyLearn2/Theano
! iTorch
!   AWS/Mahout/Spark/Mllib?
Rev. Thomas Bayes
“An Essay towards solving a Problem in the Doctrine of
Chances” published posthumously 1763
I now send you an essay which I have
found among the papers of our
deceased friend Mr. Bayes, and which,
in my opinion, has great merit, and
well deserves to be preserved.
Experimental philosophy, you will
find, is nearly interested in the subject
of it; and on this account there seems to
be particular reason for thinking that
a communication of it to the Royal
Society cannot be improper…
Bayes’ Theorem
P(A | B)P(B) = P(A∩B) = P(B | A)P(A)
P(A | B) =
P(A∩B)
P(B)
P(A | B) =
P(B | A)P(A)
P(B)
Statistical Inference
P(A | B) =
P(B | A)P(A)
P(B)
Likelihood Prior
EvidencePosterior
(What we want to find)
Intuition Behind Bayes
A Priori Initial Belief (model)
Evidence See the Data
Likelihood How likely to see Data given Belief
A Posteriori Updated Belief after seeing Data
posterior =
prior•likelihood
evidence
Example
Blackjack Insurance Bet:
What is the probability that a dealer
with an Ace showing has Blackjack?
Example
Prior P(Dealer has Blackjack) 32/663
Evidence P(Ace showing) 1/13
Likelihood P(Ace showing|has Blackjack) 1/2
Posterior P(Blackjack|Ace showing) 16/51
16/51 = 31% or less than 1/3 of the time.
Are you a Bayesian?
You read Burton Malkiel’s book “A Random Walk down
Wall Street” and believe in the Efficient Market
Hypothesis. Your broker gives you a tip to buy Tesla. You
ignore the broker and Tesla rises 100 days in a row.
As a Bayesian, do you believe:
A)  The stock is long due for a correction
B)  It is possible for Tesla to rise another 100 days in a row
C)  You were fooled by randomness
Naïve Bayes
You observe outcome Y with some n features X1, X2, ..Xn. The
joint density can be expressed using the chain rule:
P(Y, X1, X2, ..Xn) = P(X1, X2, ..Xn|Y) P(Y)
= P(Y) P(X1,Y) P(X2|Y,X1) P(X3|Y,X1,X2)…
This is messy, but simplifies if we naively assume independence,
P(X2|Y,X1) = P(X2|Y)
P(X3|Y,X2,X1) = P(X3|Y)
P(Xn|Y,Xn…X2,X1) = P(Xn|Y)
Naïve Bayes
Assumption
Naïve Bayes Classifier
Let K classes be denoted ck. The (conditional) probability of
class ck given that we observed features x1, x2,..xn is:
A Naïve Bayes classifier simply chooses the class with
highest probability (maximum a posteriori):
P(ck | x1, x2, ..xn ) = P(ck ) P(xi | ck )
i=1
n
∏
cNB = argmax
k∈K
P(ck ) P(xi | ck )
i=1
n
∏
Gaussian Naïve Bayes
When features xi are continuous valued, typically make the
assumption they are normally distributed:
Variance σki can be independent of xi and/or ck.
P(xi | ck ) =
1
2πσki
2
e
−
1
2
xi−µki
σki
"
#
$
%
&
'
2
cNB = argmax
k∈K
P(ck ) P(xi | ck )
i=1
n
∏
Multinomial Naïve Bayes
When features xi are the number of occurrences of n
possible events (words, votes, etc…)
pki = probability of i-th event occuring in class k
xi = frequency of i-th event
The multinomial Naïve Bayes classifier becomes:
cNB = argmax
k∈K
logP(ck )+ xi •log pki
i=1
n
∑
#
$
%
&
'
(
Naïve Bayes Intuition
!   Assumes all features are independent with each other
!   Independence assumption decouples the individual
distributions for each feature
!   Decoupling can overcome curse of dimensionality
!   Performance robust to irrelevant features
!   Very fast, low storage footprint
!   Good performance with multiple equally important
features
Naïve Bayes Applications
!   Document Categorization
!   NLP
!   Email Sorting
!   Collaborative Filtering
!   Sports Prediction
!   Sentiment Analysis
Example: Doc Classification
Want to classify documents into k topics. Document d
consisting of words wi is assigned to the topic cNB:
P(ck) = topic frequency =
P(wi|ck) = word wi frequency in all topic ck docs
cNB = argmax
k∈K
P(ck ) P(wi | ck )
i
∏
Ndocs(topic=ck )
Ndocs
=
Nword=wi (topic=ck )
Nword=wi (topic=ck )
i
∑
Laplace (add-1) Smoothing
What happens with P(wi|ck) = 0 for a particular i, k?
Solution is to add 1 to numerator & denominator:
P(ck | x1, x2, ..xn ) = P(ck ) P(xi | ck )
i=1
n
∏ = 0!
P(wi | ck ) =
Nword=wi (topic=ck ) +1
Nword=wi (topic=ck ) +1( )
i
∑
NBC Application Roadmap
!   Read Dataset
!   Transform Dataset
!   Create Classifier
!   Train Classifier
!   Make Prediction
Sports Prediction
Who is favored to win the Superbowl?
Given the 2014 season game statistics for two teams, how can we
make a prediction on the outcome of the next game using a Naïve
Bayes classifier?
Sentiment Analysis
Which team is most favored in each state?
What if we analyzed tweets by sentiment and location using a
Naïve Bayes Classifier?
Starter Code
Repo with Starter Code at:
https://github.com/guard0g/NaiveBayesForSuperbowl
IPython notebook: NB4Superbowl.ipynb
Datasets: SeattleStats.csv
NewEnglandStats.csv

Contenu connexe

Similaire à Naive Bayes for the Superbowl

Presentation X-SHS - 27 oct 2015 - Topologie et perception
Presentation X-SHS - 27 oct 2015 - Topologie et perceptionPresentation X-SHS - 27 oct 2015 - Topologie et perception
Presentation X-SHS - 27 oct 2015 - Topologie et perceptionMichel Paillet
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream MiningAlbert Bifet
 
SESSION-11 PPT.pptx
SESSION-11 PPT.pptxSESSION-11 PPT.pptx
SESSION-11 PPT.pptxNaniSarath
 
aritficial intellegence
aritficial intellegencearitficial intellegence
aritficial intellegenceAnkit Sharma
 
Resources
ResourcesResources
Resourcesbutest
 
Logics of Context and Modal Type Theories
Logics of Context and Modal Type TheoriesLogics of Context and Modal Type Theories
Logics of Context and Modal Type TheoriesValeria de Paiva
 
lecture15-supervised.ppt
lecture15-supervised.pptlecture15-supervised.ppt
lecture15-supervised.pptIndra Hermawan
 
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked DataDedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked DataVrije Universiteit Amsterdam
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorizationmidi
 
Predicates and Quantifiers
Predicates and Quantifiers Predicates and Quantifiers
Predicates and Quantifiers Istiak Ahmed
 
Predicates and quantifiers
Predicates and quantifiersPredicates and quantifiers
Predicates and quantifiersIstiak Ahmed
 
rediscover mathematics from 0 and 1
rediscover mathematics from 0 and 1rediscover mathematics from 0 and 1
rediscover mathematics from 0 and 1narayana dash
 
Discrete Structure Lecture #5 & 6.pdf
Discrete Structure Lecture #5 & 6.pdfDiscrete Structure Lecture #5 & 6.pdf
Discrete Structure Lecture #5 & 6.pdfMuhammadUmerIhtisham
 
SOFIE - A Unified Approach To Ontology-Based Information Extraction Using Rea...
SOFIE - A Unified Approach To Ontology-Based Information Extraction Using Rea...SOFIE - A Unified Approach To Ontology-Based Information Extraction Using Rea...
SOFIE - A Unified Approach To Ontology-Based Information Extraction Using Rea...Tobias Wunner
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learningkkkc
 

Similaire à Naive Bayes for the Superbowl (20)

Lecture10 - Naïve Bayes
Lecture10 - Naïve BayesLecture10 - Naïve Bayes
Lecture10 - Naïve Bayes
 
Presentation X-SHS - 27 oct 2015 - Topologie et perception
Presentation X-SHS - 27 oct 2015 - Topologie et perceptionPresentation X-SHS - 27 oct 2015 - Topologie et perception
Presentation X-SHS - 27 oct 2015 - Topologie et perception
 
A Short Course in Data Stream Mining
A Short Course in Data Stream MiningA Short Course in Data Stream Mining
A Short Course in Data Stream Mining
 
Inex07
Inex07Inex07
Inex07
 
SESSION-11 PPT.pptx
SESSION-11 PPT.pptxSESSION-11 PPT.pptx
SESSION-11 PPT.pptx
 
aritficial intellegence
aritficial intellegencearitficial intellegence
aritficial intellegence
 
lect14-semantics.ppt
lect14-semantics.pptlect14-semantics.ppt
lect14-semantics.ppt
 
Resources
ResourcesResources
Resources
 
Logics of Context and Modal Type Theories
Logics of Context and Modal Type TheoriesLogics of Context and Modal Type Theories
Logics of Context and Modal Type Theories
 
Bayes 6
Bayes 6Bayes 6
Bayes 6
 
lecture15-supervised.ppt
lecture15-supervised.pptlecture15-supervised.ppt
lecture15-supervised.ppt
 
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked DataDedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
 
20070702 Text Categorization
20070702 Text Categorization20070702 Text Categorization
20070702 Text Categorization
 
Predicates and Quantifiers
Predicates and Quantifiers Predicates and Quantifiers
Predicates and Quantifiers
 
Predicates and quantifiers
Predicates and quantifiersPredicates and quantifiers
Predicates and quantifiers
 
rediscover mathematics from 0 and 1
rediscover mathematics from 0 and 1rediscover mathematics from 0 and 1
rediscover mathematics from 0 and 1
 
Discrete Structure Lecture #5 & 6.pdf
Discrete Structure Lecture #5 & 6.pdfDiscrete Structure Lecture #5 & 6.pdf
Discrete Structure Lecture #5 & 6.pdf
 
SOFIE - A Unified Approach To Ontology-Based Information Extraction Using Rea...
SOFIE - A Unified Approach To Ontology-Based Information Extraction Using Rea...SOFIE - A Unified Approach To Ontology-Based Information Extraction Using Rea...
SOFIE - A Unified Approach To Ontology-Based Information Extraction Using Rea...
 
AI applications in education, Pascal Zoleko, Flexudy
AI applications in education, Pascal Zoleko, FlexudyAI applications in education, Pascal Zoleko, Flexudy
AI applications in education, Pascal Zoleko, Flexudy
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 

Plus de John Liu

Kubeflow and Data Science in Kubernetes
Kubeflow and Data Science in KubernetesKubeflow and Data Science in Kubernetes
Kubeflow and Data Science in KubernetesJohn Liu
 
Artificial Intelligence As a Service
Artificial Intelligence As a ServiceArtificial Intelligence As a Service
Artificial Intelligence As a ServiceJohn Liu
 
Machine Learning with Small Data
Machine Learning with Small DataMachine Learning with Small Data
Machine Learning with Small DataJohn Liu
 
Data Analytics in Computational Law
Data Analytics in Computational LawData Analytics in Computational Law
Data Analytics in Computational LawJohn Liu
 
AI & Machine Learning: Business Transformation
AI & Machine Learning: Business TransformationAI & Machine Learning: Business Transformation
AI & Machine Learning: Business TransformationJohn Liu
 
Social Network Analysis for Healthcare
Social Network Analysis for HealthcareSocial Network Analysis for Healthcare
Social Network Analysis for HealthcareJohn Liu
 
Healthy Competition: How Adversarial Reasoning is Leading the Next Wave of In...
Healthy Competition: How Adversarial Reasoning is Leading the Next Wave of In...Healthy Competition: How Adversarial Reasoning is Leading the Next Wave of In...
Healthy Competition: How Adversarial Reasoning is Leading the Next Wave of In...John Liu
 
I2P and the Dark Web
I2P and the Dark WebI2P and the Dark Web
I2P and the Dark WebJohn Liu
 
Beyond Machine Learning: The New Generation of Learning Algorithms Coming to ...
Beyond Machine Learning: The New Generation of Learning Algorithms Coming to ...Beyond Machine Learning: The New Generation of Learning Algorithms Coming to ...
Beyond Machine Learning: The New Generation of Learning Algorithms Coming to ...John Liu
 
Behavioral Analytics for Financial Intelligence
Behavioral Analytics for Financial IntelligenceBehavioral Analytics for Financial Intelligence
Behavioral Analytics for Financial IntelligenceJohn Liu
 
Neural Networks in the Wild: Handwriting Recognition
Neural Networks in the Wild: Handwriting RecognitionNeural Networks in the Wild: Handwriting Recognition
Neural Networks in the Wild: Handwriting RecognitionJohn Liu
 
Role of Data Science in ERM @ Nashville Analytics Summit Sep 2014
Role of Data Science in ERM @ Nashville Analytics Summit Sep 2014Role of Data Science in ERM @ Nashville Analytics Summit Sep 2014
Role of Data Science in ERM @ Nashville Analytics Summit Sep 2014John Liu
 

Plus de John Liu (13)

Kubeflow and Data Science in Kubernetes
Kubeflow and Data Science in KubernetesKubeflow and Data Science in Kubernetes
Kubeflow and Data Science in Kubernetes
 
Artificial Intelligence As a Service
Artificial Intelligence As a ServiceArtificial Intelligence As a Service
Artificial Intelligence As a Service
 
Machine Learning with Small Data
Machine Learning with Small DataMachine Learning with Small Data
Machine Learning with Small Data
 
Data Analytics in Computational Law
Data Analytics in Computational LawData Analytics in Computational Law
Data Analytics in Computational Law
 
AI & Machine Learning: Business Transformation
AI & Machine Learning: Business TransformationAI & Machine Learning: Business Transformation
AI & Machine Learning: Business Transformation
 
DeepREM
DeepREMDeepREM
DeepREM
 
Social Network Analysis for Healthcare
Social Network Analysis for HealthcareSocial Network Analysis for Healthcare
Social Network Analysis for Healthcare
 
Healthy Competition: How Adversarial Reasoning is Leading the Next Wave of In...
Healthy Competition: How Adversarial Reasoning is Leading the Next Wave of In...Healthy Competition: How Adversarial Reasoning is Leading the Next Wave of In...
Healthy Competition: How Adversarial Reasoning is Leading the Next Wave of In...
 
I2P and the Dark Web
I2P and the Dark WebI2P and the Dark Web
I2P and the Dark Web
 
Beyond Machine Learning: The New Generation of Learning Algorithms Coming to ...
Beyond Machine Learning: The New Generation of Learning Algorithms Coming to ...Beyond Machine Learning: The New Generation of Learning Algorithms Coming to ...
Beyond Machine Learning: The New Generation of Learning Algorithms Coming to ...
 
Behavioral Analytics for Financial Intelligence
Behavioral Analytics for Financial IntelligenceBehavioral Analytics for Financial Intelligence
Behavioral Analytics for Financial Intelligence
 
Neural Networks in the Wild: Handwriting Recognition
Neural Networks in the Wild: Handwriting RecognitionNeural Networks in the Wild: Handwriting Recognition
Neural Networks in the Wild: Handwriting Recognition
 
Role of Data Science in ERM @ Nashville Analytics Summit Sep 2014
Role of Data Science in ERM @ Nashville Analytics Summit Sep 2014Role of Data Science in ERM @ Nashville Analytics Summit Sep 2014
Role of Data Science in ERM @ Nashville Analytics Summit Sep 2014
 

Dernier

VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 

Dernier (20)

VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 

Naive Bayes for the Superbowl

  • 1. Naïve Bayes for the Superbowl John Liu Nashville Machine Learning Meetup January 27th, 2015
  • 2. NashML Goals Create a hub for like-minded people to come together, share knowledge and collaborate on interesting domains. Experienced MLers ML Hackers Topic Presentations Collaborative Practice Common Platform
  • 3. Platform ! IPython Notebook (Project Jupyter) !   Java, Scala, Python (others?) ! Scikit-learn !   PyLearn2/Theano ! iTorch !   AWS/Mahout/Spark/Mllib?
  • 4. Rev. Thomas Bayes “An Essay towards solving a Problem in the Doctrine of Chances” published posthumously 1763 I now send you an essay which I have found among the papers of our deceased friend Mr. Bayes, and which, in my opinion, has great merit, and well deserves to be preserved. Experimental philosophy, you will find, is nearly interested in the subject of it; and on this account there seems to be particular reason for thinking that a communication of it to the Royal Society cannot be improper…
  • 5. Bayes’ Theorem P(A | B)P(B) = P(A∩B) = P(B | A)P(A) P(A | B) = P(A∩B) P(B) P(A | B) = P(B | A)P(A) P(B)
  • 6. Statistical Inference P(A | B) = P(B | A)P(A) P(B) Likelihood Prior EvidencePosterior (What we want to find)
  • 7. Intuition Behind Bayes A Priori Initial Belief (model) Evidence See the Data Likelihood How likely to see Data given Belief A Posteriori Updated Belief after seeing Data posterior = prior•likelihood evidence
  • 8. Example Blackjack Insurance Bet: What is the probability that a dealer with an Ace showing has Blackjack?
  • 9. Example Prior P(Dealer has Blackjack) 32/663 Evidence P(Ace showing) 1/13 Likelihood P(Ace showing|has Blackjack) 1/2 Posterior P(Blackjack|Ace showing) 16/51 16/51 = 31% or less than 1/3 of the time.
  • 10. Are you a Bayesian? You read Burton Malkiel’s book “A Random Walk down Wall Street” and believe in the Efficient Market Hypothesis. Your broker gives you a tip to buy Tesla. You ignore the broker and Tesla rises 100 days in a row. As a Bayesian, do you believe: A)  The stock is long due for a correction B)  It is possible for Tesla to rise another 100 days in a row C)  You were fooled by randomness
  • 11. Naïve Bayes You observe outcome Y with some n features X1, X2, ..Xn. The joint density can be expressed using the chain rule: P(Y, X1, X2, ..Xn) = P(X1, X2, ..Xn|Y) P(Y) = P(Y) P(X1,Y) P(X2|Y,X1) P(X3|Y,X1,X2)… This is messy, but simplifies if we naively assume independence, P(X2|Y,X1) = P(X2|Y) P(X3|Y,X2,X1) = P(X3|Y) P(Xn|Y,Xn…X2,X1) = P(Xn|Y) Naïve Bayes Assumption
  • 12. Naïve Bayes Classifier Let K classes be denoted ck. The (conditional) probability of class ck given that we observed features x1, x2,..xn is: A Naïve Bayes classifier simply chooses the class with highest probability (maximum a posteriori): P(ck | x1, x2, ..xn ) = P(ck ) P(xi | ck ) i=1 n ∏ cNB = argmax k∈K P(ck ) P(xi | ck ) i=1 n ∏
  • 13. Gaussian Naïve Bayes When features xi are continuous valued, typically make the assumption they are normally distributed: Variance σki can be independent of xi and/or ck. P(xi | ck ) = 1 2πσki 2 e − 1 2 xi−µki σki " # $ % & ' 2 cNB = argmax k∈K P(ck ) P(xi | ck ) i=1 n ∏
  • 14. Multinomial Naïve Bayes When features xi are the number of occurrences of n possible events (words, votes, etc…) pki = probability of i-th event occuring in class k xi = frequency of i-th event The multinomial Naïve Bayes classifier becomes: cNB = argmax k∈K logP(ck )+ xi •log pki i=1 n ∑ # $ % & ' (
  • 15. Naïve Bayes Intuition !   Assumes all features are independent with each other !   Independence assumption decouples the individual distributions for each feature !   Decoupling can overcome curse of dimensionality !   Performance robust to irrelevant features !   Very fast, low storage footprint !   Good performance with multiple equally important features
  • 16. Naïve Bayes Applications !   Document Categorization !   NLP !   Email Sorting !   Collaborative Filtering !   Sports Prediction !   Sentiment Analysis
  • 17. Example: Doc Classification Want to classify documents into k topics. Document d consisting of words wi is assigned to the topic cNB: P(ck) = topic frequency = P(wi|ck) = word wi frequency in all topic ck docs cNB = argmax k∈K P(ck ) P(wi | ck ) i ∏ Ndocs(topic=ck ) Ndocs = Nword=wi (topic=ck ) Nword=wi (topic=ck ) i ∑
  • 18. Laplace (add-1) Smoothing What happens with P(wi|ck) = 0 for a particular i, k? Solution is to add 1 to numerator & denominator: P(ck | x1, x2, ..xn ) = P(ck ) P(xi | ck ) i=1 n ∏ = 0! P(wi | ck ) = Nword=wi (topic=ck ) +1 Nword=wi (topic=ck ) +1( ) i ∑
  • 19. NBC Application Roadmap !   Read Dataset !   Transform Dataset !   Create Classifier !   Train Classifier !   Make Prediction
  • 20. Sports Prediction Who is favored to win the Superbowl? Given the 2014 season game statistics for two teams, how can we make a prediction on the outcome of the next game using a Naïve Bayes classifier?
  • 21. Sentiment Analysis Which team is most favored in each state? What if we analyzed tweets by sentiment and location using a Naïve Bayes Classifier?
  • 22. Starter Code Repo with Starter Code at: https://github.com/guard0g/NaiveBayesForSuperbowl IPython notebook: NB4Superbowl.ipynb Datasets: SeattleStats.csv NewEnglandStats.csv