Why Movie Reviews?
• Natural Language Processing is hot
• There are real world use cases
• It plays to my domain knowledge outside of data science
Natural Language Processing Use Cases
What Are People Saying About Me?
Who Are These People?
Sentiment Analysis
What Are People Saying About Me?
Good Bad
Customer Segmentation
Who Are These People?
Animated Action Comedy Drama
Family Fantasy Horror Musical
Mystery Romance Sci-Fi Thriller
War Western
Sentiment Analysis Testing
Data
• 100,000 movie reviews from IMDB
• Training set of 12,500 positive reviews (7-10 stars)
• Training set of 12,500 negative reviews (<5 stars)
• 30 or fewer reviews per movie
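The labeling rule above can be sketched in a few lines. This is a minimal illustration, assuming ratings are whole numbers from 1 to 10; the function names `label_from_stars` and `cap_reviews_per_movie` and the sample rows are hypothetical and not taken from the original project.

```python
from collections import defaultdict

def label_from_stars(stars):
    """Map a 1-10 star rating to a sentiment label using the slide's thresholds."""
    if stars >= 7:
        return "positive"
    if stars < 5:
        return "negative"
    return None  # 5-6 stars fall outside both labeled sets

def cap_reviews_per_movie(reviews, max_per_movie=30):
    """Keep at most max_per_movie reviews for each movie id, in input order."""
    kept, counts = [], defaultdict(int)
    for movie_id, stars, text in reviews:
        if counts[movie_id] < max_per_movie:
            counts[movie_id] += 1
            kept.append((movie_id, stars, text))
    return kept

# Hypothetical sample rows: (movie id, stars, review text)
sample = [("m1", 9, "Loved it"), ("m1", 2, "Awful"), ("m2", 5, "It was fine")]
labels = [label_from_stars(stars) for _, stars, _ in cap_reviews_per_movie(sample)]
print(labels)
```

Capping reviews per movie keeps a handful of heavily reviewed titles from dominating the training set.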
Methods
• Bag of words (Sklearn TF-IDF)
• Word2Vec (Gensim)
• Doc2Vec (Gensim)
• Pattern
• Indico (by sentence)
• Indico (by document)
• Indico (High Quality Sentiment)
Speed
• How long does it take to train?
◦ Preparing text
◦ Training machine learning model
◦ Some models are pre-trained
• How quickly does it analyze?
◦ Preparing text
◦ Running text through the trained model
How well does it do?
• Accuracy
◦ Did we correctly name positive sentiment as positive?
◦ Did we correctly name negative sentiment as negative?
◦ Better for even class distribution
• F1 = 2 × Precision × Recall / (Precision + Recall)
◦ Precision = percent of things we called positive that were actually positive
◦ Recall = percent of things that were actually positive that we called positive
◦ Better for uneven class distribution
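As a worked example of these metrics, the sketch below computes accuracy, precision, recall, and F1 from scratch. F1 is taken as the harmonic mean of precision and recall, 2PR / (P + R). The function name `binary_scores` and the toy labels are made up for illustration.

```python
def binary_scores(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Uneven classes: 2 positives, 8 negatives; the model finds only one positive.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(binary_scores(y_true, y_pred))
```

With this toy data, accuracy is 0.9 while F1 is about 0.67, which shows why F1 is the more informative score when classes are uneven.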
Bag Of Words (Sklearn TF-IDF)
• Simple algorithm
• Fast to train (10 minutes)
• Fast to apply
• 85.3% accuracy
• 85.3% F1
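To make the bag-of-words idea concrete, here is a small pure-Python sketch of TF-IDF weighting. It is not Sklearn's implementation (Sklearn's vectorizer applies IDF smoothing and L2 normalization by default); it uses the textbook form tf × log(N / df), and the toy documents are invented for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

# Toy documents, not from the IMDB data.
docs = ["great movie great cast", "terrible movie", "great fun"]
for doc_weights in tfidf(docs):
    print(doc_weights)
```

Word order is ignored, and terms that appear in every document get a weight of zero, so the classifier sees only how distinctive each word is for a review.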
Word2Vec (Gensim)
• More complex algorithm
• Computationally intensive
• Better results with larger training sets, multiple epochs
• Slow to train (2 hours)
• Slow to apply
• 81.9% accuracy
• 82.2% F1
Doc2Vec (Gensim)
• More complex algorithm
• Computationally intensive
• Better results with larger training sets, multiple epochs
• Slow to train (4 hours)
• Slow to apply
• 82.8% accuracy
• 82.8% F1
• Best variant: distributed bag of words
• (other Doc2Vec variants scored 70% and 82% accuracy)
Pattern (built-in)
• Simple algorithm
• Part of the Pattern module
• No training required
• Fast to apply
• 76.4% accuracy
• 76.9% F1
• (lowest scores)
Indico (by sentence)
• More complex algorithm
• API calls to proprietary system
• No training required
• Fast to apply
• 89.1% accuracy
• 88.9% F1
Indico (by document)
• Simple algorithm
• API calls to proprietary system
• No training required
• Fast to apply
• 90.1% accuracy
• 90.0% F1
Indico (High Quality Sentiment)
• Simple algorithm
• API calls to proprietary system
• No training required
• Slow to apply
• 93.2% accuracy
• 93.2% F1
• (highest scores)
Comparison of Sentiment Prediction
[Bar chart: Accuracy and F1 on a 0 to 1 scale for Indico Sent_HQ, Indico by doc., Doc2Vec, Pattern, Indico by sent., Word2Vec, and Bag of Words]
Customer Segmentation
Who Are These People?
Animated Action Comedy Drama
Family Fantasy Horror Musical
Mystery Romance Sci-Fi Thriller
War Western
Customer Segmentation Testing
Data
• 129,809 movie reviews from IMDB
• 3,323 different movies
• 510 different combinations of genre
• 40 or fewer reviews per movie
Methodology
◦ Transform each review into an Indico document vector
◦ Test success of different document criteria
◦ Test success of different models
◦ Optimize best model
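The slides do not say how genre combinations were encoded as prediction targets. One common approach, shown here purely as an assumption, is a multi-hot vector with one 0/1 entry per genre. The genre list comes from the slides; the function name `multi_hot` is hypothetical.

```python
GENRES = ["Animated", "Action", "Comedy", "Drama", "Family", "Fantasy", "Horror",
          "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western"]

def multi_hot(movie_genres):
    """Encode a movie's set of genres as a 0/1 vector in GENRES order."""
    unknown = set(movie_genres) - set(GENRES)
    if unknown:
        raise ValueError(f"unknown genres: {sorted(unknown)}")
    return [1 if genre in movie_genres else 0 for genre in GENRES]

print(multi_hot({"Animated", "Family"}))
# Number of possible genre combinations with 14 genres.
print(2 ** len(GENRES))
```

With 14 genres there are 16,384 possible combinations, so the 510 observed combinations cover only a small part of that space.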
What Is Random Chance?
• 510 different genre combinations
• Heavily weighted to negative (most genres do not apply to any given movie)
• F1 random chance < 20%
Animated Action Comedy Drama
Family Fantasy Horror Musical
Mystery Romance Sci-Fi Thriller
War Western
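One way to reason about a random-chance F1 is sketched below. It assumes a guesser that marks each label positive independently of the truth. Under that assumption, precision is approximately the share of labels that are truly positive, and recall is approximately the share the guesser marks positive. The function name and the 0.10 positive rate are placeholders, not figures from the slides.

```python
def random_guess_f1(positive_rate, guess_rate):
    """Approximate F1 of a guesser that ignores the input entirely.

    positive_rate: share of labels that are truly positive
    guess_rate: share of labels the guesser marks positive
    """
    precision = positive_rate  # a random positive guess is right this often
    recall = guess_rate        # this share of true positives gets marked
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Placeholder positive rate, chosen only to illustrate the calculation.
for guess_rate in (0.1, 0.5, 1.0):
    print(guess_rate, random_guess_f1(0.10, guess_rate))
```

When positives are rare, even always guessing positive yields a low F1, which is why a heavily negative label set produces a low random-chance baseline.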
More Reviews or More Words?
Number of Reviews    Word Length    F1 Score
129,809              All Reviews    .54
72,691               200+           .56
19,914               500+           .57
Conclusion: Longer reviews work better
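The length cut-offs can be applied with a simple filter. This sketch assumes "200+" means at least 200 whitespace-separated words; the function name `filter_by_length` and the synthetic reviews are hypothetical.

```python
def filter_by_length(reviews, min_words):
    """Keep reviews with at least min_words whitespace-separated words."""
    return [review for review in reviews if len(review.split()) >= min_words]

# Synthetic reviews of 50, 250, and 600 words.
reviews = [" ".join(["word"] * n) for n in (50, 250, 600)]
for threshold in (0, 200, 500):
    print(threshold, len(filter_by_length(reviews, threshold)))
```

Each higher threshold trades training-set size for more text per review, which is the trade-off the table measures.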
Optimizing Models
Model                                              F1 Score
Tuned Random Forest                                .57
Initial Logistic Regression, Initial Linear SVC    .62
Tuned Logistic Regression, Tuned Linear SVC        .63
Initial Gradient Boost                             .63
Tuned Gradient Boost                               .67
Conclusion: Choosing the right model matters more than tuning
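The arithmetic behind this conclusion can be checked directly from the reported scores. The sketch below only compares the F1 values listed on the slide; the dictionary layout and variable names are illustrative.

```python
# F1 scores as reported on the slide.
f1 = {
    ("Random Forest", "tuned"): 0.57,
    ("Logistic Regression", "initial"): 0.62,
    ("Logistic Regression", "tuned"): 0.63,
    ("Linear SVC", "initial"): 0.62,
    ("Linear SVC", "tuned"): 0.63,
    ("Gradient Boost", "initial"): 0.63,
    ("Gradient Boost", "tuned"): 0.67,
}

# Gain from tuning, for models that have both an initial and a tuned score.
tuning_gain = {
    model: round(f1[(model, "tuned")] - f1[(model, "initial")], 2)
    for (model, stage) in f1
    if stage == "initial"
}

# Spread between the best and worst tuned models.
tuned_scores = [score for (model, stage), score in f1.items() if stage == "tuned"]
model_spread = round(max(tuned_scores) - min(tuned_scores), 2)

print(tuning_gain)
print(model_spread)
```

Tuning adds between .01 and .04 to a given model, while the gap between the best and worst tuned models is .10.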
More From Customer Segmentation
Genre
[Diagram labels: Genre, Review 1, Review 2, Review 3, Feature 1, Feature 2, Feature 3, Genre]
Some Customer Segments Overlap
[Venn diagram: Animated and Family]
• Both: 1,297 reviews
• Animated: 395 reviews
• Family: 1,090 reviews
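The size of this overlap can be quantified from the three counts. The sketch assumes the Animated and Family figures are exclusive counts (reviews tagged with only that genre), which is how a Venn diagram is usually labeled but is not stated on the slide.

```python
# Counts from the slide; "only" readings are an assumption.
both = 1297
animated_only = 395
family_only = 1090

animated_total = both + animated_only
family_total = both + family_only

# Jaccard index: shared reviews divided by all reviews in either segment.
jaccard = both / (both + animated_only + family_only)
share_of_animated = both / animated_total  # Animated reviews that are also Family
share_of_family = both / family_total      # Family reviews that are also Animated

print(round(jaccard, 3), round(share_of_animated, 3), round(share_of_family, 3))
```

Under that reading, most Animated reviews also belong to the Family segment, so the two segments are hard to separate from review text alone.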
Segments Care About Different Things
[Word clouds: Horror, War]
Machine Learning From Movie Reviews
• See the complete set of word clouds at:
◦ Github/JenniferDunne
• Contact:
◦ Jennifer.dunne.co@gmail.com
◦ Linkedin/jenniferdunneco

Machine Learning From Movie Reviews - Long Form
