DATA SCIENCE POP UP AUSTIN
Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center
Jordana Heller
Data Scientist, Mattersight
@jheller
#datapopupaustin
April 13, 2016
Galvanize, Austin Campus
Lightning Talk: Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center
Jordana Heller @jheller
Data Science Pop-up Austin, April 13, 2016
©2016 Mattersight Corporation. Mattersight Restricted Confidential Information.
What We Do
Our goal: Topic Trends
Identifying contents and prevalence of multiword topics present in conversation in an unsupervised way
[Figure: timeline from 3/31/2016 to 7/31/2016, with callouts: Unexpected Prevalence, Critical Spikes, Escalating Frequency]
Our goals, continued
Manageable number of topics
Track expected and unexpected topics
Go deep: Contextualize topic usage
Short text: Keywords, hashtags, ngrams
Long text: Could use predetermined topics
Image credit: IBM Watson Concept Insights
Long text: Or discover themes
Image credit: Blei, 2012, Communications of the ACM
Latent Dirichlet Allocation (LDA) (Blei et al., 2003)
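As a rough illustration of what LDA estimates, here is a minimal sketch in plain Python. It uses collapsed Gibbs sampling, a common alternative to the variational inference in the original paper; the talk does not say which inference method or library was used. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy calls are all assumptions made for illustration.

```python
import random

def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w2i = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    # Count tables: document-topic, topic-word, and tokens per topic.
    n_dk = [[0] * n_topics for _ in docs]
    n_kw = [[0] * V for _ in range(n_topics)]
    n_k = [0] * n_topics
    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w2i[w]] += 1
            n_k[k] += 1
        z.append(z_d)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, k = w2i[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][wi] -= 1
                n_k[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][wi] + beta) / (n_k[t] + V * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][wi] += 1
                n_k[k] += 1
    # Smoothed point estimates: topic mix per call, word distribution per topic.
    theta = [[(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
             for row, doc in zip(n_dk, docs)]
    phi = [[(c + beta) / (n_k[k] + V * beta) for c in n_kw[k]]
           for k in range(n_topics)]
    return vocab, theta, phi

# Toy "calls" (hypothetical data), already tokenized.
calls = [
    ["beach", "ocean", "view", "balcony"],
    ["ocean", "beach", "balcony", "view", "beach"],
    ["confirm", "email", "arrival"],
    ["email", "confirm", "arrival", "email"],
]
vocab, theta, phi = lda_gibbs(calls, n_topics=2)
```

Each row of `theta` is one call's mix of topics, and each row of `phi` is one topic's distribution over the vocabulary. Those two tables are what the later pipeline steps inspect and compare.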
Great! How about contextualizing trends?
• Where are topics trending?
• Structural Topic Modeling (Roberts et al., 2013)
– Instead of relying on post-hoc comparisons, includes covariates in LDA model
• Specifies priors as GLMs
• Word distribution determined by topic, covariates, topic-covariate interaction
– Authors’ implementation: R package stm (available via CRAN; all code on GitHub!)
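One way to read "priors as GLMs": the expected topic mix for a call is a softmax of a linear function of its covariates, and each topic's word distribution is a softmax of a baseline plus topic, covariate, and interaction terms. The sketch below shows only those two mappings in plain Python, with made-up coefficients. It is not the stm package's estimation procedure, and it leaves out parts of the full model such as the covariance between topics. All function names and numbers are hypothetical.

```python
import math

def softmax(scores):
    """Map real-valued scores to probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def expected_prevalence(x, gamma):
    """Topic proportions at the prior mean for one call.

    x: covariate vector for the call (first entry 1.0 for the intercept).
    gamma: one coefficient vector per topic (hypothetical values).
    """
    eta = [sum(xi * gi for xi, gi in zip(x, g)) for g in gamma]
    return softmax(eta)

def word_distribution(baseline, topic_dev, covariate_dev, interaction_dev):
    """Word probabilities for one topic at one covariate level.

    Each argument is a list of log-scale terms over the vocabulary:
    a corpus-wide baseline plus deviations for the topic, the covariate
    level, and the topic-covariate interaction.
    """
    scores = [b + t + c + i for b, t, c, i in
              zip(baseline, topic_dev, covariate_dev, interaction_dev)]
    return softmax(scores)

# Two topics, one binary covariate (booking call or not); made-up coefficients.
gamma = [[0.0, 0.0], [-1.0, 1.5]]
booking_mix = expected_prevalence([1.0, 1.0], gamma)
other_mix = expected_prevalence([1.0, 0.0], gamma)

# Three-word vocabulary; made-up log-scale terms.
words = word_distribution([0.0, 0.0, 0.0], [1.0, 0.0, -1.0],
                          [0.0, 0.5, 0.0], [0.0, 0.0, 0.0])
```

With these coefficients the second topic gets a larger expected share on booking calls than on other calls, which is the kind of covariate effect the later slides report.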
Ready to talk pipeline!
Data Collection and Preprocessing
Read transcripts
Add call-level covariates
Preprocess text
• Collocations
• Remove stop words
• Stem/completion
• Remove low-frequency terms
Create term-document matrix
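The preprocessing steps above can be sketched in plain Python. This is a toy version under stated assumptions: the stop-word list, the suffix-stripping "stemmer", the collocation list, and the document-frequency cutoff are all placeholders, and stem completion is not shown. The talk does not say which tools were actually used.

```python
from collections import Counter

# Illustrative stop-word list only; a real one would be much longer.
STOP_WORDS = {"the", "a", "to", "i", "my", "is", "for", "and", "of"}

def join_collocations(tokens, collocations):
    """Merge adjacent token pairs listed in `collocations` into one term."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in collocations:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

def naive_stem(token):
    """Placeholder stemmer: strips a few English suffixes. Not a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(transcripts, collocations, min_doc_freq=2):
    """Return (vocab, matrix) where matrix[term][doc] is a raw count."""
    docs = []
    for text in transcripts:
        tokens = join_collocations(text.lower().split(), collocations)
        docs.append([naive_stem(t) for t in tokens if t not in STOP_WORDS])
    # Drop terms that appear in fewer than min_doc_freq transcripts.
    doc_freq = Counter(t for doc in docs for t in set(doc))
    vocab = sorted(t for t, n in doc_freq.items() if n >= min_doc_freq)
    index = {t: j for j, t in enumerate(vocab)}
    # Term-document matrix: one row per term, one column per transcript.
    matrix = [[0] * len(docs) for _ in vocab]
    for d, doc in enumerate(docs):
        for t in doc:
            if t in index:
                matrix[index[t]][d] += 1
    return vocab, matrix

transcripts = [
    "I need a room near the convention center",
    "Is the convention center hotel booked",
    "booking a room for my family",
]
vocab, matrix = preprocess(transcripts, {("convention", "center")})
```

Joining collocations before the other steps keeps a multiword term such as convention_center together as a single column of the matrix.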
Topic Model Creation
Retrieve last topic model
• For comparison
Create current topic model
• Detect number of topics, or specify
Create topic labels
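For the labeling step, one simple option is to label each topic with its most probable words. The slides do not say how labels were produced, so treat this as an assumption; the function name and the toy probabilities are hypothetical.

```python
def topic_labels(phi, vocab, n_words=5):
    """Label each topic with its n_words most probable terms."""
    labels = []
    for word_probs in phi:
        ranked = sorted(range(len(vocab)), key=lambda j: word_probs[j], reverse=True)
        labels.append(", ".join(vocab[j] for j in ranked[:n_words]))
    return labels

# Toy topic-word probabilities (made up), one row per topic.
vocab = ["beach", "balcony", "ocean", "email", "confirm"]
phi = [
    [0.40, 0.30, 0.25, 0.03, 0.02],
    [0.05, 0.05, 0.10, 0.35, 0.45],
]
labels = topic_labels(phi, vocab, n_words=3)
```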
Topic Model Comparison
Inspect overall topic prevalence
Compare overall topic prevalence across periods
• Topics change!
Measure change in word probability distributions for each new topic with respect to each old topic
• Match new to closest previous match below change threshold (otherwise new topic)
• Evaluate trends!
Estimate and inspect effects of covariates
Compare effects of covariates across periods
• Output can be interpreted similarly to regression
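The matching step can be sketched as follows. The slides do not name the measure of change between word probability distributions; Jensen-Shannon divergence is used here as one reasonable choice, and the threshold of 0.1 is arbitrary. Both topic models are assumed to share the same vocabulary order.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two distributions."""
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def match_topics(new_topics, old_topics, threshold=0.1):
    """Map each new topic index to its closest old topic index, or None if new."""
    matches = {}
    for i, new in enumerate(new_topics):
        distances = [js_divergence(new, old) for old in old_topics]
        best = min(range(len(old_topics)), key=lambda j: distances[j])
        matches[i] = best if distances[best] < threshold else None
    return matches

# Toy word distributions over a four-word vocabulary (made up).
old_topics = [[0.7, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.7]]
new_topics = [[0.6, 0.3, 0.1, 0.0], [0.25, 0.25, 0.25, 0.25]]
matches = match_topics(new_topics, old_topics)
```

Here the first new topic is close to the first old topic and is matched to it, while the second is not close enough to either old topic and is treated as new.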
Example results: Hotel reservations
Covariates: booking, caller distress
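To make "interpreted similarly to regression" concrete: for a single 0/1 covariate such as booking, regressing a topic's share of each call on the covariate gives an intercept equal to the mean share on calls without the covariate and a slope equal to the difference in means. The sketch below computes exactly that and compares two periods. The stm package has its own routine for this (estimateEffect); this sketch is not that routine, and the numbers are made up.

```python
from statistics import mean

def covariate_effect(topic_share, has_covariate):
    """Intercept and slope from regressing topic share on a 0/1 covariate.

    Both groups (covariate present, covariate absent) must be non-empty.
    """
    with_cov = [s for s, c in zip(topic_share, has_covariate) if c == 1]
    without = [s for s, c in zip(topic_share, has_covariate) if c == 0]
    intercept = mean(without)
    slope = mean(with_cov) - intercept
    return intercept, slope

# Hypothetical per-call topic shares for two periods; 1 = booking call.
march = covariate_effect([0.10, 0.30, 0.20, 0.40], [0, 1, 0, 1])
april = covariate_effect([0.10, 0.50, 0.10, 0.50], [0, 1, 0, 1])
change_in_effect = april[1] - march[1]
```

A positive `change_in_effect` means the topic became more strongly associated with booking calls from one period to the next.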
Trend Contextualization: Booking
convention, center, mind, worry, philadelphia, inventory
Chart legend: New, Decreasing, Increasing
Hit: > 1% of words on call assigned to a given topic
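The hit definition can be written down directly. A minimal sketch, assuming each call is available as a list of per-word topic assignments (for example, from the fitted model); the function names and the toy calls are hypothetical.

```python
def topic_hits(token_topics, n_topics, min_share=0.01):
    """Topics that account for more than min_share of a call's words."""
    counts = [0] * n_topics
    for k in token_topics:
        counts[k] += 1
    total = len(token_topics)
    return {k for k, c in enumerate(counts) if total and c / total > min_share}

def hit_rate(calls, topic, n_topics):
    """Share of calls on which `topic` registers as a hit."""
    hits = sum(topic in topic_hits(call, n_topics) for call in calls)
    return hits / len(calls)

# Two toy calls of 200 words each: topic 1 covers 1.5% and 0.5% of the words.
call_a = [0] * 197 + [1] * 3
call_b = [0] * 199 + [1] * 1
rate = hit_rate([call_a, call_b], topic=1, n_topics=2)
```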
Trend Contextualization: Booking
school, college, graduate, medical, clinic
Trend Contextualization: Booking
30% beach, balcony, ocean, view
Trend Contextualization: Booking
10% back, next, receive, listen, cash future
Trend Contextualization: Booking
back, minute, system, run, inconvenience
Trend Contextualization: Booking
42% confirm, email, arrival, local
Trend Contextualization: Caller Distress
Chart legend: New, Decreasing, Increasing
Distress: > 30 seconds of linguistically identified dissatisfaction or negative emotion
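The distress flag is a threshold on flagged speech time. A minimal sketch, assuming each call comes as non-overlapping segments already marked by an upstream language model as dissatisfied or not; the segment format and function name are hypothetical.

```python
def is_distressed(segments, min_seconds=30.0):
    """segments: (start_sec, end_sec, flagged) tuples for one call."""
    flagged_time = sum(end - start for start, end, flagged in segments if flagged)
    return flagged_time > min_seconds

# Toy call: two flagged stretches of 17.5 and 17.0 seconds.
call = [(0.0, 12.5, False), (12.5, 30.0, True), (30.0, 41.0, False), (41.0, 58.0, True)]
distressed = is_distressed(call)
```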
Trend Contextualization: Caller Distress
square, city, price, hotel, manhattan, central
Trend Contextualization: Caller Distress
12% online, website, cancel, purchase, advance
Nice!
Our goals, revisited
Manageable number of topics
Track expected and unexpected topics
Go deep: Contextualize topic usage
Topic trends using structural topic models
Thank you!
@datapopup
#datapopupaustin

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 

Dernier (20)

FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 

Data Science Popup Austin: Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center

  • 1. DATA SCIENCE POP UP AUSTIN Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center Jordana Heller Data Scientist, Mattersight jheller
  • 4. Lightning Talk: Using LDA and Structural Topic Modeling to Explore Trending Topics in a Call Center Jordana Heller @jheller Data Science Pop-up Austin, April 13, 2016
  • 5. ©2016 Mattersight Corporation. Mattersight Restricted Confidential Information. What We Do
  • 6. Our goal: Topic Trends. Identifying contents and prevalence of multiword topics present in conversation in an unsupervised way. [Timeline chart, 3/31/2016 to 7/31/2016, highlighting: Unexpected Prevalence; Critical Spikes; Escalating Frequency]
  • 7. Our goals, continued: Manageable number of topics; Track expected and unexpected topics; Go deep: Contextualize topic usage
  • 8. Short text: Keywords, hashtags, ngrams
  • 9. Long text: Could use predetermined topics (Image credit: IBM Watson Concept Insights)
  • 10. Long text: Or discover themes. Latent Dirichlet Allocation (LDA) (Blei et al., 2003) (Image credit: Blei, 2012, Communications of the ACM)
  • 11. Great! How about contextualizing trends? • Where are topics trending? • Structural Topic Modeling (Roberts et al., 2013) – Instead of relying on post-hoc comparisons, includes covariates in LDA model • Specifies priors as GLMs • Word distribution determined by topic, covariates, topic-covariate interaction – Authors’ implementation: R package stm (available via CRAN; all code on GitHub!)
  • 12. Ready to talk pipeline!
  • 13. Data Collection and Preprocessing: Read transcripts; Add call-level covariates; Preprocess text (collocations, remove stop words, stem/completion, remove low-frequency terms); Create term-document matrix
  • 14. Topic Model Creation: Retrieve last topic model (for comparison); Create current topic model (detect number of topics, or specify); Create topic labels
  • 15. Topic Model Comparison: Inspect overall topic prevalence; Compare overall topic prevalence across periods. Topics change! Measure change in word probability distributions for each new topic with respect to each old topic; Match new to closest previous match below change threshold (otherwise new topic). Evaluate trends! Estimate and inspect effects of covariates; Compare effects of covariates across periods (output can be interpreted similarly to regression)
  • 16. Example results: Hotel reservations Covariates: booking, caller distress
  • 17. Trend Contextualization: Booking. convention, center, mind, worry, philadelphia, inventory. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 18. Trend Contextualization: Booking. school, college, graduate, medical, clinic. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 19. Trend Contextualization: Booking. 30%. beach, balcony, ocean, view. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 20. Trend Contextualization: Booking. 10%. back, next, receive, listen, cash future. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 21. Trend Contextualization: Booking. back, minute, system, run, inconvenience. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 22. Trend Contextualization: Booking. 42%. confirm, email, arrival, local. Legend: New, Decreasing, Increasing. Hit: > 1% of words on call assigned to a given topic
  • 23. Trend Contextualization: Caller Distress. Legend: New, Decreasing, Increasing. Distress: > 30 seconds of linguistically-identified dissatisfaction or negative emotion
  • 24. Trend Contextualization: Caller Distress. square, city, price, hotel, manhattan, central. Legend: New, Decreasing, Increasing. Distress: > 30 seconds of linguistically-identified dissatisfaction or negative emotion
  • 25. Trend Contextualization: Caller Distress. 12%. online, website, cancel, purchase, advance. Legend: New, Decreasing, Increasing. Distress: > 30 seconds of linguistically-identified dissatisfaction or negative emotion
  • 26. Nice!
  • 27. Our goals, revisited: Manageable number of topics; Track expected and unexpected topics; Go deep: Contextualize topic usage
  • 28. Topic trends using structural topic models. Thank you!
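
The topic-matching step from the Topic Model Comparison slide (measure the change in word probability distributions, then match each new topic to its closest previous topic if the change is below a threshold, otherwise treat it as new) could be sketched roughly as follows. This is a minimal illustration and not the pipeline presented in the talk: the slides do not name the distance measure, so Hellinger distance is an assumption here, and the function names, the 0.5 threshold, and the toy distributions are all hypothetical.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two word distributions (dicts: word -> probability).

    Assumed metric: the slides only say "change in word probability distributions".
    Returns 0.0 for identical distributions and 1.0 for distributions with no shared words.
    """
    vocab = set(p) | set(q)
    total = sum(
        (math.sqrt(p.get(w, 0.0)) - math.sqrt(q.get(w, 0.0))) ** 2 for w in vocab
    )
    return math.sqrt(total / 2.0)


def match_topics(new_topics, old_topics, threshold=0.5):
    """Map each new topic to its closest old topic, or to None if it counts as new.

    `threshold` is a hypothetical change threshold; a new topic whose closest
    old topic is not below it is treated as a genuinely new topic.
    """
    matches = {}
    for new_label, new_dist in new_topics.items():
        best_label, best_distance = None, float("inf")
        for old_label, old_dist in old_topics.items():
            distance = hellinger(new_dist, old_dist)
            if distance < best_distance:
                best_label, best_distance = old_label, distance
        matches[new_label] = best_label if best_distance < threshold else None
    return matches


# Toy word distributions (hypothetical values, loosely echoing the example topics).
old_topics = {
    "beach": {"beach": 0.5, "ocean": 0.3, "view": 0.2},
    "email": {"confirm": 0.6, "email": 0.4},
}
new_topics = {
    "t1": {"beach": 0.45, "ocean": 0.35, "view": 0.1, "balcony": 0.1},
    "t2": {"convention": 0.7, "center": 0.3},
}

matches = match_topics(new_topics, old_topics)
print(matches)
```

In this toy data, `t1` shares most of its probability mass with the old `beach` topic and is matched to it, while `t2` shares no words with any old topic and is reported as new (`None`). In practice the threshold would need tuning against real topic models, and a symmetric measure such as Jensen-Shannon divergence could be swapped in for the assumed Hellinger distance.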