Churn Prediction: Understanding
your customers and taking action.
@datoinc
#churnPredictionDato
Hi! My name is …
Antoine Atallah
Principal Data Scientist
Dato toolkits team, novice powerlifter, Hawks fan.
Hi! My name is …
Karla Vega
Customer Success Manager
Aerospace engineer, dog trainer, running fan
@vegakp
About Us!
Questions?
• (Now) We love questions. Feel free to interrupt at any time!
• (Later) Email us: antoine@dato.com, vega@dato.com
Webinar!
Extracting Insights from Data
Data Science Workflow
Ingest Transform Model Insight
Log Journey
Lots of data
Insights Profits
Mining Log Data
Logs are everywhere!
Different kinds of logs
• Raw logs
  • Each row contains an individual event for a user at a given time
• Aggregated logs
  • Each row contains the interactions for a user over a period of time
  • For instance, user activity in one-month rollups
  • This is the traditional data output of Business Intelligence infrastructures
• User side-data
  • Information about each user (demographics, etc.)
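The relationship between raw and aggregated logs can be sketched in a few lines. This is only an illustration: the `raw_log` rows, the user names, and the `monthly_rollup` helper are hypothetical and not part of any Dato API.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw log: one (user_id, timestamp) row per individual event.
raw_log = [
    ("alice", datetime(2016, 1, 3, 9, 15)),
    ("alice", datetime(2016, 1, 20, 18, 2)),
    ("alice", datetime(2016, 2, 1, 7, 45)),
    ("bob", datetime(2016, 1, 11, 12, 0)),
]

def monthly_rollup(events):
    """Aggregate raw events into one count per (user, year, month)."""
    counts = Counter()
    for user_id, ts in events:
        counts[(user_id, ts.year, ts.month)] += 1
    return dict(counts)

# One aggregated row per user and month, as in a one-month rollup.
rollup = monthly_rollup(raw_log)
```

Raw logs keep the exact timing of every event, which the rollup discards; that timing is what later makes time-based features possible.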
Logs contain usage patterns
Small Purchase
Large Purchase
Different kinds of usage patterns
Kinds of Patterns
Frequency of visits, purchases, events
Quantity of visits, purchases
Changes in value over time
Change in time between visits, purchases, events
Time since last action or visit
Demographic information (age, gender, …)
Types of items purchased (seasonality, quality)
…
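A few of these patterns can be computed directly from one user's events. The sketch below is an assumption-laden illustration: the `usage_patterns` helper, its field names, and the sample purchases are made up for this example.

```python
from datetime import datetime

def usage_patterns(events, now):
    """Summarize one user's events, given as (timestamp, purchase_amount) pairs."""
    times = sorted(ts for ts, _ in events)
    # Whole days between consecutive events (time between visits).
    gaps = [(later - earlier).days for earlier, later in zip(times, times[1:])]
    return {
        "event_count": len(events),                                  # frequency
        "total_purchased": sum(amount for _, amount in events),      # quantity
        "days_since_last": (now - times[-1]).days if times else None,
        "mean_gap_days": sum(gaps) / len(gaps) if gaps else None,
    }

events = [
    (datetime(2016, 3, 1), 20.0),
    (datetime(2016, 3, 11), 5.0),
    (datetime(2016, 3, 31), 50.0),
]
patterns = usage_patterns(events, now=datetime(2016, 4, 10))
```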
Retaining customers/visitors is important
• Acquiring a new customer is costly compared with retaining an existing one
• Tracking churn gives a pulse on the health of the business
• It can help take preventive actions and act before it’s too late
• It can help create more effective marketing campaigns
What is Churn Prediction?
What is Churn?
• Churn prediction estimates each user’s probability of no longer coming
back (churning)
• It works by observing past user behavior
Churn Prediction
(Figure: daily activity logs for Jan 2015 – April 2016; marker at Apr 2016)
More Precisely
• We define a time boundary at which we want to predict churn
• Any user with no activity in the N days (default 30) after the
boundary is considered to have churned
• The M days (default 60) before the boundary are used to
generate features
• Multiple boundaries can be specified to extract more patterns
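The boundary logic above can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: `split_at_boundary` is a hypothetical helper, and treating the boundary itself as the first instant of the churn window is an assumption.

```python
from datetime import datetime, timedelta

def split_at_boundary(activity_times, boundary, lookback_days=60, churn_days=30):
    """Split one user's activity around a time boundary.

    Returns the events usable for features (the M days before the boundary)
    and the churn label (no activity in the N days after the boundary).
    """
    lookback_start = boundary - timedelta(days=lookback_days)
    churn_end = boundary + timedelta(days=churn_days)
    feature_events = [ts for ts in activity_times if lookback_start <= ts < boundary]
    churned = not any(boundary <= ts < churn_end for ts in activity_times)
    return feature_events, churned

boundary = datetime(2016, 1, 1)

# Active before and after the boundary: not churned.
returning_user = [datetime(2015, 12, 5), datetime(2015, 12, 20), datetime(2016, 1, 10)]
# Last seen a month before the boundary: churned.
lapsed_user = [datetime(2015, 10, 15), datetime(2015, 12, 1)]

returning_features, returning_churned = split_at_boundary(returning_user, boundary)
lapsed_features, lapsed_churned = split_at_boundary(lapsed_user, boundary)
```

Note that the lapsed user's October event falls outside the 60-day lookback, so only one event contributes to that user's features.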
Feature and Label Generation
(Figure: daily activity logs for Jan 2015 – April 2016; marker at Apr 2016)
How to use Churn Prediction
Choosing Time Boundaries
• Time Boundaries are moments in the past that are used to
observe user behavior and generate labels
• The time before the boundary is used to observe patterns
• The time after the boundary is used to generate labels
Boundaries | Meaning
January 1st 2016 | Will use the patterns from before January 1st 2016 to predict User Churn after January 1st 2016
January 1st 2016, December 1st 2015 | Will use the patterns from before January 1st 2016 to predict User Churn after January 1st 2016; will use the patterns from before December 1st 2015 to predict User Churn after December 1st 2015. This will analyze more patterns and build a richer model
Choosing a Churn Period
• The Churn Period corresponds to how far in the future we want to
predict.
• It also means that for training purposes, users who have not been
active for this amount of time will be considered to have churned
Churn Period | Predicts
7 Days | Probability for each user to be leaving next week
30 Days | Probability for each user to be leaving next month
3 Months | Probability for each user to be leaving next quarter
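The choice of Churn Period changes the training label for the very same user. The sketch below illustrates this with a hypothetical `churned_within` helper; approximating 3 months as 90 days is an assumption made for the example.

```python
from datetime import datetime, timedelta

# Churn periods from the table above; "3 months" is approximated as 90 days.
CHURN_PERIODS = {
    "7 days": timedelta(days=7),
    "30 days": timedelta(days=30),
    "3 months": timedelta(days=90),
}

def churned_within(activity_times, boundary, churn_period):
    """True if the user shows no activity in [boundary, boundary + churn_period)."""
    window_end = boundary + churn_period
    return not any(boundary <= ts < window_end for ts in activity_times)

boundary = datetime(2016, 1, 1)
# The user's next visit after the boundary is on January 20th.
activity = [datetime(2015, 12, 20), datetime(2016, 1, 20)]

labels = {
    name: churned_within(activity, boundary, period)
    for name, period in CHURN_PERIODS.items()
}
```

Here the user counts as churned only under the 7-day period, since the next visit falls inside the 30-day and 90-day windows.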
Choosing Lookback Periods
• Lookback Periods are how far in the past we look to extract user
behavior patterns (features)
• Multiple lookback periods can be provided to generate richer
features
Lookback Periods | Features
3 Days | Will use the 3 days before each Time Boundary to extract usage patterns
30 Days | Will use the 30 days before each Time Boundary to extract usage patterns
7 Days, 1 Month | Will use the week and the month before each Time Boundary to extract usage patterns
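Multiple lookback periods simply yield one set of features per window. The sketch below uses event counts as the only feature; `lookback_counts` and the feature names are hypothetical, and a real toolkit would likely extract many more patterns per window.

```python
from datetime import datetime, timedelta

def lookback_counts(activity_times, boundary, lookback_days=(7, 30)):
    """Count one user's events in each lookback window ending at the boundary."""
    features = {}
    for days in lookback_days:
        window_start = boundary - timedelta(days=days)
        features[f"events_last_{days}d"] = sum(
            window_start <= ts < boundary for ts in activity_times
        )
    return features

boundary = datetime(2016, 1, 1)
activity = [
    datetime(2015, 12, 5),
    datetime(2015, 12, 28),
    datetime(2015, 12, 30),
    datetime(2016, 1, 3),  # after the boundary, so never used as a feature
]
features = lookback_counts(activity, boundary)
```

Comparing the short and long windows hints at whether activity is accelerating or fading as the boundary approaches.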
Choosing appropriate parameters
• If we want to predict Churn for this quarter, we might want to set:
• Churn Period to be 3 Months (how far in the future we predict)
• Lookback Periods to be 2, 4, 8, 16 weeks (how far in the past
to extract patterns from)
• Time Boundaries to be January 1st 2016, January 1st 2015,
January 1st 2014
• Notice that we chose the same quarter each year for Time
Boundary
• Choosing past data with the same underlying behavior will
provide more accurate predictions
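Written down as plain values, the quarterly setup above might look like the following. The dictionary and its key names are illustrative assumptions rather than a documented configuration format, and 3 months is approximated as 90 days.

```python
from datetime import date, timedelta

# Hypothetical parameter set for predicting churn over the coming quarter.
quarterly_params = {
    # How far in the future we predict (3 months, approximated as 90 days).
    "churn_period": timedelta(days=90),
    # How far in the past to extract patterns from: 2, 4, 8, 16 weeks.
    "lookback_periods": [timedelta(weeks=w) for w in (2, 4, 8, 16)],
    # The same quarter start in each year, so the underlying behavior matches.
    "time_boundaries": [date(year, 1, 1) for year in (2016, 2015, 2014)],
}
```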
Choosing appropriate parameters
• If we want to predict Churn for this month, we might want to set:
• Churn Period to be 1 Month (how far in the future we predict)
• Lookback Periods to be 7, 14, 30, 60 days (how far in the past
to extract patterns from)
• Time Boundaries to be January 1st 2016, October 1st 2015,
September 1st 2015, August 1st 2015
• In this case, we intentionally skipped over November and
December 2015, since the holiday season may exhibit
very different behavior
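Picking monthly boundaries while skipping unrepresentative months can be automated. The helper below is a hypothetical sketch: `month_start_boundaries` and its `skip_months` parameter are names invented for this example.

```python
from datetime import date

def month_start_boundaries(latest, count, skip_months=(11, 12)):
    """Walk backwards one month at a time from `latest`, keeping `count`
    month starts whose month is not listed in `skip_months`."""
    boundaries = []
    year, month = latest.year, latest.month
    while len(boundaries) < count:
        if month not in skip_months:
            boundaries.append(date(year, month, 1))
        month -= 1
        if month == 0:  # stepped back past January: move to December of the prior year
            year, month = year - 1, 12
    return boundaries

# Four boundaries ending at January 1st 2016, skipping the holiday months.
boundaries = month_start_boundaries(date(2016, 1, 1), count=4)
```

With these inputs the helper reproduces the boundaries listed above: January 1st 2016, then October, September, and August 1st 2015.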
Key Takeaways
• Label generation is extremely simplified (choose a Churn Period)
• Feature generation is extremely simplified (choose Lookback
Periods and Time Boundaries)
• Choose representative time frames to predict churn in the desired
time frame
Interpreting the Results
Output of the model
• The Churn Prediction model returns a probability of churn for
each provided user
Using the Probabilities
(Figure: Number of Users vs. Churn Probability)
• High probability of churn: might be hard to rescue these users
• Mid probability of churn: we should try to rescue these users
• Low probability of churn: send a thank-you note!
Using the Probabilities
• We can target different users, using their probability of Churn as
a guideline
• Different marketing messages can be created based on the
probability of Churn
• The highest-probability users are not always the best to target,
depending on the cost of the action to take to retain them
• Gives a new dimension on the user base
• Can be used to monitor the health of the user population over
time
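One way to act on the probabilities is to segment users and weigh each retention action against its cost. Everything numeric below is an assumption for illustration: the 0.3 and 0.7 thresholds, the customer value, the action cost, and the per-segment success rates are all made up.

```python
def segment(churn_prob, low=0.3, high=0.7):
    """Bucket a churn probability into low / mid / high (thresholds are assumptions)."""
    if churn_prob >= high:
        return "high"
    if churn_prob >= low:
        return "mid"
    return "low"

def expected_net_benefit(churn_prob, customer_value, action_cost, success_rate):
    """Expected gain from a retention action: the value saved when the user
    would have churned and the action works, minus the cost of the action."""
    return churn_prob * success_rate * customer_value - action_cost

# Hypothetical model output: one churn probability per user.
churn_probs = {"alice": 0.9, "bob": 0.5, "carol": 0.1}
segments = {user: segment(p) for user, p in churn_probs.items()}

# Assumed success rates: high-probability users are hard to rescue.
alice_gain = expected_net_benefit(0.9, customer_value=100, action_cost=10, success_rate=0.05)
bob_gain = expected_net_benefit(0.5, customer_value=100, action_cost=10, success_rate=0.4)
```

Under these made-up numbers the mid-probability user is the better target, even though the high-probability user is more likely to churn.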
Demo
Summary
Log Data Mining ≠ Rocket Science
• Define time parameters to identify patterns and generate labels.
• Extract predictions to gain insights about your user population.
• Take action and help grow your healthy business.
Churn Prediction
SELECT questions FROM audience
WHERE difficulty = 'Easy';
Thanks!

Contenu connexe

En vedette

Intelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsIntelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsTuri, Inc.
 
Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Turi, Inc.
 
Eagle from eBay at China Hadoop Summit 2015
Eagle from eBay at China Hadoop Summit 2015Eagle from eBay at China Hadoop Summit 2015
Eagle from eBay at China Hadoop Summit 2015Hao Chen
 
Accenture maximizing-customer-retention
Accenture maximizing-customer-retentionAccenture maximizing-customer-retention
Accenture maximizing-customer-retentionKhellil Khellil
 
Data science in_action
Data science in_actionData science in_action
Data science in_actionJi Li
 
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign ManagementT-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign ManagementVivastream
 
Customer attrition and churn modeling
Customer attrition and churn modelingCustomer attrition and churn modeling
Customer attrition and churn modelingMariya Korsakova
 
Developing Distributed Web Applications, Where does REST fit in?
Developing Distributed Web Applications, Where does REST fit in?Developing Distributed Web Applications, Where does REST fit in?
Developing Distributed Web Applications, Where does REST fit in?Srinath Perera
 
Analytics, KPIs for effective Churn & Loyalty management
Analytics, KPIs for effective Churn & Loyalty managementAnalytics, KPIs for effective Churn & Loyalty management
Analytics, KPIs for effective Churn & Loyalty managementEhtisham Rao
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with RPoo Kuan Hoong
 
MDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A PrimerMDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A PrimerPoo Kuan Hoong
 
Presentation Churn Management
Presentation Churn ManagementPresentation Churn Management
Presentation Churn Managementfarhanmajeed
 
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopHadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopDataWorks Summit
 
Continuum Analytics and Python
Continuum Analytics and PythonContinuum Analytics and Python
Continuum Analytics and PythonTravis Oliphant
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenPoo Kuan Hoong
 
Customer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in TelecomCustomer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in TelecomChris Chen
 
Siddhi: A Second Look at Complex Event Processing Implementations
Siddhi: A Second Look at Complex Event Processing ImplementationsSiddhi: A Second Look at Complex Event Processing Implementations
Siddhi: A Second Look at Complex Event Processing ImplementationsSrinath Perera
 
From Beginners to Experts, Data Wrangling for All
From Beginners to Experts, Data Wrangling for AllFrom Beginners to Experts, Data Wrangling for All
From Beginners to Experts, Data Wrangling for AllDataWorks Summit
 
churn prediction in telecom
churn prediction in telecom churn prediction in telecom
churn prediction in telecom Hong Bui Van
 

En vedette (20)

Intelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning ToolkitsIntelligent Applications with Machine Learning Toolkits
Intelligent Applications with Machine Learning Toolkits
 
Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)Webinar - Know Your Customer - Arya (20160526)
Webinar - Know Your Customer - Arya (20160526)
 
Crystal qube™ presentation tpr
Crystal qube™ presentation tprCrystal qube™ presentation tpr
Crystal qube™ presentation tpr
 
Eagle from eBay at China Hadoop Summit 2015
Eagle from eBay at China Hadoop Summit 2015Eagle from eBay at China Hadoop Summit 2015
Eagle from eBay at China Hadoop Summit 2015
 
Accenture maximizing-customer-retention
Accenture maximizing-customer-retentionAccenture maximizing-customer-retention
Accenture maximizing-customer-retention
 
Data science in_action
Data science in_actionData science in_action
Data science in_action
 
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign ManagementT-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
T-Mobile: Kiss Churn Goodbye with Data-Driven Campaign Management
 
Customer attrition and churn modeling
Customer attrition and churn modelingCustomer attrition and churn modeling
Customer attrition and churn modeling
 
Developing Distributed Web Applications, Where does REST fit in?
Developing Distributed Web Applications, Where does REST fit in?Developing Distributed Web Applications, Where does REST fit in?
Developing Distributed Web Applications, Where does REST fit in?
 
Analytics, KPIs for effective Churn & Loyalty management
Analytics, KPIs for effective Churn & Loyalty managementAnalytics, KPIs for effective Churn & Loyalty management
Analytics, KPIs for effective Churn & Loyalty management
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with R
 
MDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A PrimerMDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A Primer
 
Presentation Churn Management
Presentation Churn ManagementPresentation Churn Management
Presentation Churn Management
 
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopHadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
 
Continuum Analytics and Python
Continuum Analytics and PythonContinuum Analytics and Python
Continuum Analytics and Python
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R Open
 
Customer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in TelecomCustomer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in Telecom
 
Siddhi: A Second Look at Complex Event Processing Implementations
Siddhi: A Second Look at Complex Event Processing ImplementationsSiddhi: A Second Look at Complex Event Processing Implementations
Siddhi: A Second Look at Complex Event Processing Implementations
 
From Beginners to Experts, Data Wrangling for All
From Beginners to Experts, Data Wrangling for AllFrom Beginners to Experts, Data Wrangling for All
From Beginners to Experts, Data Wrangling for All
 
churn prediction in telecom
churn prediction in telecom churn prediction in telecom
churn prediction in telecom
 

Similaire à Webinar - Pattern Mining Log Data - Vega (20160426)

Iasi CodeCamp 20 april 2013 Agile Estimations and Planning - Cornel Fatulescu
Iasi CodeCamp 20 april 2013 Agile Estimations and Planning - Cornel FatulescuIasi CodeCamp 20 april 2013 Agile Estimations and Planning - Cornel Fatulescu
Iasi CodeCamp 20 april 2013 Agile Estimations and Planning - Cornel FatulescuCodecamp Romania
 
Scrum - What is it good for?
Scrum - What is it good for?Scrum - What is it good for?
Scrum - What is it good for?Diana Minnée
 
EN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptxEN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptxMafaldaMoreira18
 
EN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptxEN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptxAdarshMasekar
 
EN What Time Is It_ by Slidesgo_.pptx
EN What Time Is It_ by Slidesgo_.pptxEN What Time Is It_ by Slidesgo_.pptx
EN What Time Is It_ by Slidesgo_.pptxEmmanuelCampo2
 
EN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptxEN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptxNoorJehanArif
 
Save Time and Increase Traffic with Tailwind
Save Time and Increase Traffic with TailwindSave Time and Increase Traffic with Tailwind
Save Time and Increase Traffic with TailwindFLBlogCon
 
How to Perform Churn Analysis for your Mobile Application?
How to Perform Churn Analysis for your Mobile Application?How to Perform Churn Analysis for your Mobile Application?
How to Perform Churn Analysis for your Mobile Application?Tatvic Analytics
 
[Pcamp19] - Prototyping the Pivotal Moments First: Visualizing the Forks in t...
[Pcamp19] - Prototyping the Pivotal Moments First: Visualizing the Forks in t...[Pcamp19] - Prototyping the Pivotal Moments First: Visualizing the Forks in t...
[Pcamp19] - Prototyping the Pivotal Moments First: Visualizing the Forks in t...Product Camp Brasil
 
Agile Estimating & Planning by Amaad Qureshi
Agile Estimating & Planning by Amaad QureshiAgile Estimating & Planning by Amaad Qureshi
Agile Estimating & Planning by Amaad QureshiAmaad Qureshi
 
Software estimation techniques
Software estimation techniquesSoftware estimation techniques
Software estimation techniquesAndré Pitombeira
 
Earthquake shakes twitter users
Earthquake shakes twitter usersEarthquake shakes twitter users
Earthquake shakes twitter usersEshan Mudwel
 
Sensors Aren't Enough
Sensors Aren't EnoughSensors Aren't Enough
Sensors Aren't EnoughC4Media
 
Summarization and opinion detection in product reviews
Summarization and opinion detection in product reviewsSummarization and opinion detection in product reviews
Summarization and opinion detection in product reviewspapanaboinasuman
 
Beyond Story Points - Forecasting with empirical data
Beyond Story Points - Forecasting with empirical dataBeyond Story Points - Forecasting with empirical data
Beyond Story Points - Forecasting with empirical dataMark Barber
 
3 Scrum Patterns to Boost Team Productivity
3 Scrum Patterns to Boost Team Productivity3 Scrum Patterns to Boost Team Productivity
3 Scrum Patterns to Boost Team Productivityardutta
 
Nondeterministic Software for the Rest of Us
Nondeterministic Software for the Rest of UsNondeterministic Software for the Rest of Us
Nondeterministic Software for the Rest of UsTomer Gabel
 

Similaire à Webinar - Pattern Mining Log Data - Vega (20160426) (20)

Agile Scrum Estimation
Agile   Scrum EstimationAgile   Scrum Estimation
Agile Scrum Estimation
 
Iasi CodeCamp 20 april 2013 Agile Estimations and Planning - Cornel Fatulescu
Iasi CodeCamp 20 april 2013 Agile Estimations and Planning - Cornel FatulescuIasi CodeCamp 20 april 2013 Agile Estimations and Planning - Cornel Fatulescu
Iasi CodeCamp 20 april 2013 Agile Estimations and Planning - Cornel Fatulescu
 
Scrum - What is it good for?
Scrum - What is it good for?Scrum - What is it good for?
Scrum - What is it good for?
 
EN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptxEN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptx
 
EN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptxEN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptx
 
EN What Time Is It_ by Slidesgo_.pptx
EN What Time Is It_ by Slidesgo_.pptxEN What Time Is It_ by Slidesgo_.pptx
EN What Time Is It_ by Slidesgo_.pptx
 
EN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptxEN What Time Is It_ by Slidesgo.pptx
EN What Time Is It_ by Slidesgo.pptx
 
Save Time and Increase Traffic with Tailwind
Save Time and Increase Traffic with TailwindSave Time and Increase Traffic with Tailwind
Save Time and Increase Traffic with Tailwind
 
How to Perform Churn Analysis for your Mobile Application?
How to Perform Churn Analysis for your Mobile Application?How to Perform Churn Analysis for your Mobile Application?
How to Perform Churn Analysis for your Mobile Application?
 
[Pcamp19] - Prototyping the Pivotal Moments First: Visualizing the Forks in t...
[Pcamp19] - Prototyping the Pivotal Moments First: Visualizing the Forks in t...[Pcamp19] - Prototyping the Pivotal Moments First: Visualizing the Forks in t...
[Pcamp19] - Prototyping the Pivotal Moments First: Visualizing the Forks in t...
 
Agile Estimating & Planning by Amaad Qureshi
Agile Estimating & Planning by Amaad QureshiAgile Estimating & Planning by Amaad Qureshi
Agile Estimating & Planning by Amaad Qureshi
 
Software estimation techniques
Software estimation techniquesSoftware estimation techniques
Software estimation techniques
 
Earthquake shakes twitter users
Earthquake shakes twitter usersEarthquake shakes twitter users
Earthquake shakes twitter users
 
Estimation
EstimationEstimation
Estimation
 
Sensors Aren't Enough
Sensors Aren't EnoughSensors Aren't Enough
Sensors Aren't Enough
 
Summarization and opinion detection in product reviews
Summarization and opinion detection in product reviewsSummarization and opinion detection in product reviews
Summarization and opinion detection in product reviews
 
Beyond Story Points - Forecasting with empirical data
Beyond Story Points - Forecasting with empirical dataBeyond Story Points - Forecasting with empirical data
Beyond Story Points - Forecasting with empirical data
 
3 Scrum Patterns to Boost Team Productivity
3 Scrum Patterns to Boost Team Productivity3 Scrum Patterns to Boost Team Productivity
3 Scrum Patterns to Boost Team Productivity
 
Scrum
ScrumScrum
Scrum
 
Nondeterministic Software for the Rest of Us
Nondeterministic Software for the Rest of UsNondeterministic Software for the Rest of Us
Nondeterministic Software for the Rest of Us
 

Plus de Turi, Inc.

Webinar - Patient Readmission Risk
Webinar - Patient Readmission RiskWebinar - Patient Readmission Risk
Webinar - Patient Readmission RiskTuri, Inc.
 
Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Turi, Inc.
 
Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Turi, Inc.
 
Text Analysis with Machine Learning
Text Analysis with Machine LearningText Analysis with Machine Learning
Text Analysis with Machine LearningTuri, Inc.
 
Machine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinMachine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinTuri, Inc.
 
Scalable data structures for data science
Scalable data structures for data scienceScalable data structures for data science
Scalable data structures for data scienceTuri, Inc.
 
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Turi, Inc.
 
Introduction to Recommender Systems
Introduction to Recommender SystemsIntroduction to Recommender Systems
Introduction to Recommender SystemsTuri, Inc.
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in productionTuri, Inc.
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringTuri, Inc.
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with DatoTuri, Inc.
 
Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Turi, Inc.
 
Towards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning BenchmarkTowards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning BenchmarkTuri, Inc.
 
New Capabilities in the PyData Ecosystem
New Capabilities in the PyData EcosystemNew Capabilities in the PyData Ecosystem
New Capabilities in the PyData EcosystemTuri, Inc.
 
Anomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation ForestsAnomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation ForestsTuri, Inc.
 
Data! Data! Data! I Can't Make Bricks Without Clay!
Data! Data! Data! I Can't Make Bricks Without Clay!Data! Data! Data! I Can't Make Bricks Without Clay!
Data! Data! Data! I Can't Make Bricks Without Clay!Turi, Inc.
 
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...Turi, Inc.
 
Pandas & Cloudera: Scaling the Python Data Experience
Pandas & Cloudera: Scaling the Python Data ExperiencePandas & Cloudera: Scaling the Python Data Experience
Pandas & Cloudera: Scaling the Python Data ExperienceTuri, Inc.
 

Plus de Turi, Inc. (20)

Webinar - Patient Readmission Risk
Webinar - Patient Readmission RiskWebinar - Patient Readmission Risk
Webinar - Patient Readmission Risk
 
Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)Webinar - Product Matching - Palombo (20160428)
Webinar - Product Matching - Palombo (20160428)
 
Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)Webinar - Fraud Detection - Palombo (20160428)
Webinar - Fraud Detection - Palombo (20160428)
 
Text Analysis with Machine Learning
Text Analysis with Machine LearningText Analysis with Machine Learning
Text Analysis with Machine Learning
 
Machine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos GuestrinMachine Learning in 2016: Live Q&A with Carlos Guestrin
Machine Learning in 2016: Live Q&A with Carlos Guestrin
 
Scalable data structures for data science
Scalable data structures for data scienceScalable data structures for data science
Scalable data structures for data science
 
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
Introduction to Deep Learning for Image Analysis at Strata NYC, Sep 2015
 
Introduction to Recommender Systems
Introduction to Recommender SystemsIntroduction to Recommender Systems
Introduction to Recommender Systems
 
Machine learning in production
Machine learning in productionMachine learning in production
Machine learning in production
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
 
SFrame
SFrameSFrame
SFrame
 
Building Personalized Data Products with Dato
Building Personalized Data Products with DatoBuilding Personalized Data Products with Dato
Building Personalized Data Products with Dato
 
Getting Started With Dato - August 2015
Getting Started With Dato - August 2015Getting Started With Dato - August 2015
Getting Started With Dato - August 2015
 
Towards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning BenchmarkTowards a Comprehensive Machine Learning Benchmark
Towards a Comprehensive Machine Learning Benchmark
 
Dato Keynote
Dato KeynoteDato Keynote
Dato Keynote
 
New Capabilities in the PyData Ecosystem
New Capabilities in the PyData EcosystemNew Capabilities in the PyData Ecosystem
New Capabilities in the PyData Ecosystem
 
Anomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation ForestsAnomaly Detection Using Isolation Forests
Anomaly Detection Using Isolation Forests
 
Data! Data! Data! I Can't Make Bricks Without Clay!
Data! Data! Data! I Can't Make Bricks Without Clay!Data! Data! Data! I Can't Make Bricks Without Clay!
Data! Data! Data! I Can't Make Bricks Without Clay!
 
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
Declarative Machine Learning: Bring your own Syntax, Algorithm, Data and Infr...
 
Pandas & Cloudera: Scaling the Python Data Experience
Pandas & Cloudera: Scaling the Python Data ExperiencePandas & Cloudera: Scaling the Python Data Experience
Pandas & Cloudera: Scaling the Python Data Experience
 

Dernier

Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx

Webinar - Pattern Mining Log Data - Vega (20160426)

  • 1. Churn Prediction: Understanding your customers and taking action. @datoinc #churnPredictionDato
  • 2. Hi! My name is … Antoine Atallah, Principal Data Scientist. Dato toolkits team, novice powerlifter, Hawks fan.
  • 3. Hi! My name is … Karla Vega, Customer Success Manager. Aerospace engineer, dog trainer, running fan. @vegakp
  • 5. Webinar! Questions? • (Now) We love questions. Feel free to interrupt! • (Later) Email us: antoine@dato.com, vega@dato.com
  • 7. Data Science Workflow: Ingest → Transform → Model → Insight
  • 8. Log Journey: Lots of data → Insights → Profits
  • 11. Different kinds of logs • Raw logs: each row contains an individual event for a user at a given time • Aggregated logs: each row contains the interactions for a user over a period of time (for instance, user activity in one-month rollups); this is the traditional data output of Business Intelligence infrastructures • User side-data: information about each user (demographics, etc.)
  • 12. Logs contain usage patterns (figure labels: Small Purchase, Large Purchase)
  • 13. Different kinds of usage patterns • Frequency of visits, purchases, and events • Quantity of visits and purchases • Changes in value over time • Change in time between visits, purchases, and events • Time since last action or visit • Demographic information (age, gender, …) • Types of items purchased (seasonality, quality) • …
  • 14. Retaining customers/visitors is important • Acquiring a new customer costs more than retaining an existing one • Retention gives a pulse on the health of the business • It can help take preventive actions and act before it's too late • It can help create more effective marketing campaigns
  • 15. What is Churn Prediction
  • 16. What is Churn • Churn Prediction estimates each user's probability of not coming back (churning) • It works by observing past user behavior
  • 17. Churn Prediction (Apr 2016): Daily activity logs for Jan 2015 – April 2016
  • 18. More Precisely • Churn Prediction estimates each user's probability of not coming back (churning) • It works by observing past user behavior • We define a time boundary at which we want to predict churn • Anyone with no activity in the N days (default 30) after the boundary is considered to have churned • The M days (default 60) before the boundary are used to generate features • Multiple boundaries can be specified to extract more patterns
  • 19. Feature and Label Generation (Apr 2016): Daily activity logs for Jan 2015 – April 2016
  • 20. How to use Churn Prediction
  • 21. Choosing Time Boundaries • Time Boundaries are moments in the past that are used to observe user behavior and generate labels • The time before the boundary is used to observe patterns • The time after the boundary is used to generate labels
      Boundaries: January 1st 2016. Meaning: uses the patterns from before January 1st 2016 to predict user churn after January 1st 2016.
      Boundaries: January 1st 2016 and December 1st 2015. Meaning: uses the patterns from before January 1st 2016 to predict user churn after January 1st 2016, and the patterns from before December 1st 2015 to predict user churn after December 1st 2015. This analyzes more patterns and builds a richer model.
  • 22. Choosing a Churn Period • The Churn Period corresponds to how far in the future we want to predict • It also means that, for training purposes, users who have not been active for this amount of time are considered to have churned
      Churn Period: 7 Days. Predicts: probability for each user to be leaving next week.
      Churn Period: 30 Days. Predicts: probability for each user to be leaving next month.
      Churn Period: 3 Months. Predicts: probability for each user to be leaving next quarter.
  • 23. Choosing Lookback Periods • Lookback Periods are how far in the past we look to extract user behavior patterns (features) • Multiple lookback periods can be provided to generate richer features
      Lookback Periods: 3 Days. Features: uses the 3 days before each Time Boundary to extract usage patterns.
      Lookback Periods: 30 Days. Features: uses the 30 days before each Time Boundary to extract usage patterns.
      Lookback Periods: 7 Days, 1 Month. Features: uses the week and the month before each Time Boundary to extract usage patterns.
  • 24. Choosing appropriate parameters • If we want to predict Churn for this quarter, we might want to set: • Churn Period to be 3 Months (how far in the future we predict) • Lookback Periods to be 2, 4, 8, 16 weeks (how far in the past to extract patterns from) • Time Boundaries to be January 1st 2016, January 1st 2015, January 1st 2014 • Notice that we chose the same quarter each year for Time Boundary • Choosing past data with the same underlying behavior will provide more accurate predictions
  • 25. Choosing appropriate parameters • If we want to predict Churn for this month, we might want to set: • Churn Period to be 1 Month (how far in the future we predict) • Lookback Periods to be 7, 14, 30, 60 days (how far in the past to extract patterns from) • Time Boundaries to be January 1st 2016, October 1st 2015, September 1st 2015, August 1st 2015 • In this case, we intentionally skipped over November and December 2015 since it is the holiday season, and may exhibit very different behavior
  • 26. Key Takeaways • Label generation is extremely simplified (choose a Churn Period) • Feature generation is extremely simplified (choose Lookback Periods and Time Boundaries) • Choose representative time frames to predict churn in the desired time frame
  • 28. Output of the model • The Churn Prediction model returns a probability of churn for each provided user
  • 29. Using the Probabilities (figure: Number of Users plotted against Churn Probability) • High probability of churn: might be hard to rescue these users • Mid probability of churn: we should try to rescue these users • Low probability of churn: send a thank-you note!
  • 30. Using the Probabilities • We can target different users, using their probability of Churn as a guideline • Different marketing messages can be created based on the probability of Churn • The highest-probability users are not always the best to target, depending on the cost of the action needed to retain them • The probabilities add a new dimension for describing the user base • They can be used to monitor the health of the user population over time
  • 31. Demo
  • 32. Summary: Log Data Mining ≠ Rocket Science • Define time parameters to identify patterns and generate labels. • Extract predictions to gain insights about your user population. • Take action and help grow your healthy business. Churn Prediction
  • 33. SELECT questions FROM audience WHERE difficulty = 'Easy'; Thanks!
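
The feature and label generation described in slides 18 to 23 can be sketched in plain Python. This is only an illustration under stated assumptions, not the Dato toolkit's implementation: the function names `build_examples` and `build_training_rows`, the choice of event counts as features, and the convention that activity on the boundary day itself counts toward the churn period are all hypothetical. The defaults mirror the slides (a 30-day churn period and a 60-day longest lookback).

```python
from datetime import date, timedelta


def build_examples(events, boundary, churn_days=30, lookback_days=(7, 30, 60)):
    """Return {user_id: (activity_counts, churned)} for one time boundary.

    events is a list of (user_id, activity_date) pairs taken from raw logs.
    """
    churn_end = boundary + timedelta(days=churn_days)
    longest = max(lookback_days)
    examples = {}

    # Features: count each user's events inside every lookback window.
    for user, day in events:
        age = (boundary - day).days  # days before the boundary
        if 0 < age <= longest:
            counts, _ = examples.setdefault(user, ([0] * len(lookback_days), True))
            for i, window in enumerate(lookback_days):
                if age <= window:
                    counts[i] += 1

    # Labels: any activity in the churn period means the user did not churn.
    for user, day in events:
        if user in examples and boundary <= day < churn_end:
            examples[user] = (examples[user][0], False)
    return examples


def build_training_rows(events, boundaries, **kwargs):
    """Stack the examples from several boundaries into one list of rows."""
    return [
        (boundary, user, counts, churned)
        for boundary in boundaries
        for user, (counts, churned) in build_examples(events, boundary, **kwargs).items()
    ]
```

Calling `build_training_rows(events, [date(2016, 1, 1), date(2015, 12, 1)])` would correspond to the two-boundary case on slide 21: each boundary contributes its own set of labeled rows, so the model sees more patterns. Users with no activity in the longest lookback window are left out, since there is no behavior to observe for them.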

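The three groups on slide 29 can be illustrated with a small helper that buckets users by their predicted churn probability. This is a minimal sketch: the function name `segment_users` and the cut-offs 0.3 and 0.7 are assumptions chosen for illustration, since the slides do not state where the low, mid, and high ranges begin.

```python
def segment_users(churn_probabilities, low=0.3, high=0.7):
    """Group user ids into low, mid, and high churn-probability segments.

    churn_probabilities maps user_id to a probability in [0, 1].
    The low and high thresholds are hypothetical defaults.
    """
    segments = {"low": [], "mid": [], "high": []}
    for user, probability in churn_probabilities.items():
        if probability < low:
            segments["low"].append(user)    # candidates for a thank-you note
        elif probability < high:
            segments["mid"].append(user)    # candidates for a rescue campaign
        else:
            segments["high"].append(user)   # may be hard to rescue
    return segments
```

In practice the thresholds would likely be tuned against the cost of each retention action, in line with slide 30's point that the highest-probability users are not always the best ones to target.
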
Editor's notes

  1. Not sure if demo first or later?