How to data model Churn
Real life examples
Quick quiz
•  How many of you are familiar with the churn problem?
•  with Machine Learning?
Logistic Regression, Random Forest, Gradient Boosting trees?
(Not the subject here)
•  With SQL?
(we may see some code later)
•  What database tech do you use?
What about EMC Greenplum or Vertica?
Who I am
•  Senior Data Scientist at Dataiku
(worked on churn prediction, fraud detection, bot detection, recommender systems,
graph analytics, smart cities, … )
•  Occasional Kaggle competitor
•  Mostly code in Python and SQL
•  Twitter @prrgutierrez
Churn definition
•  Wikipedia:
“Churn rate (sometimes called attrition rate), in its broadest sense, is a measure of the
number of individuals or items moving out of a collective group over a specific period of
time”
= Customer leaving
Two types of Churn
•  Subscription models:
•  Telco
•  E-gaming (WoW)
•  Ex: Coyote -> 1-year subscription
-> you know when someone leaves
•  Non-subscription models:
•  E-Business (Amazon, Price Minister, Vente Privée)
•  E-gaming (Candy Crush, free MMORPG)
-> you approximate when someone leaves
Candy Crush: days / weeks
MMORPG: 2 months (holidays)
Price Minister: months
Two types of Churn
•  Blurred separation:
•  Ex: T-Mobile: 1-month subscription -> paying for each call
•  Ex: WoW: 1-month to 6-month subscription
•  Banking?
•  Focus: no subscription:
•  Can be seen as a generalization where you have to approximate the target
•  Bonus: seller churn
•  Marketplaces
•  Clients that participate in the product's life
•  Forums (Reddit)
•  E-gaming (Korean competitions, guilds, etc.)
Dealing with churn
•  Motivations:
•  Saturated market
-> cost of acquiring a new client >>> cost of keeping a client
•  Ex: http://www.bain.com/publications/articles/breaking-the-back-of-customer-churn.aspx
•  Wireline company: 2% to 2.5% churn rate per month.
•  With 5 M customers -> 1.32 M churners per year
•  Reducing churn from 2.5% to 2%: lowest estimate of $240 M in 18 months
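The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, assuming a constant monthly churn rate that compounds over 12 months and no new sign-ups; the function name `annual_churners` is made up for the example. Under those assumptions, a 2.5% monthly rate on 5 M customers gives roughly 1.3 M churners per year, in line with the slide.

```python
def annual_churners(customers: int, monthly_rate: float, months: int = 12) -> float:
    """Customers lost after `months`, assuming a constant monthly churn rate
    applied to the remaining base and no new sign-ups (an assumption)."""
    survivors = customers * (1 - monthly_rate) ** months
    return customers - survivors


high = annual_churners(5_000_000, 0.025)  # 2.5% churn per month
low = annual_churners(5_000_000, 0.020)   # 2.0% churn per month

print(f"2.5% per month: {high / 1e6:.2f} M churners per year")
print(f"2.0% per month: {low / 1e6:.2f} M churners per year")
print(f"Customers kept by the reduction: {(high - low) / 1e6:.2f} M per year")
```

Turning the retained customers into a dollar figure needs revenue per customer, which the slide does not give, so it is left out here.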
Dealing with churn
•  Predict churn :
•  One model for performance <- our focus, short term, more ML
•  One model for understanding <- long term, more Analytics
•  Act on it (short term):
•  Special offer (telco call, free in-game money, discount coupon, …)
•  Does it work? Feedback loop needed!
•  Model the probability of leaving given the offer. A/B tests. Multi-armed bandit?
•  Significant LTV for activation?
•  Act on it (long term) :
•  Is there a problem in my purchasing funnel?
•  Is the game too hard at some point?
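One simple way to close the feedback loop on a special offer is an A/B test on churn rates. The sketch below uses a standard two-proportion z-test with only the standard library; the counts are invented for illustration and the function name is hypothetical. It is not the method used in the talk.

```python
import math


def two_proportion_z_test(churned_a: int, n_a: int, churned_b: int, n_b: int):
    """Two-sided z-test comparing churn rates of group A (control) and B (offer)."""
    p_a, p_b = churned_a / n_a, churned_b / n_b
    pooled = (churned_a + churned_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Made-up numbers: 1000 customers without the offer, 1000 with it.
z, p_value = two_proportion_z_test(churned_a=200, n_a=1000, churned_b=160, n_b=1000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A low p-value only says the churn rates differ; whether the offer pays off still depends on its cost and on customer lifetime value.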
Dealing with churn
•  Candy Crush rumor:
•  Change the distribution of probabilities of candies / bombs
•  Change the difficulty of the game
•  Losing a lot makes the game easier
Modelling Churn
•  Machine learning model (classification) -> target:
•  Known in subscription
•  Unknown in general
•  Step 1: Maintain customer status
•  Do you care only about your best customers?
•  In any case, the churn action won't be the same
•  Has a client churned?
-> target = churner = has not bought / visited since time X
-> best = has bought / visited more than y times since time Y
•  Can be refined (“new customer”, several classes of best or inactive, reactivated…)
•  Storage: maintain only the difference!
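Step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the thresholds (120 days of inactivity for X, more than 5 events in the last 365 days for "best") and the names `customer_status` and `record_status` are hypothetical, chosen only for the example. The second function shows one way to "maintain only the difference": a row is stored only when the status changes.

```python
from datetime import date, timedelta

# Hypothetical thresholds, to be tuned per business (days, weeks or months).
CHURN_AFTER = timedelta(days=120)   # "time X": inactivity that counts as churn
BEST_WINDOW = timedelta(days=365)   # "time Y": look-back window for best customers
BEST_MIN_EVENTS = 5                 # "y": more than this many events means "best"


def customer_status(events, today):
    """Classify a customer from the dates of their purchases / visits."""
    seen = [d for d in events if d <= today]  # never look into the future
    if not seen or today - max(seen) > CHURN_AFTER:
        return "churner"
    recent = [d for d in seen if today - d <= BEST_WINDOW]
    return "best" if len(recent) > BEST_MIN_EVENTS else "active"


def record_status(log, customer_id, status, today):
    """Append (date, status) only if the status changed; return True if stored."""
    history = log.setdefault(customer_id, [])
    if not history or history[-1][1] != status:
        history.append((today, status))
        return True
    return False


log = {}
today = date(2015, 6, 1)
status = customer_status([date(2015, 5, 1)], today)
record_status(log, "c1", status, today)
print(status, log)
```

Storing only status changes keeps the table small while still letting you rebuild the status of any customer at any past date.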
Modelling Churn
•  Machine learning model -> features:
•  Explanatory factors to use as input for the model
•  Step 2: Maintain customer features
•  Social (gender, age, etc.)
•  Behavioral!
•  Utilization / buying rate
•  Trend in utilization / buying rate
•  Ad hoc features:
•  WoW / social game churn: take into account friend network churn
•  Telco: calls to call centers
•  Beware of time dependence!
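The behavioral features and the warning about time dependence can be illustrated with a small sketch. The assumption here is that every feature is computed only from events strictly before a cutoff date, so that nothing from the target period leaks into the inputs. The function name, the 60-day window and the feature names are hypothetical.

```python
from datetime import date, timedelta


def behavioral_features(events, cutoff, window_days=60):
    """Buying / visiting rate and its trend, using only events before `cutoff`."""
    window = timedelta(days=window_days)
    # Events in the most recent window before the cutoff.
    recent = sum(1 for d in events if cutoff - window <= d < cutoff)
    # Events in the window just before that one, used to measure the trend.
    previous = sum(1 for d in events if cutoff - 2 * window <= d < cutoff - window)
    return {
        "rate_recent": recent / window_days,
        "rate_previous": previous / window_days,
        "trend": recent - previous,
    }


events = [date(2015, 1, 10), date(2015, 2, 20), date(2015, 3, 1), date(2015, 4, 15)]
features = behavioral_features(events, cutoff=date(2015, 4, 1))
print(features)
```

The event on 2015-04-15 falls after the cutoff and is ignored, which is the point: a feature that sees it would be describing the period the target is built from.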
Data Model
Computation Dependency diagram
Ex : Train and predict scheme
[Timeline diagram]
•  Time axis: T - 4 months -> T (present time)
•  Data is used for feature generation
•  Data is used for target creation: activity during the last 4 months
•  Train model using features and target
•  Use model to predict future churn
Ex : Train Evaluation and Predict Scheme
[Timeline diagram]
•  Time axis: T - 8 months -> T - 4 months -> T (present time)
•  Training: data is used for feature generation; data is used for target creation (activity during the last 4 months)
•  Validation set: data is used for feature generation; data is used for target creation (activity during the last 4 months)
•  Evaluate on the target of the validation set
•  Use model to predict future churn
Thank you for your attention!