Introduction to Uplift Modelling
An online gaming application
A few words about me
•  Senior Data Scientist at Dataiku
(worked on churn prediction, fraud detection, bot detection, recommender systems, graph
analytics, smart cities, … )
•  Occasional Kaggle competitor
•  Mostly code in Python and SQL
•  Twitter @prrgutierrez
Plan
•  Introduction / Client situation
•  Uplift use case examples
•  Uplift modeling
•  Uplift evaluation & results
Client situation
•  Ankama : French Online Gaming Company (RPG)
•  Users are leaving
•  Let’s build a churn prediction model !
•  Target : no return within 14 or 28 days.
(14 days of absence -> 80 % chance of not coming back
28 days of absence -> 90 % chance of not coming back)
•  Features :
•  Connection features :
•  Time played in 1,7,15,30,… days
•  Time since last connection
•  Connection frequency
•  Days of the week / hours of the day played
•  Equivalent for payments and subscriptions
•  Age, sex, country
•  Number of accounts, is a bot …
•  No in-game features (no data)
Client situation
•  Model Results :
•  AUC 0.88
•  Very stable model
•  Marketing actions :
•  7 different actions based on customer segmentation
(offers, promotion, … )
•  A/B test
-> -5 % churn for persons contacted by email
•  Going further :
•  Feature engineering : guilds, close network, in-game actions, …
•  Study long term churn …
Client situation
•  But wait !
•  Strong hypothesis : target the people who are most likely to churn
•  What is the gain / person for an action ?
•  $c$ : cost of action
•  $v_i$ : value of the customer $i$
•  $X$ : independent variables
•  $T$, $C$ : “treated” population and “control” population
•  $Y = 1$ if the customer churns, $0$ otherwise
•  Value with action : $E^T(V_i) = v_i \,(1 - P^T(Y = 1 \mid X)) - c$
•  Value without action : $E^C(V_i) = v_i \,(1 - P^C(Y = 1 \mid X))$
•  Gain (if $v_i$ is independent of treatment) : $E(G_i) = v_i \,(P^C(Y = 1 \mid X) - P^T(Y = 1 \mid X)) - c$
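As a worked illustration of the gain formula, here is a minimal sketch in Python. The three customers, their values, their churn probabilities and the cost are all hypothetical numbers chosen for the example, not data from the study.

```python
def expected_gain(value, cost, p_churn_control, p_churn_treated):
    """Expected gain E(G_i) = v_i * (P^C(Y=1|X) - P^T(Y=1|X)) - c."""
    return value * (p_churn_control - p_churn_treated) - cost


# Hypothetical customers: (label, v_i, P^C(Y=1|X), P^T(Y=1|X)).
customers = [
    ("persuadable", 100.0, 0.60, 0.30),
    ("already lost", 100.0, 0.95, 0.94),
    ("annoyed by contact", 100.0, 0.20, 0.35),
]
cost = 2.0  # hypothetical cost c of one action

gains = {
    label: expected_gain(v, cost, p_c, p_t)
    for label, v, p_c, p_t in customers
}
for label, gain in gains.items():
    print(f"{label}: {gain:+.2f}")
```

In this toy setting the customer with the highest churn probability has a negative expected gain, because the action barely changes the outcome while its cost is still paid.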
Client situation
•  What is the gain / person for an action ? $E(G_i) = v_i \,(P^C(Y = 1 \mid X) - P^T(Y = 1 \mid X)) - c$
•  Objective : maximize this gain
•  Targeting highly probable churners -> minimize $P^T(Y = 1 \mid X)$
But not the difference !
•  Intuitive examples :
•  $P^C(Y = 1) < P^T(Y = 1)$ : the action is expected to make the situation worse. Spam ?
•  $P^C(Y = 1) \approx P^T(Y = 1)$ : user does not care, is already lost
Uplift = Model $\Delta P$, the difference between $P^T(Y = 1 \mid X)$ and $P^C(Y = 1 \mid X)$
Uplift
•  Model effect of the action
•  4 groups of customers / patients
•  1  Responded because of the action
(the people we want)
•  2  Responded, but would have responded anyway
(unnecessary costs)
•  3  Did not respond and the action had no impact
(unnecessary costs)
•  4  Did not respond because the action had a negative impact
(negative impact)
•  Incomplete knowledge
Uplift Examples
•  Healthcare :
•  A typical medical trial:
•  treatment group: gets the treatment
•  control group: gets placebo (or another treatment)
•  do a statistical test to show that the treatment is better than placebo
•  With uplift modeling we can find out for whom the treatment works best
•  Personalized medicine
•  Ex : What is the gain in survival probability ?
-> classification/uplift problem
Uplift Examples
•  Churn :
•  E-gaming
•  Other Ex : Coyote
•  Retail :
•  Compare coupons campaigns
Uplift Examples
•  Mailing : Hillstrom challenge
•  2 campaigns :
•  one men’s email
•  one women’s email
•  Question : who are the people to target / who have the best response rate ?
Uplift Examples
•  Common pattern
•  Experiment or A/B testing -> Test and control
•  Warning : Control can be biased easily :
•  Targeted most probable churners and control is the rest
•  Call only the people that come to a shop
•  Limited experiment trial -> no bandit algorithm :
(once a medical experiment is done, you don’t continue the “exploration”)
-> feedback is relatively large and discrete in time.
Uplift modelling
•  Three main methods :
•  Two models approach
•  Class variable modification
•  Modification of existing machine learning models
Uplift modelling : Two model approach
•  Build a model on treatment to get $P^T(Y \mid X)$
•  Build a model on control to get $P^C(Y \mid X)$
•  Set : $\Delta P = P^T(Y \mid X) - P^C(Y \mid X)$
Uplift modelling : Two model approach
•  Advantages :
•  Standard ML models can be used
•  In theory, two good estimators -> a good uplift model
•  Works well in practice
•  Generalize to regression and multi-treatment easily
•  Drawbacks
•  Difference of estimators is probably not the best estimator of the difference
•  The two classifiers can ignore the weaker uplift signal (since it’s not their target)
•  Algorithm focusing on estimating the difference should perform better
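The two models approach can be sketched as follows. This is a minimal illustration, not the model used in the study: the "model" is a deliberately trivial per-segment response rate standing in for any standard classifier, and the segment names and A/B data are hypothetical.

```python
from collections import defaultdict


def fit_rate_model(rows):
    """Estimate P(Y=1 | segment) as the observed rate in each segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [positives, total]
    for segment, y in rows:
        counts[segment][0] += y
        counts[segment][1] += 1
    return {seg: pos / tot for seg, (pos, tot) in counts.items()}


# Hypothetical A/B data: (segment, Y), with Y = 1 for a positive response.
treatment = [("casual", 1), ("casual", 1), ("casual", 0), ("casual", 1),
             ("hardcore", 1), ("hardcore", 1), ("hardcore", 0), ("hardcore", 1)]
control = [("casual", 0), ("casual", 1), ("casual", 0), ("casual", 0),
           ("hardcore", 1), ("hardcore", 1), ("hardcore", 1), ("hardcore", 0)]

p_t = fit_rate_model(treatment)  # model built on the treatment group
p_c = fit_rate_model(control)    # model built on the control group

# Uplift estimate: difference of the two model outputs.
uplift = {seg: p_t[seg] - p_c[seg] for seg in p_t}
print(uplift)
```

Both segments respond equally well under treatment here, so a response model alone cannot separate them. Only the difference shows that the action changes the outcome for one segment and not the other.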
Uplift modelling : Class variable modification
•  Introduced in Jaskowski, Jaroszewicz 2012
•  Allows any classifier to be adapted to uplift modeling
•  Let $G \in \{T, C\}$ denote the group membership (Treatment or Control)
•  Let’s define the new target variable :
$$Z = \begin{cases} 1 & \text{if } G = T \text{ and } Y = 1 \\ 1 & \text{if } G = C \text{ and } Y = 0 \\ 0 & \text{otherwise} \end{cases}$$
•  This corresponds to flipping the target in the control dataset.
Uplift modelling : Class variable modification
•  Why does it work ?
$$P(Z = 1 \mid X) = P^T(Y = 1 \mid X)\,P(G = T \mid X) + P^C(Y = 0 \mid X)\,P(G = C \mid X)$$
•  By design (A/B test warning !), $G$ should be independent from $X$ :
$$P(Z = 1 \mid X) = P^T(Y = 1 \mid X)\,P(G = T) + P^C(Y = 0 \mid X)\,P(G = C)$$
•  Possibly with a reweighting of the datasets we should have $P(G = T) = P(G = C) = 1/2$, thus :
$$2 P(Z = 1 \mid X) = P^T(Y = 1 \mid X) + P^C(Y = 0 \mid X)$$
Uplift modelling : Class variable modification
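•  Why does it work ?
Thus
$$2 P(Z = 1 \mid X) = P^T(Y = 1 \mid X) + P^C(Y = 0 \mid X) = P^T(Y = 1 \mid X) + 1 - P^C(Y = 1 \mid X)$$
$$\Delta P = P^T(Y = 1 \mid X) - P^C(Y = 1 \mid X) = 2 P(Z = 1 \mid X) - 1$$
And sorting by $P(Z = 1 \mid X)$ is the same as sorting by $\Delta P$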
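The identity can be checked numerically. The sketch below assumes a balanced design, $P(G = T) = P(G = C) = 1/2$, and uses three hypothetical customers with made-up response probabilities.

```python
# Hypothetical per-customer probabilities: (P^T(Y=1|X), P^C(Y=1|X)).
probs = {"a": (0.50, 0.20), "b": (0.30, 0.30), "c": (0.10, 0.40)}


def p_z(p_t, p_c, p_treated=0.5):
    """P(Z=1|X) = P^T(Y=1|X) P(G=T) + P^C(Y=0|X) P(G=C)."""
    return p_t * p_treated + (1.0 - p_c) * (1.0 - p_treated)


delta_p = {k: p_t - p_c for k, (p_t, p_c) in probs.items()}
z_score = {k: p_z(p_t, p_c) for k, (p_t, p_c) in probs.items()}

# Ranking customers by P(Z=1|X) or by the uplift gives the same order.
rank_by_delta = sorted(probs, key=delta_p.get, reverse=True)
rank_by_z = sorted(probs, key=z_score.get, reverse=True)
print(rank_by_delta, rank_by_z)
```

If the design is not balanced, the factor of 2 no longer applies, which is why the reweighting step matters.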
Uplift modelling : Class variable modification
•  Summary :
•  Flip class for control dataset
•  Concatenate treatment and control datasets
•  Build a classifier
•  Target users with highest probability
•  Advantages :
•  Any classifier can be used
•  Directly predict uplift (and not each class separately)
•  Single model on a larger dataset (instead of two small ones)
•  Drawbacks :
•  Complex decision surface -> model can perform poorly
•  Interpretation : what is AUC in this case ?
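The steps in the summary can be sketched end to end. This is an illustration under simplifying assumptions: the groups are balanced, the "classifier" is a plain per-segment frequency of $Z = 1$ standing in for any real classifier, and the segments and outcomes are hypothetical.

```python
from collections import defaultdict


def class_transform(group, y):
    """Z = 1 if (G = T and Y = 1) or (G = C and Y = 0), else 0."""
    return y if group == "T" else 1 - y


# Hypothetical concatenated dataset: (segment, group G, outcome Y).
rows = (
    [("casual", "T", y) for y in (1, 1, 0, 1)]
    + [("casual", "C", y) for y in (0, 1, 0, 0)]
    + [("hardcore", "T", y) for y in (1, 1, 0, 1)]
    + [("hardcore", "C", y) for y in (1, 1, 1, 0)]
)

# "Build a classifier": here simply the frequency of Z = 1 per segment.
counts = defaultdict(lambda: [0, 0])  # segment -> [sum of Z, total]
for segment, group, y in rows:
    counts[segment][0] += class_transform(group, y)
    counts[segment][1] += 1
p_z = {seg: pos / tot for seg, (pos, tot) in counts.items()}

# Target the users with the highest P(Z=1|X).
ranking = sorted(p_z, key=p_z.get, reverse=True)
# With balanced groups, the uplift estimate is 2 P(Z=1|X) - 1.
uplift_estimate = {seg: 2 * p - 1 for seg, p in p_z.items()}
print(ranking, uplift_estimate)
```

A single model is fitted on all rows, which is the main practical difference from fitting one model per group.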
Uplift modeling : Other methods
•  Based on decision trees :
•  Rzepakowski Jaroszewicz 2012
new decision tree split criterion based on information theory
•  Soltys Rzepakowski Jaroszewicz 2013
Ensemble methods for uplift modeling
(out of today’s scope)
Evaluation
•  We used :
•  Two models approach -> AUC ? Not very informative.
•  Single model (class modification) approach -> does AUC mean anything ?
•  How can we evaluate / compare them ?
•  Cross Validation :
•  4 datasets : treatment/control x train/test
•  Problem :
•  We don’t have a clear 0/1 target.
•  We would need to know for each customer
•  Response to treatment
•  Response to control
-> not possible
Evaluation
•  Gain for group of customers :
•  Gain for the 10% highest scoring customers =
% of successes for top 10% treated customers − % of successes for top 10% control
customers
•  Uplift curve ? :
•  Difference between two lift curves
•  Interpretation : net gain in success rate if a given percentage of the population is treated
•  Problem 1 : no theoretical maximum
•  Problem 2 : weird behaviour for 2 wizard models.
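The gain for the highest scoring customers can be computed as in the sketch below. The function name `uplift_at`, the scores and the outcomes are hypothetical. With only ten rows per group the example uses the top 20 % rather than the top 10 %.

```python
def uplift_at(fraction, treated, control):
    """Success rate in the top `fraction` of treated minus that of control.

    `treated` and `control` are lists of (score, outcome) pairs.
    """
    def top_rate(rows):
        ranked = sorted(rows, key=lambda r: r[0], reverse=True)
        k = max(1, int(len(ranked) * fraction))  # keep at least one row
        return sum(y for _, y in ranked[:k]) / k

    return top_rate(treated) - top_rate(control)


# Hypothetical scored test sets: (model score, observed success).
treated = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.5, 0),
           (0.4, 0), (0.3, 1), (0.2, 0), (0.1, 0), (0.05, 0)]
control = [(0.9, 0), (0.8, 1), (0.7, 0), (0.6, 0), (0.5, 1),
           (0.4, 0), (0.3, 0), (0.2, 1), (0.1, 0), (0.05, 0)]

top_20 = uplift_at(0.2, treated, control)
overall = uplift_at(1.0, treated, control)
print(top_20, overall)
```

Evaluating this quantity at every fraction from 0 to 1 gives the uplift curve. The value at 1.0 is the global effect of the treatment, whatever the model.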
Evaluation : Qini
•  Qini Measure :
•  Similar to Gini (Area under lift curve). Lift Curve <-> Qini Curve
•  Parametric curve defined by : $f(t) = Y_T(t) - Y_C(t) \, N_T(t) / N_C(t)$
•  When taking the first $t$ observations
•  $Y_T$ is the total number of 1 seen in target observations
•  $Y_C$ is the total number of 1 seen in control observations
•  $N_T$ is the total number of target observations
•  $N_C$ is the total number of control observations
•  Balanced setting : $f(t) = Y_T(t) - Y_C(t)$
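A Qini curve can be computed in a single pass over the observations sorted by score. The sketch below uses the usual normalization, which rescales the control positives to the size of the treated group, $Y_C(t)\,N_T(t)/N_C(t)$. The function name, the handling of the case where no control row has been seen yet, and the data are assumptions made for the example.

```python
def qini_curve(rows):
    """Return f(t) for t = 1..n, with rows = (score, group, outcome).

    f(t) = Y_T(t) - Y_C(t) * N_T(t) / N_C(t) over the t highest scores.
    """
    ranked = sorted(rows, key=lambda r: r[0], reverse=True)
    y_t = y_c = n_t = n_c = 0
    curve = []
    for _, group, y in ranked:
        if group == "T":
            n_t += 1
            y_t += y
        else:
            n_c += 1
            y_c += y
        # Assumption: no correction while no control row has been seen.
        scaled_control = y_c * n_t / n_c if n_c else 0.0
        curve.append(y_t - scaled_control)
    return curve


# Hypothetical scored observations: (score, group, outcome).
rows = [(0.9, "T", 1), (0.8, "C", 0), (0.7, "T", 1),
        (0.6, "C", 1), (0.5, "T", 0), (0.4, "C", 0)]
curve = qini_curve(rows)
print(curve)
```

The last point of the curve depends only on the data, not on the scores, so every model ends at the same value. Models differ in how fast they get there.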
Evaluation : Qini
•  Personal intuition :
•  We can’t know everything :
•  treated that convert, not treated that don’t convert. What would have happened ?
•  But we don’t want to see :
•  Treated not converting
•  Not treated converting (in our top list)
•  In the first $t$ observations we want to minimize : $N_T(t) - Y_T(t) + Y_C(t)$
•  Very similar to lift taking into account only negative examples.
Evaluation : Qini
[Figure : Qini curve, $f(t) = Y_T(t) - Y_C(t)$]
Evaluation : Qini
•  Best model :
•  Take first all positive in target and last all positive in control.
•  No theoretic best model :
•  depends on possibility of negative effect
•  Displayed for no negative effect
•  Random model :
•  Corresponds to global effect of treatment
•  Hillstrom Dataset :
•  For women models are comparable and useful
•  For men, there are no clear individuals to target
Evaluation : Qini
[Figure : Qini curve, $f(t) = Y_T(t) - Y_C(t)$]
Evaluation : Qini
•  Back to our study :
•  Class modification performs best
•  Two models approach performs poorly
•  A/B test problem :
•  Control dataset is way too small !
•  Class modification model very close to lift
•  Two models approach slightly better than random
-> would need to redo the A/B test.
Conclusion
•  Uplift :
•  Surprisingly little literature / examples
•  The theory is rather easy to test
•  Two models
•  Class modification
•  The intuition and evaluation are not easy to grasp
•  On the client side :
•  A good lead to select the best offer for a customer
A few references
•  Data :
•  Churn in gaming :
WOWAH dataset (blog post to come)
•  Uplift for healthcare :
Colon Dataset
•  Uplift in mailing :
Hillstrom data challenge
•  Uplift in General :
Simulated data :
(blog post to come)
A few references
•  Application
•  Uplift modeling for clinical trial data (Jaskowski, Jaroszewicz)
•  Uplift Modeling in Direct Marketing (Rzepakowski, Jaroszewicz)
A few references
•  Modeling techniques :
•  Rzepakowski Jaroszewicz 2011 (decision trees)
•  Soltys Rzepakowski Jaroszewicz 2013 (ensemble for uplift)
•  Jaskowski Jaroszewicz 2012 (Class modification model)
A few references
•  Evaluation
•  Using Control Groups to Target on Predicted Lift (Radcliffe)
•  Testing a New Metric for Uplift Models (Mesalles Naranjo)
Thank you for your attention !
Meetup_FGVA_Uplift @ Dataiku

  • 1. Introduction to Uplift Modelling An online gaming application
  • 2. A few words about me •  Senior Data Scientist at Dataiku (worked on churn prediction, fraud detection, bot detection, recommender systems, graph analytics, smart cities, … ) •  Occasional Kaggle competitor •  Mostly code with python and SQL •  Twitter @prrgutierrez
  • 3. Plan •  Introduction / Client situation •  Uplift use case examples •  Uplift modeling •  Uplift evaluation & results
  • 4. Client situation •  Ankama : French Online Gaming Company (RPG) •  Users are leaving •  let’s do a churn prediction model ! •  Target : no come back in 14 or 28 days. (14 missing days -> 80 % of chance not to come back 28 missing days -> 90 % of chance not to come back) •  Features : •  Connection features : •  Time played in 1,7,15,30,… days •  Time since last connection •  Connection frequency •  Days of week / hours of days played •  Equivalent for payments and subscriptions •  Age, sex, country •  Number of account, is a bot … •  No in game features (no data)    
  • 5. Client situation •  Model Results : •  AUC 0.88 •  Very stable model •  Marketing actions : •  7 different actions based on customer segmentation (offers, promotion, … ) •  A/B test -> -5 % churn for persons contacted by email •  Going further : •  Feature engineering : guilds, close network, in game actions, … •  Study long term churn …
  • 6. Client situation •  But wait ! •  Strong hypothesis : target the person that are the most likely to churn
  • 7. Client situation •  But wait ! •  Strong hypothesis : target the person that are the most likely to churn •  What is the gain / person for an action ? •  cost of action •  value of the customer •  independent variables •  “treated” population and “control” population •  •  Value with action : •  Value without action : •  Gain (if independent of treatment ) : c vi i X T C Y = ⇢ 1 if customer churn 0 otherwise ET (Vi) = vi(1 PT (Y = 1|X)) c EC (Vi) = vi(1 PC (Y = 1|X)) vi E(Gi) = vi(PC (Y = 1|X) PT (Y = 1|X)) c
  • 8. Client situation •  But wait ! •  Strong hypothesis : target the person that are the most likely to churn •  What is the gain / person for an action ? •  Objective : maximize this gain •  Targeting highly probable churner -> minimize But not the difference ! •  Intuitive examples : •  : action is expected to make the situation worst. Spam ? •  : user does not care, is already lost Upli&  =  Model   E(Gi) = vi(PC (Y = 1|X) PT (Y = 1|X)) c PT (Y = 1|X) PC (Y = 1) ⇡ PT (Y = 1) P PC (Y = 1) < PT (Y = 1)
  • 9. Uplift •  Model effect of the action •  4 groups of customers / patients •  1  Responded because of the action (the people we want) •  2  Responded, but would have responded anyway (unnecessary costs) •  3  Did not respond and the action had no impact (unnecessary costs) •  4  Did not respond because the action had a negative impact (negative impact) •  Incomplete knowledge
  • 10. Uplift Examples •  Healthcare : •  A typical medical trial: •  treatment group: gets the treatment •  control group: gets placebo (or another treatment) •  do a statistical test to show that the treatment is better than placebo •  With uplift modeling we can find out for whom the treatment works best •  Personalized medicine •  Ex : What is the gain in survival probability ? -> classification/uplift problem
  • 11. Uplift Examples •  Churn : •  E-gaming •  Other Ex : Coyote •  Retail : •  Compare coupons campaigns
  • 12. Uplift Examples •  Mailing : Hillstrom challenge •  2 campaigns : •  one men’s email •  one women’s email •  Question : which people should be targeted / which have the best response rate ?
  • 13. Uplift Examples •  Common pattern •  Experiment or A/B testing -> test and control •  Warning : control can easily be biased : •  Treating the most probable churners and using the rest as control •  Calling only the people who come to a shop •  Limited experiment trial -> no bandit algorithm (once a medical trial is done, you don’t continue the “exploration”) -> feedback is relatively large and discrete in time.
  • 14. Uplift modelling •  Three main methods : •  Two models approach •  Class variable modification •  Modification of existing machine learning models
  • 15. Uplift modelling : Two model approach •  Build a model on treatment to get $P^T(Y|X)$ •  Build a model on control to get $P^C(Y|X)$ •  Set : $\Delta P = P^T(Y|X) - P^C(Y|X)$
  • 16. Uplift modelling : Two model approach •  Advantages : •  Standard ML models can be used •  In theory, two good estimators -> a good uplift model •  Works well in practice •  Generalizes easily to regression and multi-treatment •  Drawbacks : •  The difference of two estimators is probably not the best estimator of the difference •  The two classifiers can ignore the weaker uplift signal (since it’s not their target) •  Algorithms focusing on estimating the difference should perform better
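The two model approach can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the "model" is a deliberately simple per-segment frequency estimator standing in for any standard classifier, and the segments and rows are hypothetical. Here $Y = 1$ denotes a positive response; when $Y = 1$ means churn, a lower treated probability is the desirable direction.

```python
from collections import defaultdict


def fit_rate_model(rows):
    """Toy 'model': observed rate of Y = 1 per segment x (stand-in for any classifier)."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [positives, total]
    for x, y in rows:
        counts[x][0] += y
        counts[x][1] += 1
    return {x: pos / n for x, (pos, n) in counts.items()}


# Hypothetical data: (segment, y)
treatment = [("casual", 1), ("casual", 1), ("casual", 0), ("casual", 1),
             ("hardcore", 1), ("hardcore", 0), ("hardcore", 0), ("hardcore", 0)]
control = [("casual", 0), ("casual", 1), ("casual", 0), ("casual", 0),
           ("hardcore", 1), ("hardcore", 0), ("hardcore", 0), ("hardcore", 0)]

p_t = fit_rate_model(treatment)  # estimate of P^T(Y = 1 | X)
p_c = fit_rate_model(control)    # estimate of P^C(Y = 1 | X)

# Uplift per segment: difference of the two estimates
uplift = {x: p_t[x] - p_c[x] for x in p_t if x in p_c}
print(uplift)
```

Each model is fitted without seeing the other group, which is why the weaker uplift signal can be lost when the two estimates are subtracted.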
  • 17. Uplift modelling : Class variable modification •  Introduced in Jaskowski, Jaroszewicz 2012 •  Allows any classifier to be adapted to uplift modeling •  Let $G \in \{T, C\}$ denote the group membership (Treatment or Control) •  Let’s define the new target variable : $Z = \begin{cases} 1 & \text{if } G = T \text{ and } Y = 1 \\ 1 & \text{if } G = C \text{ and } Y = 0 \\ 0 & \text{otherwise} \end{cases}$ •  This corresponds to flipping the target in the control dataset.
  • 18. Uplift modelling : Class variable modification •  Why does it work ? $P(Z = 1|X) = P^T(Y = 1|X) P(G = T|X) + P^C(Y = 0|X) P(G = C|X)$ •  By design (A/B test warning !), $G$ should be independent from $X$ : $P(Z = 1|X) = P^T(Y = 1|X) P(G = T) + P^C(Y = 0|X) P(G = C)$ •  Possibly with a reweighting of the datasets we should have : $P(G = T) = P(G = C) = 1/2$, thus $2 P(Z = 1|X) = P^T(Y = 1|X) + P^C(Y = 0|X)$
  • 19. Uplift modelling : Class variable modification •  Why does it work ? Thus $2 P(Z = 1|X) = P^T(Y = 1|X) + P^C(Y = 0|X) = P^T(Y = 1|X) + 1 - P^C(Y = 1|X)$ •  And $\Delta P = 2 P(Z = 1|X) - 1$ •  Sorting by $P(Z = 1|X)$ is the same as sorting by $\Delta P$
  • 20. Uplift modelling : Class variable modification •  Summary : •  Flip the class for the control dataset •  Concatenate the treatment (test) and control datasets •  Build a classifier •  Target users with the highest probability •  Advantages : •  Any classifier can be used •  Directly predicts uplift (and not each class separately) •  Single model on a larger dataset (instead of two small ones) •  Drawbacks : •  Complex decision surface -> model can perform poorly •  Interpretation : what is AUC in this case ?
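The summary steps above (flip the control target, concatenate, fit one model, rank) can be sketched like this. As an assumption for illustration, a per-segment frequency estimate again stands in for the classifier, and the rows are hypothetical with balanced groups, so $P(G = T) = P(G = C) = 1/2$ holds inside each segment.

```python
from collections import defaultdict


def make_z(group, y):
    """Z = 1 if (G = T and Y = 1) or (G = C and Y = 0), else 0."""
    return y if group == "T" else 1 - y


# Hypothetical rows: (group, segment, y), treatment and control concatenated
rows = [
    ("T", "casual", 1), ("T", "casual", 1), ("T", "casual", 0), ("T", "casual", 1),
    ("T", "hardcore", 1), ("T", "hardcore", 0), ("T", "hardcore", 0), ("T", "hardcore", 0),
    ("C", "casual", 0), ("C", "casual", 1), ("C", "casual", 0), ("C", "casual", 0),
    ("C", "hardcore", 1), ("C", "hardcore", 0), ("C", "hardcore", 0), ("C", "hardcore", 0),
]

# Single toy "classifier": observed rate of Z = 1 per segment
counts = defaultdict(lambda: [0, 0])  # segment -> [sum of Z, total]
for group, x, y in rows:
    counts[x][0] += make_z(group, y)
    counts[x][1] += 1
p_z = {x: pos / n for x, (pos, n) in counts.items()}

# With balanced groups, Delta P = 2 * P(Z = 1 | X) - 1
uplift = {x: 2 * p - 1 for x, p in p_z.items()}
ranking = sorted(p_z, key=p_z.get, reverse=True)  # target highest P(Z = 1 | X) first
print(uplift, ranking)
```

If the groups are not balanced, the identity only holds after reweighting the datasets, as noted in the derivation.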
  • 21. Uplift modeling : Other methods •  Based on decision trees : •  Rzepakowski Jaroszewicz 2012 : new decision tree split criterion based on information theory •  Soltys Rzepakowski Jaroszewicz 2013 : ensemble methods for uplift modeling (out of today’s scope)
  • 22. Evaluation •  We used : •  2 model approach -> AUC ? Not very informative. •  1 model approach -> does AUC mean something ? •  How can we evaluate / compare them ? •  Cross validation : •  4 datasets : treatment/control x train/test •  Problem : •  We don’t have a clear 0/1 target. •  We would need to know, for each customer : •  Response to treatment •  Response to control -> not possible
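The "4 datasets" setup can be sketched as a split performed separately inside each group, so that both the train and the test parts contain treated and control customers. The function name, the row layout and the 30 % test fraction are assumptions made for this illustration.

```python
import random


def split_by_group(rows, test_fraction=0.3, seed=0):
    """Split rows into {(group, 'train' | 'test'): [rows]} for group in {'T', 'C'}."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    out = {}
    for g in ("T", "C"):
        members = [r for r in rows if r["group"] == g]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        out[(g, "test")] = members[:n_test]
        out[(g, "train")] = members[n_test:]
    return out


# Hypothetical customers, alternating between treatment and control
rows = [{"id": i, "group": "T" if i % 2 == 0 else "C"} for i in range(20)]
splits = split_by_group(rows)
for key, part in sorted(splits.items()):
    print(key, len(part))
```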
  • 23. Evaluation •  Gain for a group of customers : •  Gain for the 10% highest scoring customers = % of successes for top 10% treated customers − % of successes for top 10% control customers •  Uplift curve ? : •  Difference between two lift curves •  Interpretation : net gain in success rate if a given percentage of the population is treated •  Problem : no theoretical maximum •  Problem 2 : odd behaviour for 2 perfect (“wizard”) models.
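The gain for the highest scoring customers can be sketched as below. The scored rows are hypothetical, and the example uses the top 20 % rather than the top 10 % only because the toy groups contain ten customers each.

```python
def top_fraction_gain(treated, control, fraction=0.1):
    """Success rate of top-scored treated minus success rate of top-scored control.

    Each row is (score, y) with y = 1 for a success.
    """
    def top_rate(rows):
        ranked = sorted(rows, key=lambda r: r[0], reverse=True)
        k = max(1, round(len(ranked) * fraction))  # at least one customer
        return sum(y for _, y in ranked[:k]) / k

    return top_rate(treated) - top_rate(control)


# Hypothetical scored test sets
treated = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.5, 0),
           (0.4, 0), (0.3, 1), (0.2, 0), (0.1, 0), (0.05, 0)]
control = [(0.9, 0), (0.8, 1), (0.7, 0), (0.6, 0), (0.5, 1),
           (0.4, 0), (0.3, 0), (0.2, 0), (0.1, 0), (0.05, 0)]

gain_top_20 = top_fraction_gain(treated, control, fraction=0.2)
print(gain_top_20)
```

Evaluating this quantity for every fraction between 0 and 1 gives the uplift curve described above.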
  • 24. Evaluation : Qini •  Qini measure : •  Similar to Gini (area under the lift curve). Lift curve <-> Qini curve •  Parametric curve defined by : $f(t) = Y_T(t) - Y_C(t) \cdot N_T(t) / N_C(t)$ •  When taking the first $t$ observations : •  $Y_T$ is the total number of 1s seen in target (treated) observations •  $Y_C$ is the total number of 1s seen in control observations •  $N_T$ is the total number of target (treated) observations •  $N_C$ is the total number of control observations •  Balanced setting : $f(t) = Y_T(t) - Y_C(t)$
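A minimal sketch of the Qini curve follows, assuming the usual convention in which control positives are rescaled to the size of the treated group among the first $t$ observations (positives in treated minus positives in control times treated count over control count). The row layout and the data are hypothetical, and setting the correction to zero while no control observation has been seen is an arbitrary choice made here.

```python
def qini_curve(rows):
    """rows: (score, group, y) with group 'T' or 'C'. Returns f(t) for t = 1..len(rows)."""
    ranked = sorted(rows, key=lambda r: r[0], reverse=True)
    y_t = y_c = n_t = n_c = 0
    curve = []
    for _, group, y in ranked:
        if group == "T":
            n_t += 1
            y_t += y
        else:
            n_c += 1
            y_c += y
        # Rescale control positives to the treated group size;
        # no correction while no control observation has been seen (assumption).
        scaled_control = y_c * n_t / n_c if n_c else 0.0
        curve.append(y_t - scaled_control)
    return curve


# Hypothetical scored test set mixing both groups
rows = [(0.9, "T", 1), (0.8, "C", 0), (0.7, "T", 1), (0.6, "C", 0),
        (0.5, "T", 0), (0.4, "C", 1), (0.3, "T", 0), (0.2, "C", 1)]
curve = qini_curve(rows)
print(curve)
```

The last point of the curve depends only on the overall response counts, not on the ranking, which is why a random model ends at the global effect of the treatment.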
  • 25. Evaluation : Qini •  Personal intuition : •  We can’t know everything : •  treated that convert, not treated that don’t convert. What would have happened ? •  But we don’t want to see (in our top list) : •  Treated not converting •  Not treated converting •  In $t$ we want to minimize : $N_T(t) - Y_T(t) + Y_C(t)$ •  Very similar to lift, taking into account only negative examples.
  • 26. Evaluation : Qini •  $f(t) = Y_T(t) - Y_C(t)$
  • 27. Evaluation : Qini •  Best model : •  Ranks all positives in target first and all positives in control last. •  No theoretical best model : •  depends on the possibility of a negative effect •  Displayed for no negative effect •  Random model : •  Corresponds to the global effect of the treatment •  Hillstrom dataset : •  For women, the models are comparable and useful •  For men, there are no clear individuals to target
  • 28. Evaluation : Qini •  $f(t) = Y_T(t) - Y_C(t)$
  • 29. Evaluation : Qini •  Back to our study : •  Class modification performs best •  Two models approach performs poorly •  A/B test problem : •  Control dataset is way too small ! •  Class modification model very close to lift •  Two models approach slightly better than random -> would need to redo the A/B test.
  • 30. Conclusion •  Uplift : •  Surprisingly little literature / examples •  The theory is rather easy to test •  Two models •  Class modification •  The intuition and evaluation are not easy to grasp •  On the client side : •  A good lead to select the best offer for a customer
  • 31. A few references •  Data : •  Churn in gaming : WOWAH dataset (blog post to come) •  Uplift for healthcare : Colon Dataset •  Uplift in mailing : Hillstrom data challenge •  Uplift in General : Simulated data : (blog post to come)
  • 32. A few references •  Application •  Uplift modeling for clinical trial data (Jaskowski, Jaroszewicz) •  Uplift Modeling in Direct Marketing (Rzepakowski, Jaroszewicz)
  • 33. A few references •  Modeling techniques : •  Rzepakowski Jaroszewicz 2011 (decision trees) •  Soltys Rzepakowski Jaroszewicz 2013 (ensemble for uplift) •  Jaskowski Jaroszewicz 2012 (Class modification model)
  • 34. A few references •  Evaluation •  Using Control Groups to Target on Predicted Lift (Radcliffe) •  Testing a New Metric for Uplift Models (Mesalles Naranjo)
  • 35. Thank you for your attention !