Airbnb offline experiments
Elena Grewal presented these slides on A/B testing in the real world (offline experiments, not online) at the Big Data Innovation Summit on April 9, 2014.

Published in: Data & Analytics

  1. A/B Testing In The Real World: How to run experiments in an “offline” setting. Elena Grewal, 2014-04-09, Big Data Innovation Summit
  2. The Plan: (1) “Offline” experiments: what and why? (2) Some experiment pitfalls and advice (3) Conclusions
  3. But first, you might ask: What is Airbnb?
  4. Airbnb is an online marketplace for accommodations. Part of the “sharing economy”
  5. Search in San Francisco
  6. Come Stay In My Home!
  7. That looks like a website. What do we mean by “offline”?
  8. Guest Journey
  9. Host Journey
  10. Offline Operations Departments
      + Customer Support
      + Local Operations
      + Professional Photography
      + Many others…
  11. Customer Support
  12. Local Ops Teams
  13. Photography
      + 3,000 Photographers worldwide
      + Over 100k listings photographed
      + Almost 2 million professional photos
  14. Stepping back
      + Many companies have offline operations
      + Can optimize these using experiments
      Online Experiments: We run these all the time too. If you are curious about our online experimentation, see Jan Overgoor’s tech talk: http://nerds.airbnb.com/tech-talks/
  15. Why Do We Need Experiments?
  16. Before and after won’t work
      • Often very little data before professional photos are added
      • Seasonality and other confounding factors bias results
  17. Selection bias often impacts analysis
      • Listings that opt to get professional photography are not the same as listings that do not get photography
  18. Without an experiment, we don’t know the causal effect. This is the same reason we need online experiments.
      [Figure: timeline, dates 01-01 to 03-15, marking Product Launch and Product Rollback. Launch initiative: e.g. Offered Free Professional Photography]
  19. Traditional A/B Testing Online
      [Diagram: Control, Treatment]
      Great sources:
      http://mcfunley.com/design-for-continuous-experimentation
      http://www.evanmiller.org/how-not-to-run-an-ab-test.html
  20. Initial Results Look Good
      [Figure: Treatment Effect for Price Filter Experiment. Top panel: Delta (-5% to 5%) vs. days since start of experiment (0 to 36). Bottom panel: P-Value (0.00 to 0.40) vs. days since start of experiment, with p < 0.05 marked “significant”.]
      Δ > 0 : “positive”
  21. Actually, Neutral
      [Figure: Treatment Effect for Price Filter Experiment, same Delta and P-Value panels over days 0 to 36, with p < 0.05 marked “significant”.]
      Statistical significance by itself does not tell the whole story.
      p = 0.4 : “noise”; Δ = 0 : “neutral”
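The pattern on slides 20 and 21 (an early "significant" delta that later turns neutral) can be reproduced with a small simulation. The sketch below is an illustration, not the analysis behind the slides: it assumes an A/A test (no true effect), a pooled two-proportion z-test, and hypothetical values for traffic (`users_per_day`) and conversion rate (`rate`). It compares how often a test looks significant on at least one of 36 daily checks with how often it is significant on the final day only.

```python
import math
import random


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    # P(|Z| > |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_aa_test(rng, days=36, users_per_day=100, rate=0.10):
    """Daily cumulative p-values for one A/A test (both arms share `rate`)."""
    conv_a = conv_b = n_a = n_b = 0
    p_values = []
    for _ in range(days):
        n_a += users_per_day
        n_b += users_per_day
        conv_a += sum(rng.random() < rate for _ in range(users_per_day))
        conv_b += sum(rng.random() < rate for _ in range(users_per_day))
        p_values.append(two_proportion_p_value(conv_a, n_a, conv_b, n_b))
    return p_values


def false_positive_rates(n_sims=300, seed=0):
    """Share of A/A tests that are 'significant' on any day vs. on the last day."""
    rng = random.Random(seed)
    ever = final = 0
    for _ in range(n_sims):
        p_values = simulate_aa_test(rng)
        ever += min(p_values) < 0.05
        final += p_values[-1] < 0.05
    return ever / n_sims, final / n_sims


if __name__ == "__main__":
    ever, final = false_positive_rates()
    print(f"significant on some day: {ever:.0%}, significant on final day: {final:.0%}")
```

Under these assumptions the "significant on some day" share is several times the nominal 5%, which is one reason to fix the run length in advance rather than stop at the first p < 0.05.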
  22. Offline Experiment Examples
  23. Professional Photography: Let’s run an experiment! More bookings?
  24. Beware of Cannibalization. The unit of randomization depends on the effect we want to estimate
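A toy calculation can make the cannibalization point on slide 24 concrete. The sketch below rests on a deliberately extreme assumption that is not in the slides: total bookings in a market are fixed, and guests spread across listings in proportion to an attractiveness weight. The names `photo_boost` and `total_demand` and their values are hypothetical.

```python
def expected_bookings(weights, total_demand):
    """Split a fixed number of bookings across listings in proportion to weight."""
    total_weight = sum(weights)
    return [total_demand * w / total_weight for w in weights]


def cannibalization_example(n_listings=100, total_demand=1000.0, photo_boost=1.5):
    """Compare listing-level lift with market-level lift under fixed demand."""
    half = n_listings // 2
    # Listing-level randomization: the first half gets professional photos.
    weights = [photo_boost] * half + [1.0] * (n_listings - half)
    bookings = expected_bookings(weights, total_demand)

    treated_avg = sum(bookings[:half]) / half
    control_avg = sum(bookings[half:]) / (n_listings - half)
    listing_level_lift = treated_avg / control_avg - 1

    # Same market with no photos at all: demand is fixed, so the total is unchanged.
    baseline_total = sum(expected_bookings([1.0] * n_listings, total_demand))
    market_level_lift = sum(bookings) / baseline_total - 1
    return treated_avg, control_avg, listing_level_lift, market_level_lift


if __name__ == "__main__":
    treated, control, listing_lift, market_lift = cannibalization_example()
    print(f"treated: {treated:.1f}, control: {control:.1f} bookings per listing")
    print(f"listing-level lift: {listing_lift:.0%}, market-level lift: {market_lift:.0%}")
```

In this toy market, treated listings gain bookings only by taking them from control listings, so a listing-level comparison shows a large lift while the market total does not move. Real demand is unlikely to be perfectly fixed, so this is a worst case, but it shows why the unit of randomization has to match the effect being estimated.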
  25. Local Operations: Market Level Experiment
      + Smaller “long tail” markets (< 100 reviewed listings)
      Randomize Markets: 93 Treatment / 92 Control
      Assess impact of operational strategy on market growth
      + Statistically measure the lift due to local ops teams
      + Measuring active listings, hosts, reviewed listings, and bookings
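Market-level assignment of the kind described on slide 25 can be sketched in a few lines. The slides give only the group sizes (93 treatment, 92 control); the market identifiers, the seed, and the use of a simple shuffle (rather than, say, stratification by region or size) are assumptions made for illustration.

```python
import random


def randomize_markets(market_ids, n_treatment, seed=2014):
    """Randomly split markets into treatment and control groups."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(market_ids)
    rng.shuffle(shuffled)
    return {
        "treatment": sorted(shuffled[:n_treatment]),
        "control": sorted(shuffled[n_treatment:]),
    }


# Hypothetical identifiers for the 185 long-tail markets.
markets = [f"market_{i:03d}" for i in range(185)]
assignment = randomize_markets(markets, n_treatment=93)

if __name__ == "__main__":
    print(len(assignment["treatment"]), "treatment /", len(assignment["control"]), "control")
```

Because whole markets are assigned, every listing in a market shares the same condition, and outcomes such as active listings or bookings are then compared per market rather than per listing.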
  26. Market Distribution U.S. & Europe
  27. Finding: Local Ops Efforts Have Positive Impact on Growth
      [Figure: Active Listings over time, with the Local Ops Kickoff marked.] Control: 17% Growth. Treatment: 31% Growth.
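The slides do not say how the difference between treatment and control growth was tested. One option that suits a modest number of markets is a permutation test on per-market growth rates, sketched below. The function name and the growth figures in the usage example are invented for illustration and are not Airbnb data.

```python
import random
from statistics import mean


def permutation_test(treatment, control, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in mean per-market growth."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        # Re-label markets at random, as if assignment had no effect.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_t]) - mean(pooled[n_t:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up growth rates for a handful of hypothetical markets.
    treated_growth = [0.30, 0.35, 0.28, 0.40, 0.33, 0.29]
    control_growth = [0.15, 0.20, 0.12, 0.18, 0.17, 0.16]
    diff, p_value = permutation_test(treated_growth, control_growth)
    print(f"difference in mean growth: {diff:.3f}, p-value: {p_value:.4f}")
```

The test treats each market as one observation, which matches the unit of randomization in a market-level experiment.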
  28. Case Study: Campos do Jordão, BR
      + Market grew 9x
      + Over 90% of the new listings are from new users
      + Low CPA
      + Primary approach is phone sales
      + Other approaches were less successful
      [Figure: Active Listing Growth, Treatment vs. Control: +862% vs. +7%]
      Use qualitative research to understand what happened
  29. Host Education: Improving listings through outreach
      + Initially not launched as an experiment and found positive impact
      + Launched as an experiment and found neutral impact
      + Don’t need market level approach here!
  30. Some takeaways: Use context to improve operations
      + Can investigate heterogeneity in treatment effects with higher N
      + Word of caution: can’t just compare those who were reached by a call or email to the control (selection bias strikes again)
  31. Compare entire treatment to entire control
      [Diagram: Treatment vs. Control, with a “Called” label]
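The caution on slides 30 and 31 can be illustrated with a simulation in which outreach has no effect at all. The sketch assumes a hidden "engaged" trait that raises both a host's outcome and the chance they pick up a call; the trait, the probabilities, and the outcome scale are all hypothetical. It contrasts the comparison of the entire treatment group against control (often called intent-to-treat) with the naive comparison of reached hosts against control.

```python
import random
from statistics import mean


def simulate_outreach(n_hosts=20000, seed=1):
    """Return (entire-treatment vs. control, reached-only vs. control) differences."""
    rng = random.Random(seed)
    treatment, control, reached = [], [], []
    for _ in range(n_hosts):
        engaged = rng.random() < 0.5  # hidden trait, unknown to the analyst
        # Outcome depends only on engagement: the true effect of a call is zero.
        outcome = rng.gauss(10.0 if engaged else 5.0, 2.0)
        if rng.random() < 0.5:  # random assignment to the outreach group
            treatment.append(outcome)
            # Only engaged hosts tend to answer, so "reached" is self-selected.
            if engaged and rng.random() < 0.8:
                reached.append(outcome)
        else:
            control.append(outcome)
    entire_vs_control = mean(treatment) - mean(control)
    reached_vs_control = mean(reached) - mean(control)
    return entire_vs_control, reached_vs_control


if __name__ == "__main__":
    itt, naive = simulate_outreach()
    print(f"entire treatment vs. control: {itt:+.2f}")
    print(f"reached only vs. control:     {naive:+.2f}")
```

With these assumptions the entire-treatment comparison comes out close to zero, as it should, while the reached-only comparison reports a large positive "effect" that is purely selection bias.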
  32. Additional Offline vs. Online Considerations
      + Opt-in biases
      + You know you are in an experiment (Hawthorne/John Henry effects)
      + Monetary incentives impact external validity, trade-off take-up rate
      + Takes time to adjust to a change
      + Sample size may be limited by ops capacity
      + Stakeholders may be less data-savvy
      + Real people delivering the experiment!
      + Ethical considerations
      Always partner with customer support.
  33. Takeaways
      + Controlled experiments are the way to go if you want to make causal inference
      + Use them to optimize operations!
      but:
      + Level of randomization - what impact do you want to measure?
      + Cannibalization
      + Compare the right groups - no selection bias
      + Break down results to get the most from the analysis
      + Be practical/ethical - you are dealing with real people here
  34. Questions? @elenatej, elena.grewal@airbnb.com. We’re hiring: www.airbnb.com/jobs
