Elena Grewal presented these slides on A/B testing in the real world (offline experiments, not online) at the Big Data Innovation Summit on April 9, 2014.
14. Stepping back
+ Many companies have offline operations
+ Can optimize these using experiments
Online Experiments:
We run these all the time too.
If you are curious about our online experimentation, see Jan Overgoor’s tech talk:
http://nerds.airbnb.com/tech-talks/
16. Before and after won’t work
• Often very little data before professional photos are added
• Seasonality and other confounding factors bias results
17. Selection bias often impacts analysis
• Listings that opt in to professional photography are not the same as listings that do not
18. Without an experiment, we don’t know the causal effect
This is the same reason we need online experiments
[Chart: metric over time (Jan–Mar) around a product launch and later rollback; launch initiative: e.g. offered free professional photography]
19. Traditional A/B Testing Online
Great sources:
http://mcfunley.com/design-for-continuous-experimentation
http://www.evanmiller.org/how-not-to-run-an-ab-test.html
[Diagram: users randomly split into Control and Treatment groups]
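The Evan Miller post above catalogs common significance-testing mistakes; as a concrete reference point, the standard two-proportion z-test for comparing control and treatment conversion can be sketched as follows (illustrative counts, not figures from the talk):

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via erf)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Illustrative numbers: 200/5000 conversions in control, 240/5000 in treatment
z, p = two_proportion_ztest(200, 5000, 240, 5000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Deciding sample size and stopping rules before looking at the data is the main discipline the linked posts argue for; the test itself is the easy part.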
25. Local Operations: Market Level Experiment
+ Smaller “long tail” markets (< 100 reviewed listings)
+ Randomize markets: 93 treatment / 92 control
+ Assess the impact of operational strategy on market growth
+ Statistically measure the lift due to local ops teams
+ Measure active listings, hosts, reviewed listings, and bookings
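The market-level randomization described above amounts to a shuffle-and-split. A minimal sketch, with hypothetical market names standing in for the 185 long-tail markets from the talk:

```python
import random

def assign_markets(markets, seed=2014):
    """Shuffle markets and split them into treatment and control halves."""
    rng = random.Random(seed)  # fixed seed makes the assignment reproducible
    shuffled = list(markets)
    rng.shuffle(shuffled)
    mid = (len(shuffled) + 1) // 2  # an odd count gives treatment one extra
    return shuffled[:mid], shuffled[mid:]

# Hypothetical names standing in for 185 long-tail markets
markets = [f"market_{i}" for i in range(185)]
treatment, control = assign_markets(markets)
print(len(treatment), len(control))  # -> 93 92
```

In practice one might also stratify the shuffle by market size or region so the two groups are balanced on known covariates.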
27. Finding: Local Ops Efforts Have Positive Impact on Growth
[Chart: active listings over time around the local ops kickoff; control markets grew 17%, treatment markets 31%]
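One way to “statistically measure the lift” with only ~185 market-level observations is a permutation test on per-market growth rates. A sketch on simulated data (growth numbers here are synthetic, not Airbnb’s):

```python
import random
import statistics

def permutation_test(treat, ctrl, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference in mean growth."""
    rng = random.Random(seed)
    observed = statistics.fmean(treat) - statistics.fmean(ctrl)
    pooled = list(treat) + list(ctrl)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel markets at random under the null
        diff = (statistics.fmean(pooled[:len(treat)])
                - statistics.fmean(pooled[len(treat):]))
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm

# Simulated per-market growth rates (treatment centered near 31%, control 17%)
sim = random.Random(1)
treat = [sim.gauss(0.31, 0.10) for _ in range(93)]
ctrl = [sim.gauss(0.17, 0.10) for _ in range(92)]
lift, p = permutation_test(treat, ctrl)
print(f"lift = {lift:.3f}, p = {p:.4f}")
```

Permutation tests make no normality assumptions, which matters when the unit of analysis is a few dozen heterogeneous markets rather than thousands of users.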
28. Case Study: Campos do Jordão, BR
+ Market grew 9x
+ Over 90% of the new listings are from new users
+ Low CPA (cost per acquisition)
+ Primary approach is phone sales
+ Other approaches were less successful
Use qualitative research to understand what happened.
[Chart: active listing growth, treatment (+862%) vs. control (+7%)]
29. Host Education
Improving listings through outreach
+ Initially not launched as an experiment, and the analysis suggested a positive impact
+ Relaunched as an experiment and found a neutral impact
+ No need for a market-level approach here
30. Some takeaways
Use context to improve operations
+ Can investigate heterogeneity in treatment effects with higher N
+ Word of caution: can’t just compare those who were reached by a call or email to the control (selection bias strikes again)
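That caution is the intent-to-treat principle: compare groups by random assignment, not by who was actually reached. A toy illustration with hypothetical records:

```python
import statistics

# Hypothetical records: (assigned_group, was_reached_by_ops, growth_metric)
records = [
    ("treatment", True, 1.4), ("treatment", False, 0.9),
    ("treatment", True, 1.6), ("treatment", False, 1.0),
    ("control", False, 0.9), ("control", False, 1.1),
    ("control", False, 1.0), ("control", False, 1.2),
]

def itt_effect(rows):
    """Intent-to-treat: compare by random ASSIGNMENT, ignoring reach."""
    treat = [m for g, _, m in rows if g == "treatment"]
    ctrl = [m for g, _, m in rows if g == "control"]
    return statistics.mean(treat) - statistics.mean(ctrl)

def reached_only_effect(rows):
    """Biased comparison: only those actually reached vs. the control."""
    reached = [m for g, r, m in rows if g == "treatment" and r]
    ctrl = [m for g, _, m in rows if g == "control"]
    return statistics.mean(reached) - statistics.mean(ctrl)

print(f"ITT effect:   {itt_effect(records):.3f}")
print(f"Reached-only: {reached_only_effect(records):.3f}")
```

The reached-only estimate is inflated because the hosts ops can reach are systematically different from those they can’t; the ITT estimate stays honest to the randomization.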
32. Additional Offline vs. Online Considerations
+ Opt-in biases
+ You know you are in an experiment (Hawthorne/John Henry effects)
+ Monetary incentives affect external validity; there is a trade-off with take-up rate
+ Takes time to adjust to a change
+ Sample size may be limited by ops capacity
+ Stakeholders may be less data-savvy
+ Real people delivering the experiment!
+ Ethical considerations
Always partner with customer support.
33. Takeaways
+ Controlled experiments are the way to go if you want to make causal inferences
+ Use them to optimize operations!
But:
+ Level of randomization - what impact do you want to measure?
+ Cannibalization
+ Compare the right groups - no selection bias
+ Break down results to get the most from the analysis
+ Be practical/ethical - you are dealing with real people here