Data-Driven Off a Cliff
Anti-patterns in evidence-based decision making
Ketan Gangatirkar & Tom Wilbur
I help
people
get jobs.
Indeed is the #1 job site worldwide
Headquartered in Austin, Texas
We have tons of ideas
We have tons of bad ideas
Occasionally, we have good ideas
It’s hard to tell the difference
What helps people get jobs?
The only reliable way is to see what works
XKCD http://bit.ly/1JWz6Qh
We set up experiments
We collect results
We use the data to decide what to do
We’ve used data to make good decisions
But having data is not a silver bullet
We’ve also used data to make bad decisions
Science is hard
Problem
Running an experiment can ruin the experiment
Wikipedia http://bit.ly/1LkLPiP
Change Effect on productivity
Brighter light UP
Dimmer light UP
Warmer UP
Cooler UP
Shorter breaks UP
Longer breaks UP
Change Effect on productivity
Brighter light UP (temporarily)
Dimmer light UP (temporarily)
Warmer UP (temporarily)
Cooler UP (temporarily)
Shorter breaks UP (temporarily)
Longer breaks UP (temporarily)
Problem
Statistics are hard
Anscombe’s Quartet
Wikipedia http://bit.ly/2dlTUci
Simpson’s Paradox
Wikipedia http://bit.ly/1OHFSOk
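A minimal sketch of how Simpson's Paradox shows up in numbers. The figures are the kidney-stone treatment counts commonly used to illustrate the paradox, not data from this talk; the names `data`, `subgroup_rate`, and `overall_rate` are made up for the illustration.

```python
from fractions import Fraction

# (successes, trials) per treatment and stone size.
# Illustrative textbook figures, not data from the slides.
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def subgroup_rate(treatment, group):
    """Success rate of one treatment within one subgroup."""
    successes, trials = data[treatment][group]
    return Fraction(successes, trials)


def overall_rate(treatment):
    """Success rate of one treatment with the subgroups pooled."""
    successes = sum(s for s, _ in data[treatment].values())
    trials = sum(n for _, n in data[treatment].values())
    return Fraction(successes, trials)


if __name__ == "__main__":
    for group in ("small", "large"):
        a, b = subgroup_rate("A", group), subgroup_rate("B", group)
        print(f"{group}: A={float(a):.3f} B={float(b):.3f}")
    print(f"overall: A={float(overall_rate('A')):.3f} B={float(overall_rate('B')):.3f}")
```

A wins inside each subgroup, yet B wins once the subgroups are pooled, because the two treatments were applied to very different mixes of small and large cases.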
Using data is more than just statistics
+ + + +
=
Good math. Bad idea.
Bad practices can undermine good math
You don’t need me to teach you
to be bad at math
I’ll teach you to be bad at everything else
Anti-Lesson 01
Be impatient
p-value is the standard measure of
statistical significance
p-value is by measurement, not experiment
If you check results on Monday,
that’s one measurement
If you check results on Tuesday,
that’s another measurement
Got the result you want?
Declare victory!
Move quickly! Because
results and p-values can shift fast
80% of “winning” A/B tests stopped early
are false positives
http://bit.ly/1LtaLkV
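A rough simulation of why checking every day and stopping at the first "significant" result inflates false positives. This is a sketch under assumptions: both arms share the same true conversion rate (an A/A test), and the traffic volume, test length, and 0.05 threshold are hypothetical values, not figures from the talk. It does not attempt to reproduce the 80% number above.

```python
import math
import random


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test p-value using the normal approximation."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_a / n_a - conv_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n_experiments=500, days=14, users_per_day=100,
             rate=0.10, alpha=0.05, seed=1):
    """Return (false-positive rate with daily peeking, rate with one final check)."""
    rng = random.Random(seed)
    peeking_wins = 0
    patient_wins = 0
    for _ in range(n_experiments):
        conv_a = conv_b = n = 0
        ever_significant = False
        p = 1.0
        for _day in range(days):
            # Both arms draw from the same true rate, so any "win" is noise.
            conv_a += sum(rng.random() < rate for _ in range(users_per_day))
            conv_b += sum(rng.random() < rate for _ in range(users_per_day))
            n += users_per_day
            p = two_sided_p_value(conv_a, n, conv_b, n)
            if p < alpha:
                ever_significant = True  # an impatient team would stop here
        peeking_wins += ever_significant
        patient_wins += p < alpha  # only the planned final measurement counts
    return peeking_wins / n_experiments, patient_wins / n_experiments


if __name__ == "__main__":
    peeking, patient = simulate()
    print(f"false positives with daily peeking: {peeking:.1%}")
    print(f"false positives with one final check: {patient:.1%}")
```

Because every experiment that is significant at the final check also counts as a peeking "win", the peeking rate can never be lower than the patient rate; in practice it is considerably higher.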
Anti-Lesson 02
Sampling is easy
Beware the IEdes of March
Story
Building Used Cars Search
Shoppers specifying price, mileage
or year do better
Nudge shoppers to specify price,
mileage or year
+3% conversion
After rollout, conversion > +3%
Why?
We’d taken a shortcut in our
test assignment code
X
Users on oldest browsers got ignored
Distorted sample → distorted results
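One way to guard against this kind of distortion, sketched under assumptions: assign buckets from a stable user ID so that no browser is skipped, then compare the makeup of the test sample against the overall population. The slides do not show the actual assignment code; `assign_bucket`, `underrepresented_segments`, and the `browser` field are hypothetical names for illustration.

```python
import hashlib
from collections import Counter


def assign_bucket(user_id, experiment, treatment_share=0.5):
    """Deterministically assign a user to a bucket from a hash of their ID.

    Nothing here depends on browser capabilities, so every user gets a bucket.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if fraction < treatment_share else "control"


def segment_shares(records, key):
    """Share of records falling in each value of `key`."""
    counts = Counter(record[key] for record in records)
    total = sum(counts.values())
    return {segment: count / total for segment, count in counts.items()}


def underrepresented_segments(population, sample, key="browser", tolerance=0.5):
    """Segments whose share of the sample is below `tolerance` times their
    share of the population. The threshold is an arbitrary illustration."""
    population_shares = segment_shares(population, key)
    sample_shares = segment_shares(sample, key)
    return sorted(
        segment
        for segment, share in population_shares.items()
        if sample_shares.get(segment, 0.0) < tolerance * share
    )


if __name__ == "__main__":
    population = [{"browser": "modern"}] * 90 + [{"browser": "legacy"}] * 10
    sample = [{"browser": "modern"}] * 50  # legacy browsers silently dropped
    print(underrepresented_segments(population, sample))
```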
Anti-Lesson 03
Look only at one metric
If a little bit is good, a lot is great
Indeed has a heart
Story
❤ > ★ ?
+16% Saves on search results page
Everyone ❤s ❤s!
❤s everywhere!
Hearted
Not so fast
Did ❤ help people get jobs?
❤ jobs: +16%
Clicks: no change
Applies: no change
Hires: no change
I help
people
❤ jobs.
Upsell team
Story
We formed an “upsell team”
and measured their results
+ =
Success measure
It’s working!
Upsells
So why isn’t revenue moving?
Overall Revenue
+ 0 -
= ⅓+⅓ -⅓
What you measure is what you motivate
Redefine success to include all outcomes
Upsell Team revenue +200%
Anti-Lesson 03: Reloaded
Look at all the metrics
It's better for them. Is it better for us?
Job applications: Up
Job clicks: Down
Recommended Jobs traffic: Up
Job views: Sideways
New resumes: Up
Return visits: Down
Logins: Up
Revenue: Down
(and it goes on…)
We didn’t really know what we wanted
Too much noise from too many metrics
I help
people
get jobs.
Anti-Lesson 04
Be sloppy with your analysis
We engineer features rigorously
Specification
Source control
Code review
Automated tests
Manual QA
Metrics
Monitors
...
But analysis…
Bad analysis won’t take down Indeed.com
200 million job seekers don’t care
about our sales projections
So we don’t try as hard with analysis code
Specification
Source control
Code review
Automated tests
Manual QA
Metrics
Monitors
...
Dubliners
Story
Indeed reports on economic trends
South Carolinians wanted to move to Dublin
Dublin?
No, the other one
Incorrect IP location mapping
IP blocks for South Carolina
got reallocated to London, England
Worse things can happen
Growth and Debt
Story
“Growth in a Time of Debt”
Carmen Reinhart and Kenneth Rogoff
2010
Public debt > 90% GDP
leads to slower economic growth
Governments made policy based on this
Fixing the error eliminated the effect
Source: https://goo.gl/zAcd1e
Genetic Mutation
Story
20% of genetics papers have Excel errors
Source: http://wapo.st/2cWyrpJ
SEPT2 to a geneticist is Septin 2
SEPT2 to Excel is 42615
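Where the 42615 comes from, as a small sketch. Excel stores dates as day counts, and for modern dates the count matches the number of days since 30 December 1899; serial 42615 is 2 September 2016. The `looks_like_date` check is a hypothetical heuristic for flagging at-risk gene symbols before they reach a spreadsheet, not a tool mentioned in the talk.

```python
import re
from datetime import date, timedelta

# For dates after February 1900, Excel serial numbers line up with
# days elapsed since this date.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial):
    """Convert an Excel date serial number to a calendar date."""
    return EXCEL_EPOCH + timedelta(days=serial)


def date_to_excel_serial(day):
    """Convert a calendar date to its Excel serial number."""
    return (day - EXCEL_EPOCH).days


# Hypothetical heuristic: a month name or abbreviation followed by a
# one- or two-digit number is likely to be auto-converted to a date.
_DATE_LIKE = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)-?\d{1,2}$",
    re.IGNORECASE,
)


def looks_like_date(symbol):
    """True if a spreadsheet might silently read this symbol as a date."""
    return bool(_DATE_LIKE.match(symbol))


if __name__ == "__main__":
    print(excel_serial_to_date(42615))
    print([s for s in ("SEPT2", "MARCH1", "TP53") if looks_like_date(s)])
```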
Does your company use spreadsheets?
How do you know they’re correct?
Under-spending Advertisers
Story
Employer budgets ran out
before the end of the day
So no evening job seekers saw the jobs
How big was this missed opportunity?
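Clicks received: 1260
Out of budget time: 20:00
% of day w/o budget: 0.1667
Potential clicks: 1260 / (1 - 0.1667) = 1512
Missed clicks: 1512 * 0.1667 = 260
Missed Clicks Report
Dear Customer,
You got 1,260 clicks yesterday. Your daily budget ran out at 8:00pm. If you funded your budget through the whole day, you’d get another 260 clicks - a +20% improvement!
Get More Clicks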
[Chart: Assumption. Y-axis 0 to 100, x-axis time of day from 0:00 to 0:00]
Missed = 260 clicks (+20%)
[Chart: Reality. Y-axis 0 to 100, x-axis time of day from 0:00 to 0:00]
Missed = 100 clicks (+8%)
Naive analysis → bad recommendation
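The gap between the two estimates comes down to one assumption: how much of a day's traffic arrives after the budget runs out. The sketch below contrasts a uniform day with a hypothetical daytime-heavy profile. The hourly weights are invented for illustration and are not Indeed's traffic data, so the second figure will not match the slide's 100 clicks. With straight arithmetic the uniform case works out to about 252 missed clicks, close to the 260 quoted on the slide.

```python
def share_after_cutoff(hourly_weights, cutoff_hour):
    """Fraction of a day's traffic that arrives at or after `cutoff_hour`."""
    return sum(hourly_weights[cutoff_hour:]) / sum(hourly_weights)


def missed_clicks(clicks_received, share_missed):
    """Clicks lost, given that `share_missed` of daily traffic went unserved.

    clicks_received covers the remaining (1 - share_missed) of the day.
    """
    return clicks_received * share_missed / (1 - share_missed)


# Assumption behind the naive report: every hour carries the same traffic.
UNIFORM = [1.0] * 24

# Hypothetical profile: quiet overnight, busy 8:00 to 18:00, tapering evening.
DAYTIME_HEAVY = [1.0] * 8 + [10.0] * 10 + [5.0] * 2 + [2.0] * 4


if __name__ == "__main__":
    for name, weights in (("uniform", UNIFORM), ("daytime-heavy", DAYTIME_HEAVY)):
        share = share_after_cutoff(weights, cutoff_hour=20)
        print(f"{name}: {share:.1%} of traffic after 20:00, "
              f"about {missed_clicks(1260, share):.0f} clicks missed")
```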
Anti-Lesson 05
Only look for expected outcomes
Zero results pages from misspelled locations
Goals: fewer ZRPs, more job clicks
Zero-results pages
-2.7%
Job clicks
+8%
+1,410%
Ad revenue
ads
Ad revenue after fix
Treatment on
homepage
Effect on
search page
Anti-Lesson 06
Metrics, not stories
I help
people
get jobs.
How do I know if people got jobs?
I need employers to tell me
One employer hired 4500 people in 45 minutes!
Nope
Accurate recording of outcomes helps us
It doesn’t help employers
They don't care about using
the product “right”
Go away!
There is no “user story”
Right metrics + wrong story = wrong conclusion
Anti-Lesson 06: Parte Deux
Story over metrics
Stories are seductive
Even incorrect stories are seductive
Taste Buds
Story
Taste map
Totally wrong
Every bite you eat proves it’s wrong
People still believe it
Job Alerts
Story
Success for emails is well understood
New subscriptions: Good
Email opens: Good
Clicking on stuff: Good
Unsubscribing: Bad
I help
people
get emails.
I help
people
get jobs.
What does job seeker success look like?
01
Search for jobs
02
Sign up for alerts
03
Click on some jobs
04
Apply to some jobs
05
Get a job!
06
Unsubscribe from emails
People with new jobs don't need job alerts
The standard story for email fails here
Light and Dark Redux
Story
It’s a persuasive story
But the original study was flawed
Hawthorne Revisited
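“… the variance in productivity could be fully accounted for by the fact that the lighting changes were made on Sundays and therefore followed by Mondays when workers’ productivity was refreshed by a day off.”
https://en.wikipedia.org/wiki/Hawthorne_effect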
We con people with stories
We con ourselves with stories
Anti-Lesson 07
Believe in yourself
Believing in yourself can be good
“My startup will succeed.”
Often it’s bad
“I’d never fall for a scam like that.”
“I knew it all along.”
“I’m too smart to make that mistake.”
Every story of mistakes is deceptive
We tell stories with 20/20 hindsight
When we live the story, we live in the fog
You won’t think you’re making a mistake
Search your past for mistakes
Painful, embarrassing mistakes
If you didn’t find any, you’re exceptional
Either you’re making mistakes you find
Or you’re making mistakes you don’t find
How do you defend against mistakes?
The first step is admitting you have a problem
There are 174 cognitive biases
[citation needed]
Data can help you make better decisions
Or more confidently make bad decisions
Data can’t make you a better decision-maker
Good data + bad decision-maker = bad decision
Our anti-lessons teach you
how to use data badly
Do the opposite to do better
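Lesson 01: Be patient
Lesson 02: Sampling is hard
Lesson 03: Focus on a few, carefully chosen metrics
Lesson 04: Be rigorous with your analysis
Lesson 05: Watch out for side effects
Lesson 06: Use metrics and stories
Lesson 07: Plan for fallibility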
Learn from our mistakes
Be prepared for your own
Learn More
Engineering blog & talks http://indeed.tech
Open Source http://opensource.indeedeng.io
Careers http://indeed.jobs
Twitter @IndeedEng
Questions?
Contact us
ketan@indeed.com | twilbur@indeed.com
Seriously, that was the end
Contact us
ketan@indeed.com | twilbur@indeed.com
There are no more slides
Contact us
ketan@indeed.com | twilbur@indeed.com
Stop here
Contact us
ketan@indeed.com | twilbur@indeed.com
Data-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision Making

Link to this presentation: http://engineering.indeedblog.com/talks/data-driven-off-a-cliff/


Data-Driven off a Cliff: Anti-Patterns in Evidence-Based Decision Making

  1. 1. Data-Driven Off a Cliff Anti-patterns in evidence-based decision making Ketan Gangatirkar & Tom Wilbur
  2. 2. Data-Driven Off a Cliff Anti-patterns in evidence-based decision making Ketan Gangatirkar & Tom Wilbur
  3. 3. I help people get jobs.
  4. 4. Indeed is the #1 job site worldwide
  5. 5. Headquartered in Austin, Texas
  6. 6. We have tons of ideas
  7. 7. We have tons of bad ideas
  8. 8. Occasionally, we have good ideas
  9. 9. It’s hard to tell the difference
  10. 10. What helps people get jobs?
  11. 11. The only reliable way is to see what works
  12. 12. XKCD http://bit.ly/1JWz6Qh
  13. 13. We set up experiments
  14. 14. We collect results
  15. 15. We use the data to decide what to do
  16. 16. We’ve used data to make good decisions
  17. 17. But having data is not a silver bullet
  18. 18. We’ve also used data to make bad decisions
  19. 19. Science is hard
  20. 20. Problem Running an experiment can ruin the experiment
  21. 21. Wikipedia http://bit.ly/1LkLPiP
  22. 22. Change Effect on productivity Brighter light UP Dimmer light UP Warmer UP Cooler UP Shorter breaks UP Longer breaks UP
  23. 23. Change Effect on productivity Brighter light UP (temporarily) Dimmer light UP (temporarily) Warmer UP (temporarily) Cooler UP (temporarily) Shorter breaks UP (temporarily) Longer breaks UP (temporarily)
  24. 24. Change Effect on productivity Brighter light UP (temporarily) Dimmer light UP (temporarily) Warmer UP (temporarily) Cooler UP (temporarily) Shorter breaks UP (temporarily) Longer breaks UP (temporarily)
  25. 25. Problem Statistics are hard
  26. 26. Anscombe’s Quartet Wikipedia http://bit.ly/2dlTUci
  27. 27. Simpson’s Paradox
  28. 28. Simpson’s Paradox Wikipedia http://bit.ly/1OHFSOk
  29. 29. Using data is more than just statistics
  30. 30. + + + + = Good math. Bad idea.
  31. 31. Bad practices can undermine good math
  32. 32. You don’t need me to teach you to be bad at math
  33. 33. I’ll teach you to be bad at everything else
  34. 34. Anti-Lesson 01 Be impatient
  35. 35. p-value is the standard measure of statistical significance
  36. 36. p-value is by measurement, not experiment
  37. 37. If you check results on Monday, that’s one measurement
  38. 38. If you check results on Tuesday, that’s another measurement
  39. 39. Got the result you want?
  40. 40. Declare victory!
  41. 41. Move quickly! Because results and p-values can shift fast
  42. 42. 80% of “winning” A/B tests stopped early are false positives http://bit.ly/1LtaLkV
  43. 43. Anti-Lesson 02 Sampling is easy
  44. 44. Beware the IEdes of March Story
  45. 45. Building Used Cars Search
  46. 46. Shoppers specifying price, mileage or year do better
  47. 47. Nudge shoppers to specify price, mileage or year
  48. 48. +3% conversion
  49. 49. After rollout, conversion > +3%
  50. 50. Why?
  51. 51. We’d taken a shortcut in our test assignment code X
  52. 52. Users on oldest browsers got ignored
  53. 53. Distorted sample → distorted results
  54. 54. Anti-Lesson 03 Look only at one metric
  55. 55. If a little bit is good, a lot is great
  56. 56. Indeed has a heart Story
  57. 57. ❤ > ★ ?
  58. 58. +16% Saves on search results page
  59. 59. Everyone ❤s ❤s!
  60. 60. ❤s everywhere!
  61. 61. Hearted
  62. 62. Not so fast
  63. 63. Did ❤ help people get jobs?
  64. 64. ❤ jobs: +16% Clicks: no change Applies: no change Hires: no change
  65. 65. I help people ❤ jobs.
  66. 66. Upsell team Story
  67. 67. We formed an “upsell team” and measured their results
  68. 68. + = Success measure
  69. 69. It’s working! Upsells
  70. 70. So why isn’t revenue moving? Overall Revenue
  71. 71. + 0 -
  72. 72. = ⅓+⅓ -⅓
  73. 73. What you measure is what you motivate
  74. 74. Redefine success to include all outcomes
  75. 75. Upsell Team revenue +200%
  76. 76. Anti-Lesson 03: Reloaded Look at all the metrics
  77. 77. It's better for them. Is it better for us?
  78. 78. Job applications: Up Job clicks: Down Recommended Jobs traffic: Up Job views: Sideways New resumes: Up Return visits: Down Logins: Up Revenue: Down (and it goes on…)
  79. 79. We didn’t really know what we wanted
  80. 80. Too much noise from too many metrics
  81. 81. I help people get jobs.
  82. 82. Anti-Lesson 04 Be sloppy with your analysis
  83. 83. We engineer features rigorously
  84. 84. Specification Source control Code review Automated tests Manual QA Metrics Monitors ...
  85. 85. But analysis…
  86. 86. Bad analysis won’t take down Indeed.com
  87. 87. 200 million job seekers don’t care about our sales projections
  88. 88. So we don’t try as hard with analysis code
  89. 89. Specification Source control Code review Automated tests Manual QA Metrics Monitors ...
  90. 90. Dubliners Story
  91. 91. Indeed reports on economic trends
  92. 92. South Carolinians wanted to move to Dublin
  93. 93. Dublin?
  94. 94. No, the other one
  95. 95. Incorrect IP location mapping
  96. 96. IP blocks for South Carolina got reallocated to London, England
  97. 97. Worse things can happen
  98. 98. Growth and Debt Story
  99. 99. “Growth in a Time of Debt” Carmen Reinhart and Kenneth Rogoff 2010
  100. 100. Public debt > 90% GDP leads to slower economic growth
  101. 101. Governments made policy based on this
  102. 102. Fixing the error eliminated the effect Source: https://goo.gl/zAcd1e
  103. 103. Genetic Mutation Story
  104. 104. 20% of genetics papers have Excel errors Source: http://wapo.st/2cWyrpJ
  105. 105. SEPT2 to a geneticist is Septin 2
  106. 106. SEPT2 to Excel is 42615
  107. 107. Does your company use spreadsheets?
  108. 108. How do you know they’re correct?
  109. 109. Under-spending Advertisers Story
  110. 110. Employer budgets ran out before the end of the day
  111. 111. So no evening job seekers saw the jobs
  112. 112. How big was this missed opportunity?
  113. 113. Clicks received: 1260; Out of budget time: 20:00; % of day w/o budget: 0.1667; Potential clicks: 1260 / (1 - 0.1667) = 1512; Missed clicks: 1512 * 0.1667 = 260. Missed Clicks Report: Dear Customer, You got 1,260 clicks yesterday. Your daily budget ran out at 8:00pm. If you funded your budget through the whole day, you’d get another 260 clicks - a +20% improvement! Get More Clicks
  114. 114. Assumption [chart: y-axis 0 to 100, x-axis time of day from 0:00 to 0:00] Missed = 260 clicks (+20%)
  115. 115. Reality [chart: y-axis 0 to 100, x-axis time of day from 0:00 to 0:00] Missed = 100 clicks (+8%)
  116. 116. Naive analysis → bad recommendation
  117. 117. Anti-Lesson 05 Only look for expected outcomes
  118. 118. Zero results pages from misspelled locations
  119. 119. Goals: fewer ZRPs, more job clicks
  120. 120. Zero-results pages -2.7%
  121. 121. Job clicks +8%
  122. 122. +1,410% Ad revenue
  123. 123. +1,410% Ad revenue
  124. 124. ads
  125. 125. Ad revenue after fix
  126. 126. Treatment on homepage Effect on search page
  127. 127. Anti-Lesson 06 Metrics, not stories
  128. 128. I help people get jobs.
  129. 129. How do I know if people got jobs?
  130. 130. I need employers to tell me
  131. 131. One employer hired 4500 people in 45 minutes!
  132. 132. Nope
  133. 133. Accurate recording of outcomes helps us
  134. 134. It doesn’t help employers
  135. 135. They don't care about using the product “right”
  136. 136. Go away!
  137. 137. There is no “user story”
  138. 138. Right metrics + wrong story = wrong conclusion
  139. 139. Anti-Lesson 06: Parte Deux Story over metrics
  140. 140. Stories are seductive
  141. 141. Even incorrect stories are seductive
  142. 142. Taste Buds Story
  143. 143. Taste map
  144. 144. Totally wrong
  145. 145. Every bite you eat proves it’s wrong
  146. 146. People still believe it
  147. 147. Job Alerts Story
  148. 148. Success for emails is well understood
  149. 149. New subscriptions: Good Email opens: Good Clicking on stuff: Good Unsubscribing: Bad
  150. 150. I help people get emails.
  151. 151. I help people get jobs.
  152. 152. What does job seeker success look like?
  153. 153. 01 Search for jobs
  154. 154. 02 Sign up for alerts
  155. 155. 03 Click on some jobs
  156. 156. 04 Apply to some jobs
  157. 157. 05 Get a job!
  158. 158. 06 Unsubscribe from emails
  159. 159. People with new jobs don't need job alerts
  160. 160. The standard story for email fails here
  161. 161. Light and Dark Redux Story
  162. 162. It’s a persuasive story
  163. 163. But the original study was flawed
  164. 164. Hawthorne Revisited “… the variance in productivity could be fully accounted for by the fact that the lighting changes were made on Sundays and therefore followed by Mondays when workers’ productivity was refreshed by a day off.” https://en.wikipedia.org/wiki/Hawthorne_effect
  165. 165. We con people with stories
  166. 166. We con ourselves with stories
  167. 167. Anti-Lesson 07 Believe in yourself
  168. 168. Believing in yourself can be good
  169. 169. “My startup will succeed.”
  170. 170. Often it’s bad
  171. 171. “I’d never fall for a scam like that.”
  172. 172. “I knew it all along.”
  173. 173. “I’m too smart to make that mistake.”
  174. 174. Every story of mistakes is deceptive
  175. 175. We tell stories with 20/20 hindsight
  176. 176. When we live the story, we live in the fog
  177. 177. You won’t think you’re making a mistake
  178. 178. Search your past for mistakes
  179. 179. Painful, embarrassing mistakes
  180. 180. If you didn’t find any, you’re exceptional
  181. 181. Either you’re making mistakes you find
  182. 182. Or you’re making mistakes you don’t find
  183. 183. How do you defend against mistakes?
  184. 184. The first step is admitting you have a problem
  185. 185. There are 174 cognitive biases [citation needed]
  186. 186. Data can help you make better decisions
  187. 187. Or more confidently make bad decisions
  188. 188. Data can’t make you a better decision-maker
  189. 189. Good data + bad decision-maker = bad decision
  190. 190. Our anti-lessons teach you how to use data badly
  191. 191. Do the opposite to do better
  192. 192. Lesson 01: Be patient; Lesson 02: Sampling is hard; Lesson 03: Focus on a few, carefully chosen metrics; Lesson 04: Be rigorous with your analysis; Lesson 05: Watch out for side effects; Lesson 06: Use metrics and stories; Lesson 07: Plan for fallibility
  193. 193. Learn from our mistakes
  194. 194. Be prepared for your own
  195. 195. Learn More Engineering blog & talks http://indeed.tech Open Source http://opensource.indeedeng.io Careers http://indeed.jobs Twitter @IndeedEng
  196. 196. Questions? Contact us ketan@indeed.com | twilbur@indeed.com
  197. 197. Seriously, that was the end Contact us ketan@indeed.com | twilbur@indeed.com
  198. 198. There are no more slides Contact us ketan@indeed.com | twilbur@indeed.com
  199. 199. Stop here Contact us ketan@indeed.com | twilbur@indeed.com
