13. Why is Statistical Power important?
1. False negatives
2. False positives
14. Precision
Proportion of true positives in the positive
results
It's a function of power, significance level and
prevalence.
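Written out, the relationship looks like this. It is a sketch of the standard formula, not notation from the slides: prevalence is taken to mean the share of tests with a real effect, and \alpha is the significance level.

```latex
\[
\text{Precision}
  = \frac{\text{power} \times \text{prevalence}}
         {\text{power} \times \text{prevalence} + \alpha \times (1 - \text{prevalence})}
\]
```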
15. If you have good power?
Out of 100 tests
10 really drive uplift
You detect 8
5 false positives
8/13 of positive tests are real
16. If you have bad power?
Out of 100 tests
10 really drive uplift
You detect 3
5 false positives
3/8 of winning tests are real!
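The two scenarios can be checked with a few lines of arithmetic. This is a minimal sketch: the function name `precision` and its argument names are illustrative, and the counts are the ones quoted on the slides.

```python
from fractions import Fraction


def precision(true_positives: int, false_positives: int) -> Fraction:
    """Share of positive (winning) test results that reflect a real uplift."""
    return Fraction(true_positives, true_positives + false_positives)


# Counts from the slides: 100 tests, 10 real effects, 5 false positives.
good_power = precision(true_positives=8, false_positives=5)  # 8 of 10 detected
bad_power = precision(true_positives=3, false_positives=5)   # 3 of 10 detected

print(good_power)  # 8/13
print(bad_power)   # 3/8
```

The number of false positives is the same in both cases; only the number of real effects detected changes, and that alone drags precision from above one half to below it.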
18. Marketer: ‘We need results in 2 weeks’ time’
Me: ‘We can’t run this test for only two weeks; we won’t get robust results’
Marketer: ‘Why are you being so negative?’
19. Calculating Power
Alpha: probability of a positive result when
the null hypothesis is true (5%)
Beta: probability of not seeing a positive
result when the null hypothesis is false
Power = 1- Beta (80-90%)
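For a given sample size, power can be approximated directly. The sketch below assumes a two-sided two-proportion z-test under the normal approximation; the function name `power_two_proportions` and the 5% baseline conversion rate in the usage lines are hypothetical choices, not figures from the slides.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_variant, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    se = sqrt(
        p_control * (1 - p_control) / n_per_arm
        + p_variant * (1 - p_variant) / n_per_arm
    )
    shift = abs(p_variant - p_control) / se  # true effect in standard errors
    # Probability that the test statistic falls beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical example: 5% baseline, 10% relative uplift (5.0% -> 5.5%).
short_test = power_two_proportions(0.05, 0.055, n_per_arm=10_000)
long_test = power_two_proportions(0.05, 0.055, n_per_arm=40_000)
print(f"10,000 visitors per arm: power = {short_test:.2f}")
print(f"40,000 visitors per arm: power = {long_test:.2f}")
```

When the two rates are equal the function returns alpha, which is the false positive rate under the null hypothesis.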
20. Calculating Power
Use a power calculator:
Online
R (power.prop.test)
Python (statsmodels.stats.power)
21. Approximate sample sizes
Using a power calculator and asking for 80%
power and significance level of 5%:
6000 conversions to detect 5% uplift
1600 conversions to detect 10% uplift
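These figures can be roughly reproduced without a calculator. The sketch below rests on several assumptions that the slides do not state: a two-sided test, a low conversion rate (so the variance of a count is close to the count itself), and figures that are conversions per variant. The function name `conversions_needed` is illustrative.

```python
from math import ceil
from statistics import NormalDist


def conversions_needed(relative_uplift, alpha=0.05, power=0.80):
    """Approximate conversions per variant needed to detect a relative uplift.

    Uses n * p ~= 2 * (z_alpha + z_beta)^2 / uplift^2, which holds when the
    conversion rate p is small.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # quantile matching the target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / relative_uplift ** 2)


for uplift in (0.05, 0.10):
    n = conversions_needed(uplift)
    print(f"{uplift:.0%} uplift: about {n} conversions per variant")
```

Halving the uplift you want to detect roughly quadruples the conversions required, which is why small effects need such long tests.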
37. Regression to the mean
Give 100 students a true/false test
They all answer randomly
Take only the top scoring 10% of the class
Test them again
What will the results be?
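The thought experiment is easy to simulate. In this sketch the test length of 20 questions and the fixed random seed are assumptions made for illustration; the slide gives neither.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

N_STUDENTS = 100
N_QUESTIONS = 20  # assumed; the slide does not give a test length


def random_score(n_questions):
    """Number of correct answers when every answer is a coin flip."""
    return sum(random.random() < 0.5 for _ in range(n_questions))


first = [random_score(N_QUESTIONS) for _ in range(N_STUDENTS)]

# Indices of the top-scoring 10% on the first test.
ranked = sorted(range(N_STUDENTS), key=lambda i: first[i], reverse=True)
top = ranked[: N_STUDENTS // 10]

first_mean = sum(first[i] for i in top) / len(top)

# The same students guess again; their earlier score carries no information.
retest = [random_score(N_QUESTIONS) for _ in top]
retest_mean = sum(retest) / len(retest)

print(f"Top 10% on first test: mean {first_mean:.1f} / {N_QUESTIONS}")
print(f"Same students, retest: mean {retest_mean:.1f} / {N_QUESTIONS}")
```

The selected group looks strong on the first test only because it was selected for its score. On the retest it drifts back towards the chance level of half the questions, just as a winning A/B test picked out of many can shrink on validation.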
42. What you need to do to get it right
● Do a power calculation first to estimate
sample size
● Use a valid hypothesis - don’t use a
scattergun approach
● Do not stop the test early
● Perform a second ‘validation’ test
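The cost of stopping early can be shown with simulated A/A tests, where both arms share the same true conversion rate, so every significant result is a false positive. This is a minimal sketch: the 10% conversion rate, 1,000 visitors per arm, 10 interim looks and the function names are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable
Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 5% significance level


def significant(conv_a, conv_b, n):
    """Two-sided two-proportion z-test with n visitors in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return False
    return abs(conv_a - conv_b) / n / se > Z_CRIT


def simulate(n_tests=1000, n_per_arm=1000, looks=10, rate=0.10):
    """Return (false positive rate with peeking, rate with one final look)."""
    step = n_per_arm // looks
    peeking_hits = fixed_hits = 0
    for _ in range(n_tests):
        conv_a = conv_b = 0
        flagged = False
        for look in range(1, looks + 1):
            conv_a += sum(random.random() < rate for _ in range(step))
            conv_b += sum(random.random() < rate for _ in range(step))
            if significant(conv_a, conv_b, look * step):
                flagged = True  # a peeker would have stopped and declared a win
        peeking_hits += flagged
        fixed_hits += significant(conv_a, conv_b, looks * step)
    return peeking_hits / n_tests, fixed_hits / n_tests


peek_rate, fixed_rate = simulate()
print(f"False positive rate, stopping at first significant look: {peek_rate:.1%}")
print(f"False positive rate, single look at the planned end:     {fixed_rate:.1%}")
```

Because the final look is one of the interim looks, the peeking rate can never be lower than the fixed rate, and in practice it is much higher than the nominal 5%.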