Gentle intro to Bayesian Statistics and how it's different from classical frequentist statistics. Assumes you have basic statistical knowledge.
Why "Am I pregnant?" is a question better suited to Bayesian techniques, and not really suited to Frequentist techniques at all!
8. Movie cliché: Am I pregnant?
● What did I do in the past month?
– Forms a prior belief of whether I am pregnant
● The missing period
– Data!
● Belief is updated as more data is observed!
9. Bayesian terminology
● Prior: your belief about pregnancy before
seeing new data
● Data: missing period
● Posterior: your belief that is updated after
seeing the data
11. How do we formalize this update?
● Pregnant is an uncertain event with two
outcomes: Yes or No
● “Days delayed of period” is a data point
– If (Pregnant = Yes), delayed ~ 30*9 days
– If (Pregnant = No), it might come sooner
13. Mathematical framework
● “Pregnant” is a random variable:
– P(Pregnant = Yes) = X
– P(Pregnant = No) = (1 - X)
● “Days delayed of period” is another random
variable!
– P(days delay >= 7 days | Pregnant) = 1
– P(days delay >= 7 days | Not Pregnant) = Y
15. Simplify
● Start with the objective:
Am I pregnant?
i.e. P(Pregnant | Data)?
● Note that the numbers we know are all of the form
P( **** | Pregnant), i.e. conditioned on pregnancy, not on the data
17. Conditional Probability!
P(Pregnant | Data)
= P(Data | Pregnant) P(Pregnant) / P(Data)
Immediate implication:
● If your prior says you cannot be pregnant,
your belief cannot be changed!
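A short worked version of that implication, written as a sketch: it only assumes that the observed data has non-zero probability, so the division is defined.

```latex
P(\text{Pregnant}) = 0
\;\Longrightarrow\;
P(\text{Pregnant} \mid \text{Data})
= \frac{P(\text{Data} \mid \text{Pregnant}) \cdot 0}{P(\text{Data})}
= 0
\qquad \text{whenever } P(\text{Data}) > 0 .
```

No amount of data can move a prior of exactly zero, because the prior multiplies everything in the numerator.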
19. “Bayes Rule”
P(Pregnant | Data)
= P(Data | Pregnant) P(Pregnant) / P(Data)
= P(Data | Pregnant) P(Pregnant) /
[ P(Data | Pregnant) P(Pregnant) +
P(Data | Not Pregnant) P(Not Pregnant) ]
Why add more numbers?
P(Data) was hard to compute, so chop it into
pieces we know!
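The split of P(Data) into two known pieces can be sanity-checked by simulation. This is only a sketch: the function names and the values X = 0.2 and Y = 0.1 are hypothetical, chosen to make the numbers concrete, and P(late | Pregnant) = 1 is taken from the slides.

```python
import random


def p_data_exact(x, y):
    """P(Data) = P(Data | Pregnant) P(Pregnant) + P(Data | Not Pregnant) P(Not Pregnant)."""
    return 1.0 * x + y * (1.0 - x)


def p_data_simulated(x, y, n=200_000, seed=0):
    """Estimate P(Data) by simulating n people and counting late periods."""
    rng = random.Random(seed)
    late = 0
    for _ in range(n):
        pregnant = rng.random() < x
        # Pregnant: the period is always late (probability 1).
        # Not pregnant: the period is late with probability y.
        if pregnant or rng.random() < y:
            late += 1
    return late / n


x, y = 0.2, 0.1  # hypothetical values for X and Y
print(p_data_exact(x, y))      # about 0.28
print(p_data_simulated(x, y))  # should land close to the exact value
```

The two numbers should agree up to simulation noise, which is all the "chop it into pieces" step claims.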
20. P(Data): Big Issue for Bayesians
● Pregnant is binary which made this realllllly
easy
● In general, a lot of “tricks” are trying to either
– solve for P(Data): belief propagation in graphical models
– get around it: sampling (MCMC) or approximation (Variational Bayes)
21. Back to the key question:
P(Pregnant | Data)
= P(Data | Pregnant) P(Pregnant) /
[ P(Data | Pregnant) P(Pregnant) +
P(Data | Not Pregnant) P(Not Pregnant) ]
= 1 * X / [ 1 * X + Y * (1 - X) ]
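The last line of the derivation is easy to turn into a small function. A minimal sketch, where the function name and the inputs X = 0.2 and Y = 0.1 are hypothetical and only there to make the arithmetic concrete:

```python
def p_pregnant_given_late(x, y):
    """Slide 21 result: 1 * X / [1 * X + Y * (1 - X)].

    x: prior P(Pregnant = Yes)
    y: P(days delay >= 7 days | Not Pregnant)
    Undefined when both x and y are 0 (the data would be impossible).
    """
    return x / (x + y * (1.0 - x))


x, y = 0.2, 0.1  # hypothetical inputs
print(round(p_pregnant_given_late(x, y), 3))  # 0.714
```

Note that with Y = 1 (a late period is just as likely either way) the function returns the prior unchanged: the data carries no information.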
23. Can add more data...almost for free!
● Notice “Data” is quite general:
– Can add pregnancy test strip results to further
update beliefs!
– Treat the previous posterior as the new prior, then
update the same way!
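Chaining updates can be sketched as below. Everything numeric here is an assumption for illustration: the prior of 0.2, Y = 0.1, and a test strip with 99% sensitivity and a 5% false positive rate. The sketch also assumes the late period and the test result are independent once pregnancy status is known; without that, the two updates cannot simply be multiplied together.

```python
def update(prior, p_data_given_yes, p_data_given_no):
    """One Bayesian update for a yes/no hypothesis."""
    numerator = p_data_given_yes * prior
    return numerator / (numerator + p_data_given_no * (1.0 - prior))


belief = 0.2                         # hypothetical prior P(Pregnant)
belief = update(belief, 1.0, 0.1)    # data 1: late period
belief = update(belief, 0.99, 0.05)  # data 2: positive test strip (assumed rates)
print(round(belief, 3))

# Same answer as a single update on both pieces of data at once.
batch = update(0.2, 1.0 * 0.99, 0.1 * 0.05)
print(round(batch, 3))
```

Updating one piece of data at a time and updating on everything at once give the same posterior, which is why adding data is "almost for free".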
24. So.....what's the big deal?
● Your belief matters a lot!
– Your prior changes the outcome
● Your prior and my prior may be different
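To see how much the prior matters, here is a sketch with two people who observe the same late period but start from different beliefs. The priors (0.01 and 0.5) and Y = 0.1 are hypothetical.

```python
def posterior(prior, y):
    """P(Pregnant | late period), with P(late | Pregnant) = 1 as in the slides."""
    return prior / (prior + y * (1.0 - prior))


y = 0.1  # hypothetical P(late period | Not Pregnant)
priors = {"you": 0.01, "me": 0.5}  # hypothetical priors
for who, prior in priors.items():
    print(f"{who}: prior={prior:.2f} -> posterior={posterior(prior, y):.3f}")
```

Same data, same formula, but the posteriors come out at roughly 9% versus 91%.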
27. What “could” a bad Frequentist do?
● Calculate the p-value for you, i.e.
P(Late period | Not Pregnant)
● Declare that you're Pregnant if this is <= 5%
● The declaration has a 5% false positive rate and
some false negative rate
● Issue: Not as relevant to you! Rates are for all
the people using this procedure...not specific
to your case!
28. “not as relevant”?
● There's no consideration of your specific case
– There was no P(Pregnant) in the p-value
calculation
– You could be really sure that you're not
pregnant....doesn't change the calculation!
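The contrast can be sketched in a few lines. The decision rule below is the "bad Frequentist" procedure from the slides; the function names, the value 0.04 for P(Late period | Not Pregnant), and the two priors are hypothetical.

```python
def frequentist_decision(p_late_given_not_pregnant, alpha=0.05):
    """Declare Pregnant when P(Late period | Not Pregnant) <= alpha.

    No prior appears anywhere in this rule.
    """
    return p_late_given_not_pregnant <= alpha


def bayesian_posterior(prior, p_late_given_not_pregnant):
    """P(Pregnant | late period), with P(late | Pregnant) = 1 as in the slides."""
    return prior / (prior + p_late_given_not_pregnant * (1.0 - prior))


y = 0.04  # hypothetical P(Late period | Not Pregnant)
for prior in (0.001, 0.3):  # "really sure I'm not" vs. "quite possible"
    print(prior, frequentist_decision(y), round(bayesian_posterior(prior, y), 3))
```

The declaration is identical for both people, while the posterior moves from a few percent to above 90% depending on the prior.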
30. What would a Frequentist say?
● P(Pregnant) = 100% or 0%
– Fixed but unknown
– NOT uncertain
● …Not actually interested in a single event
– Probabilities are defined for repeated events
– Will not write down P(Pregnant | Data)
– For your one case, anything could be true
● Would say “Go talk to a doctor”
31. Key difference
● “Attitude”
– What can be a random variable?
● Bayesian: Uncertain events
● Frequentist: Repeatable events
33. Implications of this attitude
● Bayesian:
– Can incorporate prior knowledge easily
– Can update beliefs easily
– Can tackle a wider class of problems since
probabilities are “beliefs”
– Must specify a model
– Your belief can be different from mine
● Our answers will be different!
35. Implications of this attitude
● Frequentist:
– Probabilities are more objective
– Harder to cheat
– Has non-parametric methods
– Focused on repeatable events
– Prior knowledge is introduced in an ad hoc
way
– Usually needs lots of data
36. In the end...
● Frequentists and Bayesians use the same rules
of probability
● The difference is in the set-up: “What is random?”
– Bayesians: uncertainty in knowledge
– Frequentists: intrinsic randomness
37. Take Home
● Different problems should use different
approaches!
– Both schools are awesome!
● Be aware of what you're using and be
consistent!