1) Data analysis can yield misleading conclusions when analysts read observed correlations as causal without considering the underlying causal structure. Correlation does not necessarily imply causation.
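2) When assessing a treatment or intervention, control for confounders (common causes of the treatment and the outcome) but not for mediators (variables on the causal path from treatment to outcome); controlling for a mediator distorts the estimate of the treatment's total effect.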
3) Conditioning on a collider (a common effect of two other variables) can introduce a spurious correlation between its causes, so colliders should not be controlled for.
4) Techniques from causal inference and probabilistic graphical models, like do-calculus, can help data scientists properly reason about and interpret causal effects and the results of interventions based on observational data.
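The collider effect in point 3 can be checked by exact enumeration. The sketch below uses a hypothetical toy model that is not taken from the slides: `X` and `Y` are independent fair coins, and the collider `Z` is 1 whenever at least one of them is 1. All names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy model: X and Y are independent fair coins,
# and the collider Z is 1 whenever at least one of them is 1.
outcomes = [(x, y, x | y) for x, y in product((0, 1), repeat=2)]
weight = Fraction(1, 4)  # each (x, y) pair is equally likely


def prob(event, given=lambda o: True):
    """P(event | given), computed by enumerating the four outcomes."""
    denom = sum(weight for o in outcomes if given(o))
    numer = sum(weight for o in outcomes if given(o) and event(o))
    return numer / denom


# Marginally, knowing Y tells us nothing about X.
p_x = prob(lambda o: o[0] == 1)
p_x_given_y = prob(lambda o: o[0] == 1, lambda o: o[1] == 1)

# After conditioning on the collider Z = 1, X and Y become dependent.
p_x_given_z = prob(lambda o: o[0] == 1, lambda o: o[2] == 1)
p_x_given_z_y = prob(lambda o: o[0] == 1, lambda o: o[2] == 1 and o[1] == 1)

print(p_x, p_x_given_y)            # 1/2 1/2
print(p_x_given_z, p_x_given_z_y)  # 2/3 1/2
```

Within the `Z = 1` group, learning that `Y = 1` lowers the probability of `X = 1` from 2/3 to 1/2, even though the two coins are independent by construction.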
5. Which Treatment is Better?
               Treatment A      Treatment B
Small Stones   93% (81/87)      87% (234/270)
Large Stones   73% (192/263)    69% (55/80)
Both           78% (273/350)    83% (289/350)
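The reversal in this table can be verified directly from the counts. The following is a minimal sketch that recomputes the per-stratum and pooled success rates; the dictionary layout and function names are illustrative choices, not part of the slides.

```python
from fractions import Fraction

# (successes, patients) copied from the table above.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, patients):
    return Fraction(successes, patients)


def pooled_rate(treatment):
    """Success rate for a treatment with both stone sizes pooled."""
    successes = sum(s for s, _ in counts[treatment].values())
    patients = sum(n for _, n in counts[treatment].values())
    return rate(successes, patients)


# Treatment A wins within each stone size ...
a_wins_each_stratum = all(
    rate(*counts["A"][size]) > rate(*counts["B"][size])
    for size in ("small", "large")
)
# ... yet Treatment B wins once the strata are pooled.
b_wins_pooled = pooled_rate("B") > pooled_rate("A")

print(a_wins_each_stratum, b_wins_pooled)  # True True
```

The counts also show where the reversal comes from: 263 of the 350 Treatment A patients had large stones, against only 80 of the 350 Treatment B patients, so A's pooled rate is dominated by the harder cases.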
6. Now, Which Treatment is Better?
                Treatment A      Treatment B
Low Blood pH    93% (81/87)      87% (234/270)
High Blood pH   73% (192/263)    69% (55/80)
Both            78% (273/350)    83% (289/350)
8. … Except in Both Halves of the Data?
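# Regress rear axle ratio (drat) on carburetor count (carb), cars with 6 or more cylinders
lm(drat ~ carb, data = mtcars[mtcars$cyl >= 6, ])
# Same regression, cars with fewer than 6 cylinders
lm(drat ~ carb, data = mtcars[mtcars$cyl < 6, ])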
r = 0.52 r = 0.22
13. Controlling Confounders is Right
               Treatment A      Treatment B
Small Stones   93% (81/87)      87% (234/270)
Large Stones   73% (192/263)    69% (55/80)
Both           78% (273/350)    83% (289/350)
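One way to make "controlling for the confounder" concrete is the standard adjustment formula, P(success | do(treatment)) = Σ_z P(success | treatment, z) P(z), where z is stone size. The sketch below applies it to the counts in the table, assuming stone size is the only confounder; that assumption and the helper names are not stated in the slides.

```python
from fractions import Fraction

# (successes, patients) per treatment and stone size, from the table above.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

total_patients = sum(n for arm in counts.values() for _, n in arm.values())


def stratum_weight(size):
    """P(stone size), estimated from all patients regardless of treatment."""
    return Fraction(sum(counts[t][size][1] for t in counts), total_patients)


def adjusted_rate(treatment):
    """Sum over sizes of P(success | treatment, size) * P(size)."""
    return sum(
        Fraction(*counts[treatment][size]) * stratum_weight(size)
        for size in ("small", "large")
    )


adjusted = {t: adjusted_rate(t) for t in counts}
for t, r in adjusted.items():
    print(t, round(float(r), 3))
# A 0.833
# B 0.779
```

Under that assumption, weighting each stratum by its share of the whole population (rather than by how often each treatment was assigned to it) puts Treatment A ahead, in agreement with the per-stratum rows.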
20. do-Calculus
P(Y|X) ≠ P(Y|do(X))
Just because it’s more often raining when you walk outside with an umbrella …
… doesn’t mean that you carrying an umbrella makes it more likely to be raining.
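The umbrella example can be worked through with a two-variable model in which rain influences umbrella use and not the other way round. The probabilities below are hypothetical, chosen only to illustrate the gap between P(Y|X) and P(Y|do(X)); they do not come from the slides.

```python
from fractions import Fraction

# Hypothetical numbers, chosen only for illustration.
p_rain = Fraction(3, 10)                      # P(rain)
p_umbrella_given = {True: Fraction(9, 10),    # P(umbrella | rain)
                    False: Fraction(1, 10)}   # P(umbrella | no rain)

# Observational: P(rain | umbrella) via Bayes' rule.
p_rain_and_umbrella = p_rain * p_umbrella_given[True]
p_umbrella = p_rain_and_umbrella + (1 - p_rain) * p_umbrella_given[False]
p_rain_given_umbrella = p_rain_and_umbrella / p_umbrella

# Interventional: do(umbrella) cuts the arrow rain -> umbrella, so forcing
# the umbrella leaves the weather at its marginal distribution.
p_rain_do_umbrella = p_rain

print(p_rain_given_umbrella, p_rain_do_umbrella)  # 27/34 3/10
```

Seeing an umbrella raises the probability of rain from 3/10 to 27/34 in this model, while forcing someone to carry one leaves it at 3/10.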
22.
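\begin{align*}
P(C \mid \mathrm{do}(S))
  &= \sum_{t} P(C \mid \mathrm{do}(S), t)\, P(t \mid \mathrm{do}(S)) \\
  &= \sum_{t} P(C \mid \mathrm{do}(S), \mathrm{do}(t))\, P(t \mid \mathrm{do}(S)) \\
  &= \sum_{t} P(C \mid \mathrm{do}(S), \mathrm{do}(t))\, P(t \mid S) \\
  &= \sum_{t} P(C \mid \mathrm{do}(t))\, P(t \mid S) \\
  &= \sum_{s'} \sum_{t} P(C \mid \mathrm{do}(t), s')\, P(s' \mid \mathrm{do}(t))\, P(t \mid S) \\
  &= \sum_{s'} \sum_{t} P(C \mid t, s')\, P(s' \mid \mathrm{do}(t))\, P(t \mid S) \\
  &= \sum_{s'} \sum_{t} P(C \mid t, s')\, P(s')\, P(t \mid S)
\end{align*}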
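The last line of the derivation contains no do-operators, so it can be evaluated from observational data alone. The sketch below checks this numerically on a hypothetical binary model that is assumed here and not taken from the slides: a hidden variable `U` influences both `S` and `C`, and `S` affects `C` only through `t`. All probability tables are made-up illustration values.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical model (not from the slides): hidden U -> S, U -> C, S -> t -> C.
p_u = {0: F(1, 2), 1: F(1, 2)}                       # P(U = u)
p_s1_given_u = {0: F(1, 5), 1: F(4, 5)}              # P(S = 1 | u)
p_t1_given_s = {0: F(1, 10), 1: F(9, 10)}            # P(t = 1 | s)
p_c1_given_tu = {(0, 0): F(1, 10), (0, 1): F(3, 10),
                 (1, 0): F(1, 2), (1, 1): F(4, 5)}   # P(C = 1 | t, u)


def bern(p_one, value):
    """Probability that a binary variable with P(1) = p_one equals value."""
    return p_one if value == 1 else 1 - p_one


# Observational joint P(s, t, c): U is summed out because it is unobserved.
joint = {}
for u, s, t, c in product((0, 1), repeat=4):
    p = (p_u[u] * bern(p_s1_given_u[u], s)
         * bern(p_t1_given_s[s], t) * bern(p_c1_given_tu[(t, u)], c))
    joint[(s, t, c)] = joint.get((s, t, c), 0) + p


def marg(s=None, t=None, c=None):
    """Sum the observational joint over every variable left as None."""
    return sum(p for (s_, t_, c_), p in joint.items()
               if s in (None, s_) and t in (None, t_) and c in (None, c_))


def front_door(s):
    """Last line of the derivation: sum_{s'} sum_t P(C|t,s') P(s') P(t|S)."""
    return sum(
        (marg(s=s2, t=t, c=1) / marg(s=s2, t=t))   # P(C = 1 | t, s')
        * marg(s=s2)                                # P(s')
        * (marg(s=s, t=t) / marg(s=s))              # P(t | S = s)
        for s2, t in product((0, 1), repeat=2)
    )


def truth(s):
    """P(C = 1 | do(S = s)) computed from the full model, using the hidden U."""
    return sum(p_u[u] * bern(p_t1_given_s[s], t) * p_c1_given_tu[(t, u)]
               for u, t in product((0, 1), repeat=2))


for s in (0, 1):
    print(s, front_door(s), truth(s), float(marg(s=s, c=1) / marg(s=s)))
```

In this model `front_door(s)` matches `truth(s)` exactly for both values of `s`, even though `front_door` never reads `U`, while the plain conditional P(C = 1 | S = s) in the last printed column differs from both because of the hidden common cause.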
23. Conclusion
• Must bring causal info to data for proper interpretation
• Know common causal pitfalls!
• PGMs help reason about causal effects
• Do-calculus can clarify reasoning about intervention