2. Lecture 8 Review – Proportions and confidence intervals
• Calculation and interpretation of:
– sample proportion
– 95% confidence interval for population proportion
• Calculation and interpretation of:
– difference in sample proportions
– 95% confidence interval for difference in population proportions
3. Single proportion – Inference
• Estimated proportion of vivax malaria (p) = 15/100 = 0.15
• Standard error of p:
$\mathrm{s.e.}(p) = \sqrt{\dfrac{p(1-p)}{n}} = \sqrt{\dfrac{0.15(1-0.15)}{100}} = 0.036$
• 95% Confidence interval for π (population proportion)
– Lower limit = p - 1.96×s.e.(p) = 0.079
– Upper limit = p + 1.96×s.e.(p) = 0.221
Interpretation:
“We are 95% confident that the population proportion (π) of people with vivax malaria is between 0.079 and 0.221 (or between 7.9% and 22.1%).”
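The calculation on this slide can be sketched in a few lines of Python. This is only an illustration under the slide's large-sample formula; the function name `proportion_ci` is an assumption introduced here, not part of the lecture.

```python
import math

def proportion_ci(d, n, z=1.96):
    """Return (p, s.e.(p), lower, upper) for a single sample proportion d/n.

    Uses s.e.(p) = sqrt(p(1 - p)/n) and the interval p ± z × s.e.(p).
    """
    p = d / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se, p - z * se, p + z * se

# Vivax malaria example from the slide: 15 cases out of 100 people.
p, se, lower, upper = proportion_ci(15, 100)
print(f"p = {p:.2f}, s.e.(p) = {se:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

Because the code keeps the unrounded standard error, the limits can differ in the third decimal place from the slide's 0.079 and 0.221, which appear to use s.e.(p) rounded to 0.036.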
4. Comparing two proportions
2×2 table
• Proportion of all subjects experiencing outcome, p = d/n
• Proportion in exposed group, p1 = d1/n1
• Proportion in unexposed group, p0 = d0/n0
                       With outcome   Without outcome   Total
                       (diseased)     (disease-free)
Exposed (group 1)      d1             h1                n1
Unexposed (group 0)    d0             h0                n0
Total                  d              h                 n
5. Comparing two proportions
Example – TBM trial
Death during 9 months post start of treatment
Treatment group            Yes              No     Total
Dexamethasone (group 1)    87 (p1=0.318)    187    274
Placebo (group 0)          112 (p0=0.413)   159    271
Total                      199              346    545
6. Comparing two proportions - Inference
Example:- TBM trial
Estimate of difference in population proportions
= p1-p0 = -0.095
s.e.(p1-p0) = 0.041
95% CI for difference in population proportions (π1-π0):
-0.095 ± 1.96×0.041
-0.175 up to -0.015 OR -17.5% up to -1.5%
Interpretation:-
“We are 95% confident that the difference in population proportions is
between -17.5% (dexamethasone reduces the proportion of deaths by a
large amount) and -1.5% (dexamethasone marginally reduces the
proportion of deaths).”
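As a rough check of the figures above, the difference in proportions and its interval can be recomputed from the 2×2 counts. The sketch below assumes the usual large-sample standard error, s.e.(p1-p0) = sqrt(s.e.(p1)² + s.e.(p0)²); the function name `risk_difference_ci` is an illustrative assumption.

```python
import math

def risk_difference_ci(d1, n1, d0, n0, z=1.96):
    """Return (p1 - p0, s.e., lower, upper) for a difference in proportions."""
    p1, p0 = d1 / n1, d0 / n0
    # Variances of the two independent sample proportions add.
    se = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    diff = p1 - p0
    return diff, se, diff - z * se, diff + z * se

# TBM trial: 87/274 deaths on dexamethasone, 112/271 deaths on placebo.
diff, se, lower, upper = risk_difference_ci(87, 274, 112, 271)
print(f"difference = {diff:.3f}, s.e. = {se:.3f}, 95% CI = {lower:.3f} to {upper:.3f}")
```

With unrounded intermediate values the limits may differ from the hand calculation in the third decimal place.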
7. Comparing two proportions (absolute difference):-
Risk difference
Example:- TBM trial
Outcome measure: Death during nine months following start of treatment.
Dexamethasone
p1 (incidence risk) = d1/n1 = 87/274 = 0.318
Placebo
p0 (incidence risk) = d0/n0 = 112/271 = 0.413
p1 – p0 (risk difference) = 0.318 – 0.413 = -0.095 (or -9.5%)
8. Lecture 9 – Measures of association
• 2×2 table (RECAP)
• Measures of association
– Risk difference
– Risk ratio
– Odds ratio
• Calculation & interpretation of confidence interval for
each measure of association
9. 2×2 table
• Proportion of all subjects experiencing outcome, p = d/n
• Proportion in exposed group, p1 = d1/n1
• Proportion in unexposed group, p0 = d0/n0
                       With outcome   Without outcome   Total
                       (diseased)     (disease-free)
Exposed (group 1)      d1             h1                n1
Unexposed (group 0)    d0             h0                n0
Total                  d              h                 n
10. 2×2 table - Measures of association
• Different measures of association between outcome and exposure
• Can calculate confidence intervals and test statistics for each measure
Measure of effect              Formula
Risk difference (lecture 8)    p1-p0
Risk ratio (relative risk)     p1 / p0
Odds ratio                     (d1/h1) / (d0/h0)
11. 2×2 table – TBM trial example
Death during 9 months post start of treatment
Treatment group            Yes         No          Total       Incidence risk of death (p)   Odds of death
Dexamethasone (group 1)    87 (d1)     187 (h1)    274 (n1)    d1 / n1 = 0.318               d1 / h1 = 0.465
Placebo (group 0)          112 (d0)    159 (h0)    271 (n0)    d0 / n0 = 0.413               d0 / h0 = 0.704
Total                      199         346         545
12. 2×2 table – TBM trial example
• Risk difference = p1-p0 = 0.318 – 0.413 = -0.095 (or -9.5%)
• Risk ratio = p1/p0 = 0.318 / 0.413 = 0.77
• Odds ratio = (d1/h1) / (d0/h0) = 0.465 / 0.704 = 0.66
Death during 9 months post start of treatment
Treatment group            Yes         No          Total
Dexamethasone (group 1)    87 (d1)     187 (h1)    274 (n1)
Placebo (group 0)          112 (d0)    159 (h0)    271 (n0)
Total                      199         346         545
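The three measures can be computed directly from the four cell counts of the 2×2 table. The following is a minimal sketch; the function name `measures_of_association` and the dictionary keys are assumptions made for illustration.

```python
def measures_of_association(d1, h1, d0, h0):
    """Risk difference, risk ratio and odds ratio from 2x2 cell counts.

    d1, h1: exposed subjects with and without the outcome
    d0, h0: unexposed subjects with and without the outcome
    """
    n1, n0 = d1 + h1, d0 + h0
    p1, p0 = d1 / n1, d0 / n0
    return {
        "risk_difference": p1 - p0,
        "risk_ratio": p1 / p0,
        "odds_ratio": (d1 / h1) / (d0 / h0),
    }

# TBM trial counts: dexamethasone (group 1) versus placebo (group 0).
for name, value in measures_of_association(87, 187, 112, 159).items():
    print(f"{name}: {value:.3f}")
```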
13. 2×2 table – Calculation of Odds Ratio
Commonly given formula for odds ratio
(a×d) / (b×c) = (87×159) / (187×112) = 0.66
Death during 9 months post start of treatment
Treatment group            Yes        No         Total
Dexamethasone (group 1)    87 (a)     187 (b)    274 (n1)
Placebo (group 0)          112 (c)    159 (d)    271 (n0)
Total                      199        346        545
14. 2×2 table – Calculation of Odds Ratio
Odds ratio for not dying
= (a×d) / (b×c) = (187×112) / (87×159) = 1.51 (=1/0.66)
Death during 9 months post start of treatment
Treatment group            No         Yes        Total
Dexamethasone (group 1)    187 (a)    87 (b)     274 (n1)
Placebo (group 0)          159 (c)    112 (d)    271 (n0)
Total                      346        199        545
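The last two slides show that swapping the outcome columns inverts the odds ratio. A small sketch of the cross-product formula makes this easy to verify; the function name `odds_ratio` is an illustrative assumption, and a, b, c, d follow the cell labels used on the slides.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio (a×d) / (b×c) for a 2x2 table laid out as
    [[a, b], [c, d]] with exposure groups in rows."""
    return (a * d) / (b * c)

# Outcome columns ordered Yes, No: odds ratio for dying.
or_death = odds_ratio(87, 187, 112, 159)
# Outcome columns ordered No, Yes: odds ratio for not dying.
or_survival = odds_ratio(187, 87, 159, 112)

print(f"OR for dying = {or_death:.2f}, OR for not dying = {or_survival:.2f}")
```

The two results are reciprocals of each other, so their product is 1.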
15. Differences in measures of association
• When there is no association between exposure and outcome,
– risk difference = 0
– risk ratio (RR) = 1
– odds ratio (OR) = 1
• Risk difference can be negative or positive
• RR & OR are always positive
• For rare outcomes, OR ≈ RR
• OR is always further from 1 than the corresponding RR (unless RR = 1)
– If RR > 1 then OR > RR
– If RR < 1 then OR < RR
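These properties can be checked numerically. The sketch below compares RR and OR for the TBM trial (a common outcome) and for two sets of made-up risks; the function name `rr_and_or` and the values marked hypothetical are assumptions chosen only for illustration.

```python
def rr_and_or(p1, p0):
    """Return (risk ratio, odds ratio) for risks p1 (exposed) and p0 (unexposed)."""
    rr = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    return rr, odds_ratio

# Common outcome, RR < 1: TBM trial risks (87/274 versus 112/271).
common = rr_and_or(87 / 274, 112 / 271)
# Common outcome, RR > 1: hypothetical risks of 40% versus 20%.
harmful = rr_and_or(0.4, 0.2)
# Rare outcome: hypothetical risks of 0.2% versus 0.1%.
rare = rr_and_or(0.002, 0.001)

for label, (rr, odds_ratio) in [("common, RR<1", common),
                                ("common, RR>1", harmful),
                                ("rare", rare)]:
    print(f"{label}: RR = {rr:.3f}, OR = {odds_ratio:.3f}")
```

In both common-outcome cases the OR lies further from 1 than the RR, while for the rare outcome the two are nearly equal.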
16. Interpretation of measures of association
• RR & OR < 1, associated with a reduced risk / odds (may be
protective)
– RR = 0.8 (reduced risk of 20%)
• RR & OR > 1, associated with an increased risk / odds
– RR = 1.2 (increased risk of 20%)
• RR & OR – the further the ratio is from 1, the stronger the association
between exposure and outcome (e.g. RR=3 indicates a stronger association than RR=2).
17. Inference
• Obtain a sample estimate, q, of the population parameter (e.g.
difference in proportions)
• REMEMBER different samples would give different estimates
of the population parameter (e.g. sample 1 q1, sample 2 q2,…)
• Derive:
– Standard error of q (i.e. s.e.(q))
– Confidence interval (i.e. q ± 1.96 × s.e.(q))
18. Ratios – Risk ratio (RR) or Odds ratio (OR)
• The usual confidence interval formula,
q ± (1.96×s.e.(q)), is problematic for ratios.
A ratio cannot be negative, but when q is close to zero and s.e.(q) is large,
the calculated lower limit of the confidence interval may be
negative.
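A tiny numerical illustration of the problem: the values of q and s.e.(q) below are hypothetical, chosen only to show how the symmetric formula can produce an impossible lower limit for a ratio. The function name `naive_ci` is also an assumption.

```python
def naive_ci(q, se, z=1.96):
    """Symmetric interval q ± z × s.e.(q) computed on the original scale."""
    return q - z * se, q + z * se

# Hypothetical ratio estimate and standard error (not from the TBM trial).
lower, upper = naive_ci(q=0.5, se=0.4)
print(f"naive 95% CI: {lower:.3f} to {upper:.3f}")
```

Here the lower limit falls below zero, even though a risk ratio or odds ratio can never be negative.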
19. Risk ratio (RR)
• Solution: calculate the logarithm of RR (logeRR) and its standard error
$\mathrm{s.e.}(\log_e \mathrm{RR}) = \sqrt{\dfrac{1}{d_1} - \dfrac{1}{n_1} + \dfrac{1}{d_0} - \dfrac{1}{n_0}}$
95% CI for logarithm of RR :-
Upper limit = logeRR + 1.96×s.e.(logeRR)
Lower limit = logeRR - 1.96×s.e.(logeRR)
95% CI for Risk ratio (RR):-
Upper limit = antilog (upper limit of CI for logeRR)
Lower limit = antilog (lower limit of CI for logeRR)
20. Log to the base e & antiloge (exponential)
• ‘Natural logarithms’ use the mathematical constant, e, as
their base, e=2.71828……
1618 – Scottish Mathematician: John Napier
• antiloge(x) = exp(x) = e^x
e^1 = 2.718      loge(2.718) = 1      antiloge(1) = 2.718
e^2 = 7.389      loge(7.389) = 2      antiloge(2) = 7.389
e^3 = 20.086     loge(20.086) = 3     antiloge(3) = 20.086
10^1 = 10        log10(10) = 1        antilog10(1) = 10
10^2 = 100       log10(100) = 2       antilog10(2) = 100
10^3 = 1000      log10(1000) = 3      antilog10(3) = 1000
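The relationship between logarithms and antilogs can be tried out with Python's standard `math` module. The helper name `antilog_e` is an assumption introduced here; it simply wraps `math.exp`.

```python
import math

def antilog_e(x):
    """Antilog to the base e, i.e. exp(x) = e**x."""
    return math.exp(x)

for x in (1, 2, 3):
    value = antilog_e(x)
    # Taking the natural log of e**x recovers x (up to floating-point error).
    print(f"e^{x} = {value:.3f}, loge({value:.3f}) = {math.log(value):.0f}")

# The same idea with base 10.
for x in (1, 2, 3):
    print(f"10^{x} = {10 ** x}, log10({10 ** x}) = {math.log10(10 ** x):.0f}")
```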
21. 2×2 table – TBM trial example
Risk ratio = p1/p0 = 0.318 / 0.413 = 0.77
logeRR = loge(0.77) = -0.26
$\mathrm{s.e.}(\log_e \mathrm{RR}) = \sqrt{\dfrac{1}{87} - \dfrac{1}{274} + \dfrac{1}{112} - \dfrac{1}{271}} = 0.11$
95% CI for logeRR: -0.48 up to -0.04
95% CI for RR: exp(-0.48) up to exp(-0.04) = 0.62 up to 0.96
Death during 9 months post start of treatment
Treatment group            Yes         No          Total
Dexamethasone (group 1)    87 (d1)     187 (h1)    274 (n1)
Placebo (group 0)          112 (d0)    159 (h0)    271 (n0)
Total                      199         346         545
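The steps on this slide (take logs, build the interval, then exponentiate) can be sketched as follows. The function name `risk_ratio_ci` is an illustrative assumption; the standard error formula is the one given on the risk ratio slide.

```python
import math

def risk_ratio_ci(d1, n1, d0, n0, z=1.96):
    """Return (RR, s.e.(logeRR), lower, upper) using the log-scale method."""
    rr = (d1 / n1) / (d0 / n0)
    se_log = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
    log_rr = math.log(rr)
    # Build the interval on the log scale, then antilog (exp) each limit.
    lower = math.exp(log_rr - z * se_log)
    upper = math.exp(log_rr + z * se_log)
    return rr, se_log, lower, upper

# TBM trial: 87/274 deaths on dexamethasone, 112/271 deaths on placebo.
rr, se_log, lower, upper = risk_ratio_ci(87, 274, 112, 271)
print(f"RR = {rr:.2f}, s.e.(logeRR) = {se_log:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

Because the limits are exponentials, they are always positive, which avoids the negative lower limit that the symmetric formula can give.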
22. Using Stata
. csi 87 112 187 159

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |        87         112  |        199
        Noncases |       187         159  |        346
-----------------+------------------------+------------
           Total |       274         271  |        545
                 |                        |
            Risk |  .3175182    .4132841  |   .3651376
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.0957659       |   -.1762352   -.0152966
      Risk ratio |         .7682808       |    .6139856    .9613505
 Prev. frac. ex. |         .2317192       |    .0386495    .3860144
 Prev. frac. pop |         .1164974       |
                 +-------------------------------------------------
                               chi2(1) =     5.39  Pr>chi2 = 0.0202

Remember the warning about how the table is presented: Stata requires
presentation with outcome by rows and exposure by columns.
Results are close to those obtained by hand.
23. 2×2 table – TBM trial example
Interpretation:
Dexamethasone was associated with an estimated 23% decrease in the
risk of death (estimated RR=0.77) during the 9 months after the
start of treatment.
We are 95% confident that the population risk ratio lies between
0.62 (decreased risk of 38%) and 0.96 (decreased risk of 4%).
Death during 9 months post start of treatment
Treatment group            Yes         No          Total
Dexamethasone (group 1)    87 (d1)     187 (h1)    274 (n1)
Placebo (group 0)          112 (d0)    159 (h0)    271 (n0)
Total                      199         346         545
24. 95% confidence interval for Odds ratio (OR)
• Calculate the logarithm of OR (logeOR) and its standard error (Woolf’s formula):
$\mathrm{s.e.}(\log_e \mathrm{OR}) = \sqrt{\dfrac{1}{d_1} + \dfrac{1}{h_1} + \dfrac{1}{d_0} + \dfrac{1}{h_0}}$
95% CI for logarithm of OR :-
Upper limit = logeOR + 1.96×s.e.(logeOR)
Lower limit = logeOR - 1.96×s.e.(logeOR)
95% CI for Odds ratio (OR):-
Upper limit = exp (upper limit of CI for logeOR)
Lower limit = exp (lower limit of CI for logeOR)
25. 2×2 table – TBM trial example
Odds Ratio = (d1/h1) / (d0/h0) = 0.66
logeOR = loge(0.66) = -0.42
$\mathrm{s.e.}(\log_e \mathrm{OR}) = \sqrt{\dfrac{1}{87} + \dfrac{1}{187} + \dfrac{1}{112} + \dfrac{1}{159}} = 0.18$
95% CI for logeOR: -0.77 up to -0.07
95% CI for OR: exp(-0.77) up to exp(-0.07) = 0.46 up to 0.93
Death during 9 months post start of treatment
Treatment group            Yes         No          Total
Dexamethasone (group 1)    87 (d1)     187 (h1)    274 (n1)
Placebo (group 0)          112 (d0)    159 (h0)    271 (n0)
Total                      199         346         545
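The same log-scale recipe applies to the odds ratio, using Woolf's standard error. The sketch below is illustrative only; the function name `odds_ratio_ci_woolf` is an assumption.

```python
import math

def odds_ratio_ci_woolf(d1, h1, d0, h0, z=1.96):
    """Return (OR, s.e.(logeOR), lower, upper) using Woolf's formula."""
    odds_ratio = (d1 / h1) / (d0 / h0)
    se_log = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)
    log_or = math.log(odds_ratio)
    # Interval on the log scale, then exponentiate each limit.
    lower = math.exp(log_or - z * se_log)
    upper = math.exp(log_or + z * se_log)
    return odds_ratio, se_log, lower, upper

# TBM trial cell counts: d1=87, h1=187, d0=112, h0=159.
odds_ratio, se_log, lower, upper = odds_ratio_ci_woolf(87, 187, 112, 159)
print(f"OR = {odds_ratio:.2f}, s.e.(logeOR) = {se_log:.2f}, "
      f"95% CI = {lower:.3f} to {upper:.3f}")
```

Working with unrounded intermediate values, the limits can differ slightly in the second decimal place from the hand calculation on the slide, which rounds logeOR and its standard error first.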
26. Using Stata
. csi 87 112 187 159, or

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |        87         112  |        199
        Noncases |       187         159  |        346
-----------------+------------------------+------------
           Total |       274         271  |        545
                 |                        |
            Risk |  .3175182    .4132841  |   .3651376
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.0957659       |   -.1762352   -.0152966
      Risk ratio |         .7682808       |    .6139856    .9613505
 Prev. frac. ex. |         .2317192       |    .0386495    .3860144
 Prev. frac. pop |         .1164974       |
      Odds ratio |         .6604756       |    .4652544     .937623 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =     5.39  Pr>chi2 = 0.0202

For the OR, Stata by default uses Cornfield’s formula for the confidence
interval. You can request the Woolf formula with: csi 87 112 187 159, or woolf
27. Test statistic for Risk ratio (RR) & Odds ratio (OR)
Null hypothesis:-
population RR = 1 or population OR = 1
• For risk ratio:-
$z = \dfrac{\log_e \mathrm{RR} - \log_e 1}{\mathrm{s.e.}(\log_e \mathrm{RR})} = \dfrac{-0.26 - 0}{0.11} = -2.4$
2-sided p-value = 0.016
28. Test statistic for Risk ratio (RR) & Odds ratio (OR)
Null hypothesis:-
population RR = 1 or population OR = 1
• For odds ratio:-
$z = \dfrac{\log_e \mathrm{OR} - \log_e 1}{\mathrm{s.e.}(\log_e \mathrm{OR})} = \dfrac{-0.42 - 0}{0.18} = -2.3$
2-sided p-value = 0.021
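The z statistics and two-sided p-values on these two slides can be reproduced with the standard normal distribution. The sketch below uses only Python's `math` module; the function names `z_statistic` and `two_sided_p` are assumptions introduced for illustration.

```python
import math

def z_statistic(log_estimate, se_log, log_null=0.0):
    """z = (log estimate - log null value) / s.e.; loge(1) = 0 under the null."""
    return (log_estimate - log_null) / se_log

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution.

    Uses the identity P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))

# Rounded values from the slides: (logeRR, s.e.) and (logeOR, s.e.).
for label, log_est, se in [("RR", -0.26, 0.11), ("OR", -0.42, 0.18)]:
    z = round(z_statistic(log_est, se), 1)  # slides quote z to one decimal place
    print(f"{label}: z = {z:.1f}, two-sided p = {two_sided_p(z):.3f}")
```

The slides round logeRR, logeOR and their standard errors before dividing, so a z statistic recomputed from unrounded values may differ somewhat from the figures shown.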
29. Comparing the outcome measure of two exposure groups
(groups 1 & 0)
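Numerical outcome variable:
– Population parameter: µ1−µ0
– Estimate of population parameter from sample: $\bar{x}_1 - \bar{x}_0$
– Standard error: $\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0) = \sqrt{\mathrm{s.e.}(\bar{x}_1)^2 + \mathrm{s.e.}(\bar{x}_0)^2}$
– 95% confidence interval for population parameter: $(\bar{x}_1 - \bar{x}_0) \pm 1.96 \times \mathrm{s.e.}(\bar{x}_1 - \bar{x}_0)$
Categorical outcome variable:
– Population parameter: π1−π0
– Estimate of population parameter from sample: $p_1 - p_0$
– Standard error: $\mathrm{s.e.}(p_1 - p_0) = \sqrt{\mathrm{s.e.}(p_1)^2 + \mathrm{s.e.}(p_0)^2}$
– 95% confidence interval for population parameter: $(p_1 - p_0) \pm 1.96 \times \mathrm{s.e.}(p_1 - p_0)$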
30. Comparing the outcome measure of two exposure groups
(groups 1 & 0)
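Categorical outcome variable, population risk ratio:
– Estimate of population parameter from sample: p1/p0
– Standard error of loge(parameter): $\mathrm{s.e.}(\log_e \mathrm{RR}) = \sqrt{\dfrac{1}{d_1} - \dfrac{1}{n_1} + \dfrac{1}{d_0} - \dfrac{1}{n_0}}$
– 95% confidence interval of loge(population parameter): $\log_e \mathrm{RR} \pm 1.96 \times \mathrm{s.e.}(\log_e \mathrm{RR})$
Categorical outcome variable, population odds ratio:
– Estimate of population parameter from sample: (d1/h1) / (d0/h0)
– Standard error of loge(parameter): $\mathrm{s.e.}(\log_e \mathrm{OR}) = \sqrt{\dfrac{1}{d_1} + \dfrac{1}{h_1} + \dfrac{1}{d_0} + \dfrac{1}{h_0}}$
– 95% confidence interval of loge(population parameter): $\log_e \mathrm{OR} \pm 1.96 \times \mathrm{s.e.}(\log_e \mathrm{OR})$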
31. Calculation of p-values for comparing two groups
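Numerical outcome variable:
– Population parameter: µ1−µ0
– Population parameter under null hypothesis: µ1−µ0=0
– Test statistic: $z = \dfrac{\bar{x}_1 - \bar{x}_0}{\mathrm{s.e.}(\bar{x}_1 - \bar{x}_0)}$
Categorical outcome variable, difference in proportions:
– Population parameter: π1-π0
– Population parameter under null hypothesis: π1-π0=0
– Test statistic: $z = \dfrac{p_1 - p_0}{\mathrm{s.e.}(p_1 - p_0)}$
Categorical outcome variable, risk ratio:
– Population parameter: Population risk ratio
– Population parameter under null hypothesis: Population risk ratio=1
– Test statistic: $z = \dfrac{\log_e(\mathrm{RR})}{\mathrm{s.e.}(\log_e(\mathrm{RR}))}$
Categorical outcome variable, odds ratio:
– Population parameter: Population odds ratio
– Population parameter under null hypothesis: Population odds ratio=1
– Test statistic: $z = \dfrac{\log_e(\mathrm{OR})}{\mathrm{s.e.}(\log_e(\mathrm{OR}))}$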
32. Comparing the outcome measure of two exposure groups
(TBM trial: dexamethasone versus placebo)
Outcome variable   Population parameter under       Estimate of population           95% confidence interval for   Two-sided
– data type        null hypothesis                  parameter from sample            population parameter          p-value
Categorical        Population risk difference = 0   p1-p0 = -0.095                   -0.175, -0.015                0.020
Categorical        Population risk ratio = 1        p1/p0 = 0.77                     0.62, 0.96                    0.016
Categorical        Population odds ratio = 1        (d1/h1) / (d0/h0) = 0.66         0.46, 0.93                    0.021
33. Using Stata – p-value calculated using Chi-squared test
. csi 87 112 187 159, or

                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+------------
           Cases |        87         112  |        199
        Noncases |       187         159  |        346
-----------------+------------------------+------------
           Total |       274         271  |        545
                 |                        |
            Risk |  .3175182    .4132841  |   .3651376
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |        -.0957659       |   -.1762352   -.0152966
      Risk ratio |         .7682808       |    .6139856    .9613505
 Prev. frac. ex. |         .2317192       |    .0386495    .3860144
 Prev. frac. pop |         .1164974       |
      Odds ratio |         .6604756       |    .4652544     .937623 (Cornfield)
                 +-------------------------------------------------
                               chi2(1) =     5.39  Pr>chi2 = 0.0202

For the OR, Stata by default uses Cornfield’s formula for the confidence
interval. You can request the Woolf formula with: csi 87 112 187 159, or woolf
34. Lecture 9 - Objectives
• Calculate and interpret the measures of
association and their confidence intervals
and test statistics
– Risk difference
– Risk ratio
– Odds ratio