Power Analysis and Sample Size Determination

AK Dhamija

Researchers differ
A researcher conducted a study comparing the
effect of an intervention vs placebo on reducing
body weight, and found 5 kg reduction among
the intervention group with P=0.01.

Another researcher conducted a similar study
comparing the effect of the same intervention vs
the same placebo on reducing body weight, and
found the same 5 kg reduction with the
intervention group but could not claim that the
intervention was effective because P=0.35.

Agenda
Power
Sample Size Calculations
Examples
Changes in the basic formulae
Flaws in Statements

Power is Affected by…
Variation in the outcome (σ²)
↓ σ² → power ↑
Significance level (α)
↑α → power ↑
Difference (effect) to be detected (δ)
↑δ → power ↑
One-tailed vs. two-tailed tests
Power is greater in one-tailed tests than in
comparable two-tailed tests

Power Changes
2n = 32, 2 sample test, 81% power, δ=2,
σ = 2, α = 0.05, 2-sided test
Variance/Standard deviation
σ: 2 → 1 Power: 81% → 99.99%
σ: 2 → 3 Power: 81% → 47%
Significance level (α)
α : 0.05 → 0.01 Power: 81% → 60%
α : 0.05 → 0.10 Power: 81% → 88%

Power Changes
2n = 32, 2 sample test, 81% power, δ=2, σ = 2,
α = 0.05, 2-sided test
Difference to be detected (δ)
δ : 2 → 1 Power: 81% → 29%
δ : 2 → 3 Power: 81% → 99%
Total sample size (2n)
2n: 32 → 64 Power: 81% → 98%
2n: 32 → 28 Power: 81% → 75%
One-tailed vs. two-tailed tests
2-sided → 1-sided Power: 81% → 88%
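
The power figures above can be approximated with a normal (z) approximation for a two-sample comparison with equal group sizes. The sketch below is an illustration under that assumption, not necessarily the calculation behind the slide; the function name `two_sample_power` and its parameters are introduced here, and a t-based calculation would give slightly different values.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(n_per_group, delta, sigma, alpha=0.05, two_sided=True):
    """Approximate power of a two-sample z-test with equal group sizes."""
    z = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_crit = z.inv_cdf(1 - tail)
    # Difference to detect, measured in standard errors of the mean difference
    shift = delta / (sigma * sqrt(2 / n_per_group))
    # Ignores the negligible chance of rejecting in the wrong direction
    return z.cdf(shift - z_crit)


# Baseline: 2n = 32 (16 per group), delta = 2, sigma = 2, alpha = 0.05, 2-sided
scenarios = {
    "baseline": dict(n_per_group=16, delta=2, sigma=2),
    "sigma 2 -> 3": dict(n_per_group=16, delta=2, sigma=3),
    "delta 2 -> 1": dict(n_per_group=16, delta=1, sigma=2),
    "2n 32 -> 64": dict(n_per_group=32, delta=2, sigma=2),
    "one-sided": dict(n_per_group=16, delta=2, sigma=2, two_sided=False),
}
for label, kwargs in scenarios.items():
    print(f"{label}: power = {two_sample_power(**kwargs):.2f}")
```

Changing one argument at a time shows the direction of each effect: larger δ, larger n, larger α or a one-sided test raise power, while larger σ lowers it.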

Power Formula
Depends on study design
Not hard, but can be VERY algebra intensive
May want to use a computer program or consult a statistician

How Big a Sample We Need?
Fundamental research question
Should be addressed after determining the primary
objective and study design
Too Few Patients in a clinical study
– May fail to detect a clinically important difference
Too Many
– Involve extra patients
– Therapy may have risks
– Cost more

How Big a Sample We Need?
Fundamental research question
How Big?
18
180
1,800
18,000
180,000

Sample Size Formula Information
Variables of interest
type of data e.g. continuous, categorical
Desired power
Desired significance level
Effect/difference of clinical importance
Standard deviations of continuous outcome
variables
One or two-sided tests

Sample Size & Study Design
Randomized controlled trial (RCT)
Block/stratified-block randomized trial
Equivalence trial
Non-randomized intervention study
Observational study
Prevalence study
Measuring sensitivity and specificity

Sample Size & Data Structure
Paired data
Repeated measures
Groups of equal sizes
Hierarchical data

Sample Size
Non-randomized studies looking for differences or
associations
require larger sample to allow adjustment for confounding factors
Absolute sample size is of interest
surveys sometimes take % of population approach
Study’s primary outcome is the variable you do the sample
size calculation for
If secondary outcome variables considered important make sure
sample size is sufficient
Increase the ‘real’ sample size to reflect loss to follow up,
expected response rate, lack of compliance, etc.
State explicitly how the calculated sample size was increased to reach the recruitment target
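
One common way to make that link explicit is to divide the calculated sample size by the fraction of subjects expected to remain in the study. The sketch below assumes this simple adjustment; the function name `inflate_for_attrition` and the example numbers (200 subjects, 15% loss) are hypothetical and not taken from the slides.

```python
from math import ceil


def inflate_for_attrition(n_required, expected_loss):
    """Recruitment target so that about n_required subjects remain.

    expected_loss is the anticipated fraction lost to follow-up,
    non-response or non-compliance (0 <= expected_loss < 1).
    """
    if not 0 <= expected_loss < 1:
        raise ValueError("expected_loss must be in [0, 1)")
    return ceil(n_required / (1 - expected_loss))


# Hypothetical example: 200 subjects needed, 15% expected loss to follow-up
print(inflate_for_attrition(200, 0.15))
```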

Steps
Step 1. Define Primary Objective
To see if feeding milk to 5 year old kids enhances
growth.
Step 2. Study Design
5 yr olds → Extra Milk Diet vs. Normal Milk Diet
Outcome: height (cm)
Step 3. Define clinically significant difference
one wishes to detect
Difference (∆) of 0.5 cm

Steps
Step 4. Define degree of certainty of finding this
difference
beta (β) or type II error : The probability of NOT detecting a
significant difference when there really is one.

Risk of a false-negative finding, i.e. the risk of declaring no significant
difference in height between the milk diets when a difference
really does exist.

Set at ≤ 20%

Power of the Test: Probability of detecting a predefined clinically
significant difference.

Power = (1- β) = 1 -20% = 80%

Steps
Step 5. Define significance level
Alpha (α) or type I error: The probability of detecting a significant difference
when the treatments are really equally effective

Risk of a false-positive finding

Set at 5% :
One has a 5% (1 in 20) chance of declaring a significant difference
between the milk diets when in fact they are really equal.

We are willing to accept that 1 time out of 20 we will produce a false
positive finding

For the Milk Study
Type I error (α) = 0.05
Type II error (β) = 0.20
Power = (1- β) = 0.80
Clinically significant diff (∆) = 0.5 cm
Measure of variation (SD) = 2.0 cm
– Exists in literature or “Guesstimate”

Formula

$$N = \frac{2\,(SD)^2 \times f(\alpha, \beta)}{\Delta^2} = \frac{2\,(2)^2 \times 7.9}{0.5^2} = 252.8 \text{ (each group)}$$

f(α, β)
              Beta
Alpha    0.05   0.10   0.20   0.50
0.10     10.8    8.6    6.2    2.7
0.05     13.0   10.5    7.9    3.8
0.02     15.8   13.0   10.0    5.4
0.01     17.8   14.9   11.7    6.6
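
The tabulated f(α, β) values agree with (Z₁₋α/₂ + Z₁₋β)² for a two-sided test, so the table can be regenerated rather than looked up. The sketch below assumes that definition; the function names `f` and `n_per_group` are introduced here for illustration. Because it uses the unrounded multiplier (about 7.85 instead of the tabulated 7.9), the milk study result comes out slightly below 252.8.

```python
from statistics import NormalDist


def f(alpha, beta):
    """Multiplier (z_{1-alpha/2} + z_{1-beta})^2 for a two-sided test."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(1 - beta)) ** 2


def n_per_group(sd, delta, alpha=0.05, beta=0.20):
    """Subjects per group for comparing two means (normal approximation)."""
    return 2 * sd**2 * f(alpha, beta) / delta**2


# Rebuild the table: rows are alpha, columns are beta
betas = [0.05, 0.10, 0.20, 0.50]
for alpha in [0.10, 0.05, 0.02, 0.01]:
    row = "  ".join(f"{f(alpha, beta):5.1f}" for beta in betas)
    print(f"alpha={alpha:.2f}  {row}")

# Milk study: SD = 2.0 cm, difference = 0.5 cm, alpha = 0.05, beta = 0.20
print(f"n per group: {n_per_group(2.0, 0.5):.1f}")
```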

Simple Method
Nomogram
Standardized difference
= smallest medically relevant diff / estimated standard deviation
= 0.5/2.0 = 0.25
Assumptions:
1. 2 sample comparison only
2. Same number of subjects per group
3. Variable is a continuous measure that is normally distributed

Nomogram reading: about 500 subjects in total

1 sample test
Study Objective : Study effect of new sleep aid
Compare baseline sleep time with sleep time after taking the medication for one week
Two-sided test, α = 0.05, power = 1-β = 90%
Difference (δ) = 1 hr (4 hours of sleep to 5)
Standard deviation (σ) = 2 hr

$$n = \frac{(Z_{1-\alpha/2} + Z_{1-\beta})^2 \, \sigma^2}{\delta^2} = \frac{(1.960 + 1.282)^2 \, 2^2}{1^2} = 42.04 \;\Rightarrow\; n = 43$$

Changing δ from 1 hr to 2 hr takes n from 43 to 11

$$n = \frac{(1.960 + 1.282)^2 \, 2^2}{2^2} = 10.51 \;\Rightarrow\; n = 11$$

1 sample test
Changing power from 90% to 80% takes n from 11 to 8
(Small sample: start thinking about using the t distribution)

$$n = \frac{(1.960 + 0.841)^2 \, 2^2}{2^2} = 7.85 \;\Rightarrow\; n = 8$$

Changing the standard deviation from 2 to 3 takes n from 8 to 18

$$n = \frac{(1.960 + 0.841)^2 \, 3^2}{2^2} = 17.65 \;\Rightarrow\; n = 18$$
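
The four one-sample scenarios for the sleep aid study can be checked with a few lines of code. This is a minimal sketch using the normal approximation; the function name `one_sample_n` is introduced here for illustration. It uses unrounded z-values, so the intermediate results differ from the hand calculation in the second decimal place, but the rounded-up sample sizes are the same.

```python
from math import ceil
from statistics import NormalDist


def one_sample_n(delta, sigma, alpha=0.05, power=0.90):
    """Sample size for a one-sample, two-sided z-test (before rounding up)."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(power)) ** 2 * sigma**2 / delta**2


scenarios = [
    ("original design", dict(delta=1, sigma=2, power=0.90)),
    ("delta 1 -> 2", dict(delta=2, sigma=2, power=0.90)),
    ("power 90% -> 80%", dict(delta=2, sigma=2, power=0.80)),
    ("sigma 2 -> 3", dict(delta=2, sigma=3, power=0.80)),
]
for label, kwargs in scenarios:
    n = one_sample_n(**kwargs)
    print(f"{label}: n = {n:.2f}, round up to {ceil(n)}")
```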

Sleep Aid Example: 2 Sample
Original design (2-sided test, α = 0.05, 1-β = 90%, σ = 2 hr, δ = 1 hr)
Two sample randomized parallel design
Needed 43 in the one-sample design
The 2-sample design needs twice that, in each group!
4 times as many people are needed in this design

$$n = \frac{2\,(Z_{1-\alpha/2} + Z_{1-\beta})^2 \, \sigma^2}{\delta^2} = \frac{2\,(1.960 + 1.282)^2 \, 2^2}{1^2} = 84.1 \;\Rightarrow\; 85 \text{ per group, } 170 \text{ total!}$$

Changing δ from 1 hr to 2 hr takes the total from 170 to 44

$$n = \frac{2\,(1.960 + 1.282)^2 \, 2^2}{2^2} = 21.02 \;\Rightarrow\; 22 \text{ per group, } 44 \text{ total}$$

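Sleep Aid Example: 2 Sample
Changing power from 90% to 80% takes the total from 44 to 32

$$n = \frac{2\,(1.960 + 0.841)^2 \, 2^2}{2^2} = 15.69 \;\Rightarrow\; 16 \text{ per group, } 32 \text{ total}$$

Changing the standard deviation from 2 to 3 takes the total from 32 to 72

$$n = \frac{2\,(1.960 + 0.841)^2 \, 3^2}{2^2} = 35.31 \;\Rightarrow\; 36 \text{ per group, } 72 \text{ total}$$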
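
The two-sample versions of the sleep aid scenarios follow the same pattern with an extra factor of 2 per group. The sketch below again assumes the normal approximation and equal group sizes; `two_sample_n_per_group` is a name introduced here for illustration.

```python
from math import ceil
from statistics import NormalDist


def two_sample_n_per_group(delta, sigma, alpha=0.05, power=0.90):
    """Per-group size for a two-sample, two-sided z-test (before rounding up)."""
    z = NormalDist().inv_cdf
    return 2 * (z(1 - alpha / 2) + z(power)) ** 2 * sigma**2 / delta**2


scenarios = [
    ("original design", dict(delta=1, sigma=2, power=0.90)),
    ("delta 1 -> 2", dict(delta=2, sigma=2, power=0.90)),
    ("power 90% -> 80%", dict(delta=2, sigma=2, power=0.80)),
    ("sigma 2 -> 3", dict(delta=2, sigma=3, power=0.80)),
]
for label, kwargs in scenarios:
    n = ceil(two_sample_n_per_group(**kwargs))
    print(f"{label}: {n} per group, {2 * n} total")
```

Rounding up is applied per group before doubling, which is why the totals are always even.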

Summary

Changes in the detectable difference have
HUGE impacts on sample size
20 point difference → 25 patients/group

Changes in α, β, σ, the number of samples, and whether the test is
1- or 2-sided can all have a large impact on your
sample size calculation

Matched Pair Designs
Similar to 1-sample formula
Means (paired t-test)
Mean difference from paired data
Variance of differences
Proportions
Based on discordant pairs
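
For the means case, the one-sample formula carries over with the standard deviation of the within-pair differences in place of σ. Written out as a sketch under the normal approximation, where σ_d (SD of the paired differences) and δ_d (mean paired difference to be detected) are symbols introduced here and do not appear in the slides:

```latex
n_{\text{pairs}} = \frac{(Z_{1-\alpha/2} + Z_{1-\beta})^2 \, \sigma_d^2}{\delta_d^2}
```

The result counts pairs, not individual measurements.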

Difference in Proportion
Study Objective
To increase survival by 5% with a new cancer drug
P1 = % survival (std) = 85%
P2 = % survival (new) = 90%
Power = 90%

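$$N = \frac{P_1(100 - P_1) + P_2(100 - P_2)}{(P_2 - P_1)^2} \times f(\alpha, \beta) = \frac{85 \times 15 + 90 \times 10}{5^2} \times 10.5 = 913.5 \text{ (each group)}$$

with f(α, β) = 10.5 for α = 0.05 and β = 0.10

= 1827 Total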
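
The same calculation can be sketched in code, with percentages entered as on the slide. The function name `n_per_group_proportions` is introduced here for illustration, and α = 0.05 (two-sided) is an assumption consistent with the multiplier of 10.5. Using the unrounded multiplier gives a result slightly above 913.5.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_proportions(p1, p2, alpha=0.05, beta=0.10):
    """Per-group size for comparing two percentages (normal approximation).

    p1 and p2 are percentages on the 0-100 scale.
    """
    z = NormalDist().inv_cdf
    multiplier = (z(1 - alpha / 2) + z(1 - beta)) ** 2
    return (p1 * (100 - p1) + p2 * (100 - p2)) * multiplier / (p2 - p1) ** 2


# Survival 85% (standard) vs 90% (new drug), power 90%
n = n_per_group_proportions(85, 90)
print(f"n per group: {n:.1f}, round up to {ceil(n)}")
```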

A very large study has the power to demonstrate statistical
significance for very small, even clinically inconsequential
differences.

Changes in basic formulae
Unequal #s in Each Group
Ratio of cases to controls
Use if want λ patients randomized to the treatment arm for every patient randomized
to the placebo arm

Take no more than 4-5 controls/case

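$$n_2 = \lambda n_1 \quad (\lambda \text{ controls for every case})$$

$$n_1 = \frac{(Z_{1-\alpha/2} + Z_{1-\beta})^2 \,(\sigma_1^2 + \sigma_2^2/\lambda)}{\delta^2}$$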
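
A short sketch of the unequal-allocation calculation, assuming the form n₁ = (Z₁₋α/₂ + Z₁₋β)² (σ₁² + σ₂²/λ) / δ² with n₂ = λn₁. The function name `unequal_allocation_n` is introduced here, and the example reuses the sleep aid values (σ₁ = σ₂ = 2 hr, δ = 1 hr, power 90%) purely for illustration.

```python
from math import ceil
from statistics import NormalDist


def unequal_allocation_n(delta, sigma1, sigma2, lam, alpha=0.05, power=0.90):
    """Group sizes (n1, n2) when n2 = lam * n1 (normal approximation)."""
    z = NormalDist().inv_cdf
    multiplier = (z(1 - alpha / 2) + z(power)) ** 2
    n1 = ceil(multiplier * (sigma1**2 + sigma2**2 / lam) / delta**2)
    n2 = ceil(lam * n1)
    return n1, n2


for lam in (1, 2, 4):
    n1, n2 = unequal_allocation_n(delta=1, sigma1=2, sigma2=2, lam=lam)
    print(f"lambda={lam}: n1={n1}, n2={n2}, total={n1 + n2}")
```

With λ = 1 this reduces to the equal-groups result. Larger λ shrinks n₁ but increases the total, with diminishing returns as λ grows.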

# of Covariates & # of Subjects
At least 10 subjects for every variable investigated
In logistic regression
No general justification
This is stability, not power
Peduzzi et al., (1985) biased regression coefficients and
variance estimates

Principal component analysis (PCA) (Thorndike
1978 p 184): N ≥ 10m + 50 or even N ≥ m² + 50

Balanced Designs: Easier
Equal numbers in two groups is the easiest
to handle
If you have more than two groups, still,
equal sample sizes easiest
Complicated design = simulations
Done by the statistician

Multiple Comparisons
If you have 4 groups
All 2 way comparisons of means
6 different tests
Bonferroni: divide α by # of tests
0.025/6 ≈ 0.0042
High-throughput laboratory tests
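
The count of pairwise tests and the Bonferroni-adjusted level can be computed directly. This is a minimal sketch; the function name `bonferroni_pairwise` is introduced here for illustration, and the starting level of 0.025 is the one used in the example above.

```python
from math import comb


def bonferroni_pairwise(alpha, n_groups):
    """Number of 2-way comparisons and the per-test alpha after Bonferroni."""
    n_tests = comb(n_groups, 2)  # all unordered pairs of groups
    return n_tests, alpha / n_tests


n_tests, per_test_alpha = bonferroni_pairwise(0.025, 4)
print(f"{n_tests} tests, per-test alpha = {per_test_alpha:.4f}")
```

The number of pairs grows quadratically with the number of groups, so the per-test level shrinks quickly, which is what makes the high-throughput setting demanding.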

Flaws in Statements
"A previous study in this area recruited 150 subjects and found highly
significant results (p=0.014), and therefore a similar sample size should
be sufficient here."
Previous studies may have been 'lucky' to find significant results, due to random
sampling variation.
"Sample sizes are not provided because there is no prior information on
which to base them."
Find previously published information
Conduct small pre-study
If a very preliminary pilot study, sample size calculations not usually necessary

No prior information on standard deviations
Give the size of difference that may be detected in terms of number of standard
deviations
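
When no standard deviation is available, the two-sample formula can be rearranged to give the detectable difference in SD units for a planned group size. The sketch below assumes a two-sided test, equal groups and the normal approximation; the function name `detectable_sd_units` and the example group sizes are introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def detectable_sd_units(n_per_group, alpha=0.05, power=0.80):
    """Smallest detectable difference, expressed in standard deviations."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(power)) * sqrt(2 / n_per_group)


# Hypothetical group sizes
for n in (16, 64, 250):
    print(f"n per group = {n}: detectable difference = "
          f"{detectable_sd_units(n):.2f} SD")
```

Quadrupling the group size halves the detectable difference.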

Roadmap
1. Do a sample size calculation before you start
collecting data
2. Collect data
3. Perform statistical test : if p value < 0.05, declare
statistical significance
4. Consider clinical significance by looking at the size of
the difference
