1. ADLT 673: Teaching as Scholarship in Medical Education
Monday, March 31, 2014
An Overview of Quantitative
Data Analysis
2. Outline of Today’s Class
Analytic Methods
Summary Measures
Hypothesis Testing
Statistical Methodologies
Group Discussion
Sample Size Determination
Group Discussion
Additional Resources
3. Analytic Methods: Summary Measures
Representative Measures
Reflect the most “typical” or “average” data value.
Continuous Measurements:
Mean (Average), Median and Mode
Categorical Measurements:
Frequencies and Proportions
4. Analytic Methods: Summary Measures
Measures of Variability
Reflect how much the values differ from one another.
Continuous Measurements:
Standard deviation, range, interquartile range
Categorical Measurements:
None that are meaningful (sorry!)
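The summary measures on the last two slides can be computed in a few lines. This is a minimal sketch using Python's standard library; the exam scores and pass/fail outcomes are hypothetical values chosen for illustration, not course data.

```python
import statistics
from collections import Counter

# Hypothetical exam scores (continuous measurement); illustrative only.
scores = [72, 85, 85, 90, 64, 78, 85, 93, 70, 88]

# Representative measures
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of variability
sd = statistics.stdev(scores)               # sample standard deviation (n - 1)
value_range = max(scores) - min(scores)
q1, _, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1                               # interquartile range

# Hypothetical pass/fail outcomes (categorical measurement); illustrative only.
outcomes = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "pass"]
frequencies = Counter(outcomes)
proportions = {k: v / len(outcomes) for k, v in frequencies.items()}

print(mean, median, mode, round(sd, 1), value_range, iqr)
print(dict(frequencies), proportions)
```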
7. Analytic Methods: Summary Measures
Measures of Association
Continuous Measures: Correlation Coefficient (ρ): -1 ≤ ρ ≤ 1
Correlations close to 1 indicate two measurements are highly
predictive and “track” with one another.
Correlations close to -1 indicate two measurements are highly
predictive and have inverse relationship.
Correlations close to 0 indicate little association.
Categorical Measures: Odds Ratio (OR): 0 < OR < ∞
OR greater than 1 indicates outcome (e.g., passed test) more likely
in test group than in control.
OR less than 1 indicates outcome less likely in test group than in
control.
OR ≈ 1 indicates little difference in outcomes between groups.
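Both measures of association are simple to calculate by hand. The sketch below assumes hypothetical data (study hours vs. exam score, and a 2x2 table of pass/fail counts); the function names `pearson_r` and `odds_ratio` are illustrative, and `pearson_r` returns the sample estimate of ρ.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient (estimates rho)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def odds_ratio(test_yes, test_no, control_yes, control_no):
    """Odds of the outcome in the test group divided by odds in control."""
    return (test_yes / test_no) / (control_yes / control_no)

# Hypothetical continuous data: hours studied and exam score.
hours = [2, 4, 5, 7, 8, 10]
score = [60, 65, 70, 78, 82, 90]
r = pearson_r(hours, score)

# Hypothetical categorical data: 40 of 50 passed in the test group,
# 30 of 50 passed in the control group.
or_estimate = odds_ratio(40, 10, 30, 20)

print(round(r, 3), round(or_estimate, 2))
```

Here the correlation is close to 1 (scores track with hours), and the OR is greater than 1 (passing is more likely in the test group).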
8. Analytic Methods: Hypothesis Testing
Most commonly accepted format of providing
quantitative evidence.
Consists of 5 Steps:
Translate research question into a set of testable hypotheses.
Select most appropriate statistical test for your hypotheses.
Collect your data.
Calculate test statistic and/or p-value.
Make Decision.
9. Analytic Methods: Hypothesis Testing
Translating Research Question into Testable Hypotheses
Identify parameter: population Mean (μ), proportion (p) or
difference (e.g., μ1-μ2).
Identify statements made about that parameter.
Should be in the form of: <, ≤, >, ≥, = or ≠
Write research question in symbolic form, and find its opposite.
Opposite of “<” is “≥”
Opposite of “>” is “≤”
Opposite of “=” is “≠”
10. Analytic Methods: Hypothesis Testing
Example:
Does an active learning curriculum improve the proportion of
students passing their board examinations compared to
students receiving the standard curriculum?
Parameter: proportion passing board exams p
Statement: p_active is greater than p_standard
Symbolic Form: p_active > p_standard or p_active – p_standard > 0
Opposite of Symbolic Form: p_active ≤ p_standard or p_active – p_standard ≤ 0
11. Analytic Methods: Hypothesis Testing
Testable Hypotheses:
Null Hypothesis: Statement that parameter (or difference) is equal
to zero.
Any statement in symbolic form with a ≤, ≥ or = is automatically the
null (note: we replace ≤ or ≥ with =).
Alternative Hypothesis: Statement that parameter (or difference) is
somehow different from zero.
Any statement in symbolic form with a <, > or ≠ is automatically the
alternative.
Example:
p_active – p_standard > 0 becomes the alternative (HA)
p_active – p_standard ≤ 0 becomes the null (H0)
12. Analytic Methods: Hypothesis Testing
Make Decision
Based on the statistical methodology you use, you get a p-value.
Probability of observing outcomes at least as extreme as the
data you actually observed, given the null hypothesis is true.
Plain English: If your intervention was ineffective, the p-value is the
probability of observing results at least as extreme as what you observed.
If this probability is high, then your results are consistent with the null
hypothesis, and you fail to reject the null (no evidence that the
intervention worked).
If this probability is low, then your results do not seem to match the
null hypothesis, and you reject the null (intervention likely
worked).
In practice: we compare p-value to significance level (α = 0.05).
If p-value ≥ 0.05, we fail to reject the null.
If p-value < 0.05, we reject the null.
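The decision rule can be illustrated with the active-learning example. This is a sketch only: the pass counts are hypothetical, and a normal-approximation z-test for two proportions is assumed here because it handles the one-sided alternative (p_active > p_standard) directly; it is not necessarily the test the course would recommend.

```python
import math
from statistics import NormalDist

# Hypothetical counts (illustrative only): students passing / enrolled.
pass_active, n_active = 88, 100
pass_standard, n_standard = 78, 100

p_active = pass_active / n_active
p_standard = pass_standard / n_standard

# Under H0 the two proportions are equal, so pool them to estimate the SE.
p_pooled = (pass_active + pass_standard) / (n_active + n_standard)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_active + 1 / n_standard))

# Test statistic and one-sided p-value for HA: p_active - p_standard > 0.
z = (p_active - p_standard) / se
p_value = 1 - NormalDist().cdf(z)

ALPHA = 0.05
decision = "reject H0" if p_value < ALPHA else "fail to reject H0"
print(round(z, 2), round(p_value, 3), decision)
```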
13. Analytic Methods: Continuous Data
# of Samples | # of Measurements: Single | Pre/Post | Repeated Measures
1 Sample | t-test | Paired t-test | Repeated Measures ANOVA (RMA) / Linear Mixed Model (LMM)*
2 Samples | Two-sample t-test | RMA / LMM* | RMA / LMM*
“k” Samples | Analysis of Variance (ANOVA) | RMA / LMM* | RMA / LMM*
Adjusting for Covariates: Multiple Linear Regression*, Analysis of Covariance (ANCOVA)*, Linear Mixed Models*
*Will likely require statistical assistance
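As one example from the table (2 samples, single measurement), the two-sample t statistic can be computed by hand. The sketch below assumes hypothetical scores and equal variances in both groups; because the standard library has no t distribution, it compares the statistic to the tabulated two-sided critical value for 10 degrees of freedom rather than computing a p-value.

```python
import math
import statistics

# Hypothetical exam scores for two independent groups (illustrative only).
group_a = [78, 85, 90, 72, 88, 81]
group_b = [70, 75, 80, 68, 77, 74]

def pooled_t_statistic(x, y):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    n1, n2 = len(x), len(y)
    var1, var2 = statistics.variance(x), statistics.variance(y)
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(x) - statistics.mean(y)) / se
    return t, n1 + n2 - 2

t_stat, df = pooled_t_statistic(group_a, group_b)

# Tabulated two-sided critical value for alpha = 0.05 with 10 df.
T_CRIT_DF10 = 2.228
reject_h0 = abs(t_stat) > T_CRIT_DF10
print(round(t_stat, 2), df, reject_h0)
```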
14. Analytic Methods: Categorical Data
# of Samples | # of Measurements: Single | Pre/Post | Repeated Measures
1 Sample | z-test | McNemar’s Test | Generalized Linear Mixed Models (GLMM)*
2 Samples | Chi-square Test | GLMM* | GLMM*
“k” Samples | Chi-square Test | GLMM* | GLMM*
Adjusting for Covariates: Multiple Logistic Regression*, Generalized Linear Mixed Models*
*Will likely require statistical assistance
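For the 2-sample, single-measurement cell, a chi-square test on a 2x2 table can be sketched with the standard library alone. The counts are hypothetical, no continuity correction is applied, and the p-value uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]].

    Rows are groups, columns are outcomes; no continuity correction.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df: P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts (illustrative only):
# active curriculum: 88 passed, 12 failed; standard curriculum: 78 passed, 22 failed.
chi2, p_value = chi_square_2x2(88, 12, 78, 22)
print(round(chi2, 2), round(p_value, 3))
```

Note that the chi-square test is two-sided (it tests "=" against "≠"), so its p-value is larger than that of a one-sided test on the same data.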
15. Analytic Methods: Group Discussion
Please break into groups by table
For the next 10-15 minutes, take turns discussing what
analytic approaches are appropriate for your proposed study.
What are your null and alternative hypotheses?
Is your outcome continuous or categorical?
How many groups and measurements?
If your study is qualitative, discuss how statistical
methodologies could be used (e.g. data summary,
association).
16. Sample Size Determination
As a general rule, larger sample sizes:
Lead to more representative samples
Lead to better estimation of parameters (e.g., representative
measures)
Provide estimators with lower variability
[Figure: panels for N=9, N=36 and N=100]
17. Sample Size Determination
Averages over 10,000 Simulations
Sample Size | Sample Mean | Sample Std. Dev. | Standard Error*
9 | 204.4 | 36.5 | 12.3
16 | 204.3 | 37.1 | 9.5
25 | 204.2 | 37.2 | 7.8
36 | 204.1 | 37.5 | 6.5
49 | 204.1 | 37.6 | 5.5
64 | 204.2 | 37.7 | 4.9
81 | 204.1 | 37.7 | 4.2
100 | 204.1 | 37.7 | 3.9
1000 | 204.1 | 37.7 | 1.2
*SE describes variability in the estimator, not in the sample data
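A simulation along these lines is easy to sketch. The version below is an assumption-laden illustration, not the code behind the table: it assumes a normal population with mean 204 and standard deviation 38 (values picked to resemble the table), uses 2,000 repetitions instead of 10,000, and measures the standard error as the spread of the simulated sample means.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Assumed population parameters (hypothetical, chosen to resemble the table).
POP_MEAN, POP_SD = 204.0, 38.0

def empirical_standard_error(n, reps=2000):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = []
    for _ in range(reps):
        sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)

results = {n: empirical_standard_error(n) for n in (9, 36, 100)}
for n, se in results.items():
    # Theoretical standard error of the mean is sigma / sqrt(n).
    print(n, round(se, 2), round(POP_SD / math.sqrt(n), 2))
```

The sample data stay just as variable at every sample size; only the estimator (the sample mean) becomes less variable as N grows.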
18. Sample Size Determination
Possible Decisions
Decision | True State: H0 is “True” | True State: HA is True
Reject H0 | Type I Error (α) | Correct Decision
Fail to Reject H0 | Correct Decision | Type II Error (β)
Power = 1 - β
19. Sample Size Determination
Determinants of Required Sample Size
Significance Level (α): probability of rejecting H0 when it is
true.
Power (1-β): probability of rejecting H0 when it is false.
These values are selected during design phase
α = 5%
1-β = 80% (sometimes 90%).
20. Sample Size Determination
Determinants of Required Sample Size
Measure of variability (usually standard deviation) inherent
in study population.
As data become more variable…
Standard error of Test statistic increases…
p-value increases…
Ability to reject H0 decreases…
Power decreases.
Controlling variability:
Better measurement methodology
Homogeneous samples
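The chain above (more variability, larger standard error, lower power) can be made concrete with a normal approximation. This is a rough sketch for a two-sided comparison of two group means; the function name `approx_power`, the difference of 10 points, and the 50 subjects per group are all hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Uses the normal approximation and ignores the negligible chance of
    rejecting in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    se = sd * math.sqrt(2 / n_per_group)   # standard error of the difference
    return z.cdf(delta / se - z_crit)

# Same difference (10 points) and sample size (50 per group),
# with increasingly variable data.
powers = {sd: approx_power(delta=10, sd=sd, n_per_group=50) for sd in (20, 30, 40)}
for sd, power in powers.items():
    print(sd, round(power, 2))
```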
21. Sample Size Determination
Determinants of Required Sample Size
Effect Size: smallest difference or change in outcome that you are
hoping to find
As difference you want to observe decreases…
Test statistic decreases…
p-value increases…
Ability to reject H0 decreases…
Power decreases.
Considerations:
Clinical significance
Clinical possibility (larger differences are easier to detect but less
likely to exist)
22. Sample Size Determination
Calculating Required Sample Size
Equations exist (involving α, β, variability and effect size) for
simple analytic methods (t-test, chi-square, etc.).
Advanced methods require professional assistance.
Where do you find variability and effect size?
Previous literature of similar populations
Pilot study
Guess-timates
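One such equation, for comparing two group means with a two-sided test, is n per group = 2(z_{1-α/2} + z_{1-β})² σ² / δ², where σ is the standard deviation and δ the effect size. The sketch below applies this normal-approximation formula; the function name and the input values are hypothetical, and a t-based calculation would give slightly larger numbers.

```python
import math
from statistics import NormalDist

def sample_size_per_group(delta, sd, alpha=0.05, power=0.80):
    """Subjects needed per group to detect a mean difference `delta`.

    Normal-approximation formula for a two-sided, two-sample comparison
    of means with equal group sizes and a common standard deviation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    return math.ceil(n)  # round up to a whole subject

# Hypothetical inputs: SD of 20 points (e.g., from a pilot study).
print(sample_size_per_group(delta=10, sd=20))              # 80% power
print(sample_size_per_group(delta=5, sd=20))               # smaller effect size
print(sample_size_per_group(delta=10, sd=20, power=0.90))  # 90% power
```

Halving the effect size roughly quadruples the required sample size, which is why the choice of a clinically meaningful difference matters so much.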
23. Sample Size Determination
What if required sample size is too large?
Consider a different outcome
Continuous measures generally require smaller sample sizes than
categorical measures
Consider multiple sections or sites
Will require more sophisticated analytic methods
Reconfigure study as a “pilot”
Emphasis switches from “hypothesis testing” to “estimation” and
“data summary”
Goal is to provide data summaries and estimate confidence intervals
Summaries can be used to power larger study
24. Sample Size Determination: Group Discussions
Please break into groups by table.
For the next 10-15 minutes, take turns discussing:
Whether you will be able to power your study.
Where to find information to perform power analysis.
Your options if you are unable to adequately power your study.
25. Additional Resources
VCU Department of Biostatistics
18 full-time faculty
Can assist with: study design, sample size
determination, interim and final analyses, dissemination
Grant funding (or prospects of funding) usually required.
BIOS 516 Biostatistical Consulting: graduate students available
for FREE consultations
Contact Russ Boyle (boyle@vcu.edu) and provide a protocol.
26. Additional Resources
VCU Center for Clinical and Translational Research
Research Incubator: study design, sample size determination,
and other resources (e.g. grant writing)
Contact: Pam Dillon (pmdillon@vcu.edu)
Biomedical Informatics: data management and storage (e.g.
REDCap)
Support requested online:
(http://www.cctr.vcu.edu/informatics/index.html)
27. Additional Resources
Textbooks (i.e., shameless plug):
Statistical Research Methods: A Guide for Non-Statisticians
Sabo and Boone, Springer, 2013
Available on the web ($45-$65):
http://www.springer.com/statistics//life+sciences,+medicine+%26+health/book/978-1-4614-8707-4
http://www.amazon.ca/Statistical-Research-Methods-Guide-Non-Statisticians/dp/1461487072