RSS 2013 - A re-analysis of the Cochrane Library data

Background
Methods
Results
Summary
A re-analysis of the Cochrane Library data
the dangers of unobserved heterogeneity in meta-analyses
Evan Kontopantelis¹,², David Springate¹,², David Reeves¹,²
¹ NIHR School for Primary Care Research, University of Manchester
² Centre for Biostatistics, Institute of Population Health, University of Manchester
RSS Newcastle, 3 Sep 2013

Outline
1 Background
2 Methods
Data
Analyses
3 Results
Method performance
Cochrane data
4 Summary

Meta-analysis
Synthesising existing evidence to answer clinical questions
Relatively young and dynamic field of research
Activity reflects the importance of MA and its potential to provide conclusive answers
Individual Patient Data meta-analysis is the best option, but it is costly and requires access to patient data
When the original data are unavailable, evidence is combined in a two-stage process:
retrieving the relevant summary effect statistics
using an MA model to calculate the overall effect estimate $\hat{\mu}$
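The second stage can be sketched as a simple inverse-variance fixed-effect pooling of the stage-one summaries. This is only an illustration: the function name `pool_fixed_effect` and the example numbers are hypothetical, and a normal-approximation 95% interval is assumed.

```python
import math

def pool_fixed_effect(effects, variances):
    """Inverse-variance fixed-effect pooling of study-level summary statistics."""
    weights = [1.0 / v for v in variances]           # precision weights
    mu_hat = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))               # standard error of the pooled effect
    ci = (mu_hat - 1.96 * se, mu_hat + 1.96 * se)    # assumed normal-approximation 95% CI
    return mu_hat, se, ci

# Hypothetical stage-one summaries (e.g. log odds ratios and their variances)
effects = [0.10, 0.30, 0.50]
variances = [0.04, 0.04, 0.04]
mu_hat, se, ci = pool_fixed_effect(effects, variances)
print(mu_hat, se, ci)
```

With equal variances the pooled estimate is just the plain average of the study effects; unequal variances shift it towards the more precise studies.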

Heterogeneity estimate
or between-study variance estimate $\hat{\tau}^2$
Model selection depends on the heterogeneity estimate
If heterogeneity is present, a random-effects approach is usually selected
But a fixed-effects model may be chosen for theoretical or practical reasons
Different approaches for combining study results
Inverse variance
Mantel-Haenszel
Peto

Meta-analysis methods
Inverse variance: fixed- or random-effects & continuous or
dichotomous outcome
DerSimonian-Laird, moment based estimator
Also: ML, REML, PL, Biggerstaff-Tweedie,
Follmann-Proschan, Sidik-Jonkman
Mantel-Haenszel: fixed-effect & dichotomous outcome
odds ratio, risk ratio or risk difference
different weighting scheme
low event numbers or small studies
Peto: fixed-effect & dichotomous outcome
Peto odds ratio
small intervention effects or very rare events
if $\hat{\tau}^2 > 0$, heterogeneity is only modelled through inverse variance weighting
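For the dichotomous-outcome methods listed above, a minimal sketch of the pooled Mantel-Haenszel and Peto odds ratios from 2x2 tables might look as follows. The function names and the example table are hypothetical, and zero cells or continuity corrections are deliberately not handled.

```python
import math

def mantel_haenszel_or(tables):
    """Pooled Mantel-Haenszel odds ratio.

    Each table is (a, b, c, d): events and non-events in the treatment arm,
    then events and non-events in the control arm.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

def peto_or(tables):
    """Pooled Peto odds ratio: exp(sum(O - E) / sum(V))."""
    o_minus_e = 0.0
    v_sum = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        n1, n2 = a + b, c + d            # arm sizes
        m1, m2 = a + c, b + d            # total events and non-events
        o_minus_e += a - n1 * m1 / n     # observed minus expected treatment events
        v_sum += n1 * n2 * m1 * m2 / (n * n * (n - 1))  # hypergeometric variance
    return math.exp(o_minus_e / v_sum)

# Hypothetical single trial: 10/100 events on treatment, 5/100 on control
tables = [(10, 90, 5, 95)]
print(mantel_haenszel_or(tables), peto_or(tables))
```

Both estimators are fixed-effect by construction, which is why any heterogeneity has to be brought in through inverse variance weighting.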

Random-effects (RE) models
Accurate $\hat{\tau}^2$ important performance driver
Large $\hat{\tau}^2$ leads to wider CIs
Zero $\hat{\tau}^2$ reduces all methods to fixed-effect
Three main approaches to estimating $\tau^2$:
DerSimonian-Laird ($\hat{\tau}^2_{DL}$)
Maximum Likelihood ($\hat{\tau}^2_{ML}$)
Restricted Maximum Likelihood ($\hat{\tau}^2_{REML}$)
All other RE methods use one of these but vary in estimating $\mu$
In practice, $\hat{\tau}^2_{DL}$ computed and heterogeneity quantified and reported using Cochran's Q, $I^2$ or $H^2$
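As an illustration of the quantities named on this slide, the sketch below computes Cochran's Q, $H^2$, $I^2$, the one-step DerSimonian-Laird moment estimate and the resulting random-effects pooled effect. The function name and return layout are hypothetical, and at least two studies are assumed.

```python
import math

def dersimonian_laird(effects, variances):
    """One-step DerSimonian-Laird estimate with Q, H^2, I^2 and the pooled effect.

    Assumes at least two studies (k >= 2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)                     # moment estimate, truncated at zero
    h2 = q / (k - 1)
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]           # random-effects weights
    mu_re = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    return {"Q": q, "H2": h2, "I2": i2, "tau2": tau2, "mu": mu_re, "se": se_re}

# Hypothetical example: three studies with equal within-study variance
result = dersimonian_laird([0.0, 0.5, 1.0], [0.04, 0.04, 0.04])
print(result)
```

Note that whenever Q falls below its degrees of freedom the estimate is truncated to zero and the random-effects weights collapse to the fixed-effect ones.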

Random or fixed?
two ‘schools’ of thought
Fixed-effect (FE)
‘what is the average result of trials conducted to date’?
assumption-free
Random-effects (RE)
‘what is the true treatment effect’?
various assumptions
normally distributed trial effects
varying treatment effect across populations although findings
limited since based on observed studies only
more conservative; findings potentially more generalisable
Researchers reassured when $\hat{\tau}^2 = 0$
FE often used when low heterogeneity detected

Cochrane Database of Systematic Reviews
Richest resource of meta-analyses in the world
Fifty-four active groups responsible for organising, advising
on and publishing systematic reviews
Authors obliged to use RevMan and submit the data and analyses file along with the review, contributing to the creation of a vast data resource
RevMan offers quite a few fixed-effect choices but only the DerSimonian-Laird random-effects method has been implemented to quantify and account for heterogeneity

The questions
Investigate the potential bias when assuming $\hat{\tau}^2 = 0$
Compare the performance of $\tau^2$ estimators in various scenarios
Present the distribution of $\hat{\tau}^2$ derived from all meta-analyses in the Cochrane Library
Present details on the number of meta-analysed studies, model selection and zero $\hat{\tau}^2$
Assess the sensitivity of results and conclusions using
alternative models

‘Real’ Data
Cochrane Database of Systematic Reviews
Python code to crawl Wiley website for RevMan files
Downloaded 3,845 relevant RevMan files (of 3,984 available in Aug 2012) and imported into Stata
Each file a systematic review (e.g. olanzapine for
schizophrenia)
Within each file, various research questions might have
been posed (e.g. vs placebo, vs typical antipsychotics)
investigated across various relevant outcomes? (e.g. no
clinical response, adverse events)
variability in intervention or outcome? (e.g. drug dosage,
types of adverse events)

Simulated Data
Generated effect size $Y_i$ and within-study variance estimates $\hat{\sigma}^2_i$ for each simulated meta-analysis study
Distribution for $\hat{\sigma}^2_i$ based on the $\chi^2_1$ distribution
For $Y_i$ (where $Y_i = \theta_i + e_i$)
assumed $e_i \sim N(0, \hat{\sigma}^2_i)$
various distributional scenarios for $\theta_i$: normal, moderate and extreme skew-normal, uniform, bimodal
three $\tau^2$ values to capture low ($I^2$ = 15.1%), medium ($I^2$ = 34.9%) and large ($I^2$ = 64.1%) heterogeneity
For each distributional assumption and $\tau^2$ value, 10,000 meta-analysis cases simulated
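A minimal sketch of how one simulated meta-analysis could be generated under the normal scenario is given below. The slide only says the within-study variances are based on the $\chi^2_1$ distribution; the 0.25 scaling and the truncation to [0.009, 0.6] used here are assumptions for illustration, as are the function name and the default overall effect of zero.

```python
import random

def simulate_meta_analysis(k, tau2, mu=0.0, rng=None):
    """Simulate k studies as (Y_i, sigma2_i) pairs with normally distributed true effects."""
    rng = rng or random.Random()
    studies = []
    while len(studies) < k:
        # Assumption: within-study variance is a scaled chi-square(1) draw,
        # rejected unless it falls inside the assumed bounds.
        var_i = 0.25 * rng.gauss(0.0, 1.0) ** 2
        if not 0.009 <= var_i <= 0.6:
            continue
        theta_i = rng.gauss(mu, tau2 ** 0.5)           # true study effect
        y_i = theta_i + rng.gauss(0.0, var_i ** 0.5)   # observed effect: theta_i + e_i
        studies.append((y_i, var_i))
    return studies

# Hypothetical use: one meta-analysis of 5 studies with tau^2 = 0.03
for y_i, var_i in simulate_meta_analysis(5, 0.03, rng=random.Random(42)):
    print(round(y_i, 3), round(var_i, 3))
```

The skew-normal, uniform and bimodal scenarios would only change the line that draws `theta_i`.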

Between-study variance estimators
frequentist, more or less
DerSimonian-Laird
one-step ($\hat{\tau}^2_{DL}$)
two-step ($\hat{\tau}^2_{DL2}$)
non-parametric bootstrap ($\hat{\tau}^2_{DLb}$)
minimum $\hat{\tau}^2_{DL} = 0.01$ assumed ($\hat{\tau}^2_{DLi}$)
Variance components
one-step ($\hat{\tau}^2_{VC}$)
two-step ($\hat{\tau}^2_{VC2}$)
Iterative
Maximum likelihood ($\hat{\tau}^2_{ML}$)
Restricted maximum likelihood ($\hat{\tau}^2_{REML}$)
Profile likelihood ($\hat{\tau}^2_{PL}$)
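The non-parametric bootstrap variant of DerSimonian-Laird can be sketched as follows. The slide does not define it; this sketch assumes that studies are resampled with replacement and that the bootstrap estimate is the mean of the one-step DL estimates across replicates. Function names, the number of replicates and the seed handling are all hypothetical.

```python
import random

def tau2_dl(effects, variances):
    """One-step DerSimonian-Laird estimate, truncated at zero (needs k >= 2)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    return max(0.0, (q - (k - 1)) / c)

def tau2_dl_bootstrap(effects, variances, n_boot=1000, seed=None):
    """Assumed DLb: mean DL estimate over studies resampled with replacement."""
    rng = random.Random(seed)
    k = len(effects)
    total = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(k) for _ in range(k)]     # resample study indices
        total += tau2_dl([effects[i] for i in idx],
                         [variances[i] for i in idx])
    return total / n_boot

# Hypothetical example
effects = [0.0, 0.5, 1.0]
variances = [0.04, 0.04, 0.04]
print(tau2_dl(effects, variances), tau2_dl_bootstrap(effects, variances, seed=1))
```

Because each replicate is truncated at zero before averaging, the bootstrap mean is zero only if every replicate is zero, which is one plausible reason it returns fewer zero estimates than the one-step version.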

Between-study variance estimators
Bayesian
Sidik and Jonkman model error variance
crude ratio estimates used as a-priori values ($\hat{\tau}^2_{MV}$)
VC estimator used to inform a-priori values with minimum value of 0.01 ($\hat{\tau}^2_{MVb}$)
Rukhin
prior between-study variance zero ($\hat{\tau}^2_{B0}$)
prior between-study variance non-zero and fixed ($\hat{\tau}^2_{BP}$)

Assessment criteria
in the 10,000 meta-analysis cases for each simulation scenario
Average bias & average absolute bias in $\hat{\tau}^2$
Percentage of zero $\hat{\tau}^2$
Coverage probability for the effect estimate
Type I error
proportion of 95% CIs for the overall effect estimate that
contain the true overall effect θi
Error-interval estimation for the effect
quantifies accuracy of estimation of the error-interval around the point estimate
ratio of estimated confidence interval for the effect, compared to the interval based on the true $\tau^2$
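Two of these criteria can be sketched in a few lines. The sketch assumes Wald-type 95% intervals, so the error-interval ratio reduces to a ratio of standard errors under the estimated and the true $\tau^2$; the function names and example values are hypothetical.

```python
import math

def coverage(estimates, std_errors, true_effect, z=1.96):
    """Proportion of assumed Wald-type 95% CIs that contain the true overall effect."""
    hits = sum(1 for m, s in zip(estimates, std_errors)
               if m - z * s <= true_effect <= m + z * s)
    return hits / len(estimates)

def error_interval_ratio(variances, tau2_hat, tau2_true):
    """Width of the CI under the estimated tau^2 relative to the width under the true tau^2."""
    se_hat = math.sqrt(1.0 / sum(1.0 / (v + tau2_hat) for v in variances))
    se_true = math.sqrt(1.0 / sum(1.0 / (v + tau2_true) for v in variances))
    return se_hat / se_true

# Hypothetical example: heterogeneity missed entirely (tau2_hat = 0)
print(error_interval_ratio([0.04, 0.09], tau2_hat=0.0, tau2_true=0.05))
```

A ratio below 1 means the reported interval is narrower than it would be if the true between-study variance were known.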

Which method?
Performance not affected much by effects’ distribution
Absolute bias
B0 (k ≤ 3) and ML
Coverage
MVa-BP (k ≤ 3) and DLb
Error-interval estimation and detecting heterogeneity
DLb
DLb seems best method overall, especially in detecting
heterogeneity
Detection appears to be a big problem: DL failed to detect high $\tau^2$ for over 50% of small meta-analyses
Bayesian methods did well for very small MAs

Meta-analyses numbers
Of the 3,845 files, 2,801 had identified relevant studies and contained any data
98,615 analyses extracted, 57,397 of which were meta-analyses
32,005 were overall meta-analyses
25,392 were subgroup meta-analyses
Estimation of an overall effect
Peto method in 4,340 (7.6%)
Mantel-Haenszel in 33,184 (57.8%)
Inverse variance in 19,873 (34.6%)
random-effects more prevalent in inverse variance methods
and larger meta-analyses
34% of meta-analyses were based on 2 studies (53% had k ≤ 3)!

Meta-analyses by Cochrane group
[Figure 1: All meta-analyses, including single-study and subgroup meta-analyses. Bar chart of the number of meta-analyses (axis 0 to 14,000) per Cochrane review group, listed from Pregnancy and Childbirth to Public Health; bars split into single study, fixed-effect model (by choice or necessity) and random-effects model.]

Meta-analyses by method choice
[Figure 2: Model selection by number of available studies (and % of random-effects meta-analyses)*. x-axis: number of studies in meta-analysis (2, 3, 4, 5, 6-9, 10+); count axis 0 to 12,000. Series: Peto (FE), Inverse Variance (FE), Inverse Variance (RE), Mantel-Haenszel (FE), Mantel-Haenszel (RE). Percentage labels across the six categories: 21%, 27%, 31%, 37%, 41%, 51% and 15%, 19%, 22%, 22%, 27%, 30%.]
*note that in many cases fixed-effect models were used when heterogeneity was detected

Comparing Cochrane data with simulated
To assess the validity of a homogeneity assumption we compared the percentage of zero $\hat{\tau}^2_{DL}$ in real and simulated data
Calculated $\hat{\tau}^2_{DL}$ for all Cochrane meta-analyses
Percentage of zero $\hat{\tau}^2_{DL}$ was lower in the real data than in the low and moderate heterogeneity simulated data
Suggests that mean true between-study variance is higher than generally assumed but fails to be detected, especially for small meta-analyses
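The kind of comparison described here can be illustrated with a small simulation that counts how often the one-step DerSimonian-Laird estimate is exactly zero for a given number of studies and true $\tau^2$. This is a sketch only: the scaled and truncated $\chi^2_1$ variance distribution, the normal effects, the number of replicates and the function names are assumptions, not the authors' code.

```python
import random

def tau2_dl(effects, variances):
    """One-step DerSimonian-Laird estimate, truncated at zero (needs k >= 2)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    return max(0.0, (q - (k - 1)) / c)

def zero_rate(k, tau2, n_sim=2000, seed=1):
    """Share of simulated k-study meta-analyses in which the DL estimate is zero."""
    rng = random.Random(seed)
    zeros = 0
    for _ in range(n_sim):
        variances = []
        while len(variances) < k:
            v = 0.25 * rng.gauss(0.0, 1.0) ** 2   # assumed scaled chi-square(1) variance
            if 0.009 <= v <= 0.6:                  # assumed truncation bounds
                variances.append(v)
        effects = [rng.gauss(0.0, tau2 ** 0.5) + rng.gauss(0.0, v ** 0.5)
                   for v in variances]
        if tau2_dl(effects, variances) == 0.0:
            zeros += 1
    return zeros / n_sim

# Zero-estimate rates for 2-study meta-analyses, without and with true heterogeneity
print(zero_rate(2, 0.0), zero_rate(2, 0.10))
```

Comparing such simulated rates with the observed rate in real data, for each number of studies, is what allows the true level of heterogeneity to be bracketed.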

Comparing Cochrane data with simulated
[Figure 3: Comparison of zero between-study variance estimates rates in the Cochrane library data and in simulations, using the DerSimonian-Laird method*. x-axis: number of studies in meta-analysis (2, 3, 4, 5, 10, 20); y-axis: % of zero $\hat{\tau}^2$ estimates with DerSimonian-Laird (0 to 100). Series: observed; true $\tau^2$ = 0.01; true $\tau^2$ = 0.03; true $\tau^2$ = 0.10.]
*Normal distribution of the effects assumed in the simulations (more extreme distributions produced similar results).

Reanalysing the Cochrane data
We applied all methods to all 57,397 meta-analyses to assess $\hat{\tau}^2$ distributions and the sensitivity of the results and conclusions
For simplicity, we discuss differences between standard methods and DLb; not a perfect method but one that performed well overall
As in simulations, DLb identifies more heterogeneous meta-analyses; $\hat{\tau}^2_{DL} = 0$ for 50.5% & $\hat{\tau}^2_{DLb} = 0$ for 31.2%
Distributions of $\hat{\tau}^2$ agree with the hypothesised $\chi^2_1$

Distributions for $\hat{\tau}^2$
[Figure: four histogram panels (Inverse Variance; Mantel-Haenszel; Peto & O-E; all methods), non-zero estimates only. x-axis: $\hat{\tau}^2$ estimate (0 to 0.5); y-axis: number of meta-analyses. Series: DL, DLb, VC, ML, REML, B0, VC2, DL2.]
Inverse Variance, zero estimates (%): DL=44.9, DLb=29.6, VC=48.9, REML=45.4, ML=62.2, B0=49.2, VC2=44.3, DL2=45.3; non-convergence (%): ML=0.7, REML=1.4
Mantel-Haenszel, zero estimates (%): ML=75.0, B0=59.6, VC2=53.9, DL2=55.5
Peto & O-E, zero estimates (%): ML=70.0, B0=54.8, VC2=49.6, DL2=51.0
All methods, zero estimates (%): ML=70.2, B0=55.6, VC2=50.2, DL2=51.6

Changes in results and conclusions
Inverse variance with DLb
when $\hat{\tau}^2_{DL} = 0$, conclusions change for 0.9% of analyses
when $\hat{\tau}^2_{DL} > 0$ and not ignored, conclusions change for 2.4% of analyses
when $\hat{\tau}^2_{DL} > 0$ but ignored, conclusions change for 19.1% of analyses
in the overwhelming majority of changes (0.8%, 2.3% and 19.0% respectively), effects stopped being statistically significant
Findings were similar for Mantel-Haenszel and Peto methods, although the validity of the inverse variance weighting in these (which is a prerequisite for the use of random-effects models) warrants further investigation

Findings
Methods often fail to detect $\tau^2$ in small MA
Even when $\hat{\tau}^2 > 0$, often ignored
Mean true heterogeneity higher than assumed or estimated; but standard method fails to detect it
Non-parametric DerSimonian-Laird bootstrap seems best method overall, especially in detecting heterogeneity
Bayesian estimators MVa (Sidik-Jonkman) and BP (Rukhin) performed very well when k ≤ 3
19-21% of statistical conclusions change, when $\hat{\tau}^2_{DL} > 0$ but ignored

Conclusions
Detecting and accurately estimating $\tau^2$ in a small MA is very difficult; yet for 53% of Cochrane MAs, k ≤ 3
$\hat{\tau}^2 = 0$ assumed to lead to a more reliable meta-analysis and high $\hat{\tau}^2$ is alarming and potentially prohibitive
Estimates of zero heterogeneity should also be a concern
since heterogeneity is likely present but undetected
Bootstrapped DL leads to a small improvement but
problem largely remains, especially for very small MAs
Caution against ignoring heterogeneity when detected
For full generalisability, random-effects essential?

Thank you!
Kontopantelis E, Springate D, Reeves D. A re-analysis of the Cochrane
Library data: the dangers of unobserved heterogeneity in
meta-analyses. PLoS ONE, 2013 July; 8(7): e69930.
doi:10.1371/journal.pone.0069930
This project is supported by the School for Primary Care Research
which is funded by the National Institute for Health Research (NIHR).
The views expressed are those of the author(s) and not necessarily
those of the NHS, the NIHR or the Department of Health.
Comments, suggestions: e.kontopantelis@manchester.ac.uk
