Symposium Data-driven risk-based decisions; PhD defense Tom Hueting
Wednesday 22 June 2022
Universiteit Twente
I present some issues in improving prediction models by updating them and by extending them with markers
1. June 22, 2022
Towards better performance of prediction models:
updating and extension with markers
Ewout W. Steyerberg, PhD
Professor of Clinical Biostatistics and
Medical Decision Making
Dept of Biomedical Data Sciences
Leiden University Medical Center
Thanks to many, including Ben van Calster, Leuven
2. Key question: how to improve prediction models?
1. Better development + validation
a) Sample size
b) Methods
2. Updating of existing models
a) Local settings
b) Continuous learning
3. Extension with markers
4. Machine learning
3. Subquestion 1: How to assess model improvement?
1. Calibration (A + B, intercept + slope)
2. Discrimination (C, concordance)
3. Clinical usefulness (D, decision-analytic)
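A minimal sketch of computing all three in Python, assuming a validation set with observed 0/1 outcomes y and predicted risks p; the toy data and the library choices (statsmodels, scikit-learn) are illustrative assumptions, not from the talk:

```python
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

# Toy validation data (illustrative only)
y = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 0])            # observed outcomes
p = np.array([.1, .2, .6, .3, .7, .8, .5, .4, .9, .6])  # predicted risks

lp = np.log(p / (1 - p))  # the model's linear predictor (log odds)

# A: calibration intercept (calibration-in-the-large): slope fixed at 1 via an offset
fit_a = sm.GLM(y, np.ones_like(lp), family=sm.families.Binomial(), offset=lp).fit()
print("calibration intercept:", fit_a.params[0])

# B: calibration slope: logistic regression of the outcome on the linear predictor
fit_b = sm.GLM(y, sm.add_constant(lp), family=sm.families.Binomial()).fit()
print("calibration slope:", fit_b.params[1])

# C: concordance statistic (equals the area under the ROC curve for binary outcomes)
print("c statistic:", roc_auc_score(y, p))
```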
8. Decision-analytic perspectives
If we are serious about “using different thresholds that allow the
operator of the model to trade-off concerns in the errors made by
the model” we need a decision-analytic perspective
1. Define threshold
2. Evaluate quality of classification
Decision Curve
11. 3 statements on Decision Curve Analysis (DCA)
1. A classic idea (1884 or older)
2. A good link with clinical context:
benefit of treatment vs harm of overtreatment to define thresholds
3. A good graphic because thresholds are ‘subjective’
12. Youden index and Net Benefit; Peirce, Science 1884

                 Event
Test: answer    +         –
     +        a (TP)    b (FP)
     –        c (FN)    d (TN)

sens = TP / (TP + FN);  spec = TN / (TN + FP)
Youden index: sens + spec – 1
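With made-up counts (my illustration, not from the slide), the computation is:

```python
# Assumed 2x2 counts: rows = test result, columns = event status
TP, FP = 80, 30   # test positive: with event, without event
FN, TN = 20, 70   # test negative: with event, without event

sens = TP / (TP + FN)      # sensitivity = 0.8
spec = TN / (TN + FP)      # specificity = 0.7
youden = sens + spec - 1   # Youden index = 0.5
print(f"sens={sens}, spec={spec}, Youden={youden}")
```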
13. Vickers & Elkin, MDM 2006
Utilities of the four cells: a (TP), b (FP), c (FN), d (TN)
Benefit of treatment: a – c (gain for a patient with the event)
Harm of overtreatment: d – b (loss for a patient without the event)
Odds of the threshold: Harm / Benefit
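The derivation behind that last line, in my paraphrase of the Vickers & Elkin argument: at the threshold probability p_t a patient is indifferent between treatment and no treatment, so the expected utilities are equal.

```latex
p_t\,a + (1 - p_t)\,b = p_t\,c + (1 - p_t)\,d
\quad\Longrightarrow\quad
\frac{p_t}{1 - p_t} = \frac{d - b}{a - c} = \frac{\text{Harm}}{\text{Benefit}}
```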
14. Net Benefit
Net Benefit = (TP – w · FP) / N
w = harm/benefit ratio = threshold / (1 – threshold)
• e.g. threshold 50%: w = .5/.5 = 1; threshold 20%: w = .2/.8 = 1/4
“Fraction of true-positive classifications,
penalized for false-positive classifications”
BMJ 2016;352:i6 doi: 10.1136/bmj.i6.
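A minimal decision-curve sketch, reusing the assumed arrays y and p from the earlier snippet and comparing the model with the treat-all and treat-none strategies:

```python
import numpy as np

def net_benefit(y, p, threshold):
    """Net Benefit = (TP - w * FP) / N, with w = threshold / (1 - threshold)."""
    w = threshold / (1 - threshold)
    tp = np.sum((p >= threshold) & (y == 1))
    fp = np.sum((p >= threshold) & (y == 0))
    return (tp - w * fp) / len(y)

thresholds = np.arange(0.05, 0.95, 0.05)
nb_model = [net_benefit(y, p, t) for t in thresholds]
nb_all   = [net_benefit(y, np.ones_like(p), t) for t in thresholds]  # treat all
nb_none  = [0.0] * len(thresholds)                                   # treat none
# Plotting the three series against the thresholds gives the decision curve.
```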
18. Key question: how to improve prediction models?
1. Better development + validation
a) Sample size
b) Methods
2. Updating of existing models
a) Local settings
b) Continuous learning
3. Extension with markers
19. Subquestion 2: How to balance global vs local models?
Prediction models need updating to local settings;
can we entertain the idea of a ‘global model’?
1. Global: baseline risk + predictor effects
2. Recalibrated: local baseline risk + global predictor effects
3. Refitted: local baseline risk + local predictor effects
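A sketch of the three strategies, assuming local validation outcomes y and the global model's linear predictor lp (log odds), as in the earlier snippet:

```python
import numpy as np
import statsmodels.api as sm

# 1. Global: keep the original model as-is; predicted risk = expit(lp).

# 2. Recalibrated: re-estimate the local baseline risk (intercept),
#    keeping the global predictor effects fixed via an offset.
recal = sm.GLM(y, np.ones_like(lp), family=sm.families.Binomial(), offset=lp).fit()

# 3. Refitted: re-estimate effects locally; here intercept + slope on lp
#    (with individual patient data, all coefficients could be refitted).
refit = sm.GLM(y, sm.add_constant(lp), family=sm.families.Binomial()).fit()
```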
20. Examples on updating
Single validation set
Robust approach + closed testing
22. Examples on updating
Single validation set
Classic approach: SiM 2004; closed testing (sketched in code below)
Dynamic in calendar time
Multiple validation sets
Assess heterogeneity
a) Global model?
b) Fair representation of uncertainty?
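A simplified sketch of the closed-testing idea for a single validation set (my illustration; the published procedure is more refined): compare the nested updating models with likelihood ratio tests and keep the simplest model that no extension significantly improves.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

# Log-likelihood of the original model, with no parameters re-estimated
ll0 = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Nested updates: intercept only, then intercept + slope
m1 = sm.GLM(y, np.ones_like(lp), family=sm.families.Binomial(), offset=lp).fit()
m2 = sm.GLM(y, sm.add_constant(lp), family=sm.families.Binomial()).fit()

p_intercept = chi2.sf(2 * (m1.llf - ll0), df=1)     # update the intercept?
p_slope     = chi2.sf(2 * (m2.llf - m1.llf), df=1)  # update the slope as well?
# Keep the simplest model for which the extension is not significant.
```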
25. Key question: how to improve prediction models?
1. Better development + validation
a) Sample size
b) Methods
2. Updating of existing models
a) Local settings
b) Continuous learning
3. Extension with markers
27. Incremental value of marker
• Define a reference model, add marker to evaluate incremental value
• Regression coefficient is problematic as a summary (it depends on the marker's scaling); a low p-value is usually taken for granted
• Increase in AUC / c statistic usually small (typically +0.01)
This pushed the field to look beyond the AUC: reclassification
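A sketch of the typical comparison, with simulated data; the variable names and effect sizes are illustrative assumptions (and in practice the AUCs would be assessed on independent validation data):

```python
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 500
X_ref = rng.normal(size=(n, 2))                      # established predictors
marker = rng.normal(size=n)                          # new marker
logit = X_ref @ np.array([1.0, 0.5]) + 0.3 * marker  # assumed true model
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

ref = sm.Logit(y, sm.add_constant(X_ref)).fit(disp=0)
ext = sm.Logit(y, sm.add_constant(np.column_stack([X_ref, marker]))).fit(disp=0)

auc_ref = roc_auc_score(y, ref.predict())
auc_ext = roc_auc_score(y, ext.predict())
print(f"delta AUC = {auc_ext - auc_ref:+.3f}")  # usually a small increase
```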
34. Marker evaluation
Was the NRI a historical mistake?
Net Benefit to the rescue?
35. Summary 22 June 2022
1. Prediction modeling research is challenging
2. Performance assessment: calibration and Net Benefit
3. Improving performance:
a) Updating
b) Markers
c) Machine learning