Factors Influencing Decisions by NICE

The Influence of Cost-Effectiveness and
Other Factors on NICE Decisions
OHE Lunchtime Seminar
London • 23 April 2013

Helen Dakin1 and Nancy Devlin2
in collaboration with
Yan Feng2, Nigel Rice3, Phill O’Neill2 and David Parkin2
1University of Oxford, 2OHE, 3University of York

NICE
 Established 1999
 Issues guidance to 'ensure quality and value for money'
 For technology appraisals 'NHS is required to provide
funding and resources for medicines and treatments
recommended by NICE'
 Important implications for patients, the NHS, industry,
decision making in other countries
 What factors affect NICE decisions? How important is
cost effectiveness compared to ‘other factors’?

NICE’s stated threshold
 Pre-2004: various statements suggest a threshold of around
£30,000
 2004 methods guide & 2005 social value judgements
• '[NICE] should, generally, accept as cost effective those interventions
with an incremental cost-effectiveness ratio of less than £20,000 per
QALY and that there should be increasingly strong reasons for
accepting as cost effective interventions with an incremental
cost-effectiveness ratio of over £30,000 per QALY'
 2008 social value judgements
• NICE should explain its reasons when it decides that an intervention
with an ICER below £20,000 per QALY gained is not cost effective;
and when an intervention with an ICER of more than £20,000 to
£30,000 per QALY gained is cost effective

Previous studies on
determinants of NICE decisions
 Devlin & Parkin (2004): 39 appraisals to May 2002; logistic model (yes/no)
• Threshold ≈£40,000/QALY
• Uncertainty & prevalence matter
 Dakin et al (2006): 73 appraisals to Dec 2003; mlogit (yes / no / yes, but)
• ICER, number of trials or SRs, date, patient group submissions and technology type matter
• ‘Yes, but’ and ‘no’ driven by different factors
 Jena (2009): 86 appraisals to 2005; linear probability model (yes/no)
• £1000 increase in ICER decreases probability of ‘yes’ by 0.009
• Infectious disease and mental health decisions more likely to be ‘no’
 Mason & Drummond (2009): 38 cancer appraisals, Oct 2008; tabulation (yes / no / yes, but)
• ‘No’ more likely after 2006: partly due to STA process?
• Restrictions attributed to ICER, insufficient evidence, uncertainty or methodology

Aims
 Estimate NICE’s cost-effectiveness threshold
 Identify the factors that affect or explain NICE’s decisions
 Evaluate whether NICE’s threshold or decision-making has changed
over time [in progress]

The basic model
 Incremental cost-effectiveness ratio (ICER)
• Hypothesised to be main driver of decisions
• ICER in £000s
 Clinical evidence
• ‘NICE should not recommend an intervention […] if there is […] not
enough evidence’ (SVJ 2008)
• Hypothesis: NICE will reject if insufficient evidence
• Total number of patients in randomised trials = number of RCTs × mean
patients/trial
 Insights provided by stakeholders (Rawlins 2010)
• E.g. on whether QoL assessment adequately captures benefits (SVJ 2008)
• Hypothesis: increases odds of ‘yes’
• Patient group submission =1 if patient group submitted evidence or
opinion (proxy for stakeholder involvement/persuasion)

The basic model (cont.)
 Only treatment
• Hypothesis: NICE is more likely to recommend if no alternatives
• =1 if there were no alternative treatments available for this patient group
 Children
• Give ‘the benefit of the doubt’ given methodological challenges (Rawlins 2010)
• Hypothesis: paediatric treatments more likely to be recommended
• =1 if the decision specifically concerns children or adolescents
 Publication date
• Evaluates whether NICE decisions are changing
• No prior hypothesis
• =Years since first NICE appraisal was published
 Severity of underlying illness
• NICE states that it accepts higher ICERs for serious conditions (Rawlins 2010)
• = Mean WHO DALY weight across conditions for this disease category

Additional variables explored
 Pharmaceutical
• May reflect greater stakeholder involvement
• =1 for all drugs
 Disease
• Interim analysis suggests NICE gives extra weight to cancer treatments
• 8 dummies =1 if the decision concerns that disease
• Diseases with <20 decisions with ICERs omitted
 Probabilistic sensitivity analysis (PSA)
• Significant predictor of AWMSG decisions (Linley & Hughes 2012)
• =1 if the model has PSA
 Broader perspective
• Reflects consideration of additional savings not captured in ICER
• =1 if non-NHS/PSS costs were analysed or discussed

Additional variables explored (cont.)
 Appraisal committee
• Because committees weigh both quantitative evidence and value
judgements, the way they do this might differ systematically across
committees.
• Committees characterised by their Chairs and included as dummies
 Innovation
• NICE says that it takes into account the innovative nature of the
technology
• Defined by us as: any molecule launched within 2 years of appraisal
AND in an ATC4 class that was created 5 years prior to appraisal.
This picks up new medicines without limiting the definition to first in
class: NICE does assess groups of similar new medicines, and the
definition aims to capture the spirit of viewing a medicine as
innovative, e.g. new diabetes medicines, TNFs, etc.
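
The innovation definition can be read as a simple rule. The sketch below is one possible reading, not the authors' code: it assumes "created 5 years prior to appraisal" means the ATC4 class was created no more than 5 years before the appraisal, and that launch precedes appraisal. The function name, arguments and example dates are hypothetical.

```python
from datetime import date

def is_innovative(launch: date, atc4_class_created: date, appraisal: date) -> bool:
    """Hypothetical reading of the innovation rule described on the slide."""
    years_since_launch = (appraisal - launch).days / 365.25
    years_since_class = (appraisal - atc4_class_created).days / 365.25
    # Assumption: both events happen before the appraisal date
    launched_recently = 0 <= years_since_launch <= 2
    class_is_new = 0 <= years_since_class <= 5
    return launched_recently and class_is_new

# Made-up dates, for illustration only
example_new = is_innovative(date(2008, 6, 1), date(2005, 1, 1), date(2009, 6, 1))
example_old = is_innovative(date(2004, 1, 1), date(2000, 1, 1), date(2009, 6, 1))
```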

Additional variables explored (cont.)
 Single technology appraisal (STA)
• Mason & Drummond found cancer STAs more likely to be ‘no’
• =1 if the STA process was used
 Orphan
• Evaluate […] ‘orphan drugs’, in the same way as any other treatment
(SVJ, 2008; Littlejohns and Rawlins, 2009)
• Hypothesised to have no impact based on NICE statements
• =1 if the treatment has EMEA orphan status
 Uncertainty
• Difference between the highest and lowest NE quadrant ICERs
• 2 dummies indicating whether the plausible ICER could be dominant,
or be dominated were explored, but dropped out of regression
• Other measures of uncertainty are problematic

Additional variables explored on a subset of decisions
 End of life
• Place special value on treatments prolonging life at the end of life,
providing that life is of reasonable quality (Rawlins, 2010; NICE 2009)
• =1 if met EoL criteria
• Only evaluated for decisions with preliminary guidance (FAD)
published after 5th Jan 2009

HTAinSite
 Most data derived from HTAinSite: www.htainsite.com
 Commercial database developed by OHE, Abacus and City
University
 Provides extensive data on all NICE and SMC appraisals
• Regularly updated
• Extracted and validated based on established protocol
 Access to web interface available to subscribers
 Academics can request data for research by email

Appraisals versus decisions
 We analyse binary yes/no choices (not yes/no/yes, but)
• Evidence, ICERs and other considerations often differ between the
subgroups for which the technology is recommended and those for
which it is rejected
• Levels of restrictions differ enormously from 5% of patients to 80%
(O’Neill and Devlin 2010)
• Reflects HTAinSite protocol
 NICE appraisals are divided into yes/no decisions concerning
whether or not to use one technology in one patient
subgroup with a certain condition
• Methods for subdividing appraisals governed by HTAinSite protocol

Collection of ICER data
 Most guidance documents give multiple ICERs
• From manufacturer, assessment group and decision support unit
• Base case, subgroup analyses and sensitivity analyses
• Several comparators
 HTAinSite records all ICERs mentioned in documentation
 We developed a protocol to identify the ICERs informing each
decision
• Included only cost/QALY ICERs for the subgroup(s) in that decision
• DSU or ERG/TAG ICERs used in preference to manufacturer
• Vs NICE’s preferred comparator or next most effective treatment on the
frontier
• Exclude ICERs that NICE did not believe (based on considerations section)
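
The selection rules above could be expressed as a filter over the ICER records for a decision. This is a minimal sketch under stated assumptions: the record fields and the function name are hypothetical and do not reflect the HTAinSite schema, and the comparator rule (NICE's preferred comparator or next most effective treatment on the frontier) is not modelled.

```python
from dataclasses import dataclass

@dataclass
class IcerRecord:
    value: float            # ICER in £ per QALY
    measure: str            # e.g. "cost/QALY" (hypothetical coding)
    subgroup: str
    source: str             # "manufacturer", "ERG/TAG" or "DSU"
    accepted_by_nice: bool  # False if NICE did not believe the ICER

def select_icers(records, decision_subgroups):
    """Return the ICERs treated as informing one decision."""
    eligible = [
        r for r in records
        if r.measure == "cost/QALY"
        and r.subgroup in decision_subgroups
        and r.accepted_by_nice
    ]
    # Prefer DSU or ERG/TAG estimates; fall back to manufacturer ICERs
    independent = [r for r in eligible if r.source in ("DSU", "ERG/TAG")]
    return independent or eligible

# Made-up records, for illustration only
records = [
    IcerRecord(25000, "cost/QALY", "adults", "manufacturer", True),
    IcerRecord(32000, "cost/QALY", "adults", "ERG/TAG", True),
    IcerRecord(90000, "cost/QALY", "adults", "ERG/TAG", False),
    IcerRecord(15000, "cost/LYG", "adults", "DSU", True),
]
selected = select_icers(records, {"adults"})
```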

Decisions with multiple ICERs
 A decision may have >1 relevant ICER if
• It covers >1 subgroup with different ICERs
• Results of several analyses or comparators are given equal prominence
in guidance document
• NICE concluded the ICER was ‘between X and Y’, <A or >B
 Taking the mean or midpoint would ignore uncertainty & make
assumptions about how NICE uses ICER data
 We therefore randomly sampled from the list of ICERs
• Each ICER was given equal weight
• For ranges, we sampled from the full list of ICERs from other decisions
• Drew 100 iterations, each with different ICERs for each decision
• Analyses repeated on each iteration; results combined with Rubin’s
Rules
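
As a rough illustration of the sampling and pooling steps, the sketch below draws one ICER per decision in each iteration and pools a coefficient across iterations with Rubin's Rules. It is not the authors' Stata code: the function names and inputs are hypothetical, the handling of ICER ranges is not shown, and pooling is assumed to be done on the log-odds (coefficient) scale.

```python
import random
import statistics

def draw_datasets(icer_lists, n_iterations=100, seed=0):
    """icer_lists maps a decision id to its list of candidate ICERs.

    Each candidate ICER gets equal weight; one is drawn per decision
    in every iteration.
    """
    rng = random.Random(seed)
    return [
        {decision: rng.choice(icers) for decision, icers in icer_lists.items()}
        for _ in range(n_iterations)
    ]

def rubin_pool(estimates, variances):
    """Pool per-dataset estimates and their variances (squared SEs)."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)       # average within-dataset variance
    between = statistics.variance(estimates)   # between-dataset variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5                # pooled estimate and its standard error

# Made-up inputs, for illustration only
datasets = draw_datasets({"A": [20000.0], "B": [15000.0, 30000.0, 45000.0]})
pooled_estimate, pooled_se = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

A decision with a single ICER keeps the same value in every dataset, so only decisions with several candidate ICERs contribute between-dataset variance.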

Outline of econometric methods
 Used logistic regression to predict the effect of ICER and
other variables on the log-odds of NICE saying ‘yes’
 Adjusted standard errors to allow for clustering of decisions
within appraisals
 Analysed in Stata version 12

Modelling strategy
Stage 0: Estimation of ICER-only model where ICER alone
predicts recommendations
Stage 1: Evaluate a 'basic model' with the variables expected to
have most impact on NICE decisions
Stage 2: Remove non-significant variables from basic model one
at a time
Stage 3: Evaluate impact of adding additional variables
Stage 4: Alternative specifications of basic model parameters [to
come]
Stage 5: Sensitivity and subgroup analyses [to come]

Model selection
 Our methods for dealing with decisions with ≥2 ICERs
involve generating 100 datasets with different ICER values
 We combine coefficients across datasets using Rubin’s Rules
 AIC and pseudo-R2 cannot be pooled across datasets
 We therefore choose between models based on prediction
accuracy
• We assume that the model predicts a ‘yes’ if predicted log-odds ≥0
• Categorise decisions into true/false positives and true/false negatives
to get the % of decisions correctly predicted
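
The classification rule can be sketched as follows. The function names and the example values are hypothetical; the only rule taken from the slide is that a 'yes' is predicted when the predicted log-odds are at least 0.

```python
def confusion_counts(predicted_log_odds, actual_yes):
    """Tally true/false positives and negatives, predicting 'yes' when log-odds >= 0."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for log_odds, actual in zip(predicted_log_odds, actual_yes):
        predicted_yes = log_odds >= 0  # same as predicted probability >= 0.5
        if predicted_yes and actual:
            counts["TP"] += 1
        elif predicted_yes and not actual:
            counts["FP"] += 1
        elif not predicted_yes and not actual:
            counts["TN"] += 1
        else:
            counts["FN"] += 1
    return counts

def percent_correct(counts):
    """Percentage of decisions whose predicted and actual outcomes agree."""
    return 100 * (counts["TP"] + counts["TN"]) / sum(counts.values())

# Made-up predictions and outcomes, for illustration only
example_counts = confusion_counts(
    [2.0, -1.0, 0.0, -0.5, 1.5],
    [True, False, False, True, True],
)
example_accuracy = percent_correct(example_counts)
```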

RESULTS OF EXPLORATORY
DATA ANALYSIS

Numbers of decisions and appraisals
 240 appraisals published by 31st December 2011
• E1 & E2: 11 terminated appraisals & 12 decisions without other restrictions excluded
 229 appraisals comprising 763 decisions
• E3a: 162 decisions based on grounds other than cost-effectiveness
• E3b-c: 92 decisions based on cost-effectiveness but without available, quantified cost/QALY
 510 decisions included in models with ICERs

Number of ICERs
510 decisions have available quantified cost/QALY ICERs, of which:
 198 have 2 to 40 ICERs
 31 have an ICER range
ICERs for these appraisals are randomly drawn in 100 datasets

ICER data
Quadrant | NE: more costly, more effective | SE: less costly, more effective | NW: more costly, less effective | SW: less costly, less effective | Quadrants vary across decisions
Number of decisions, all | 418 | 33 | 31 | 6 | 22
Number of decisions, yes (%) | 282 (67%) | 33 (100%) | 0 (0%) | 5 (83%) | 13 (59%)
Number of decisions, no (%) | 136 (33%) | 0 (0%) | 31 (100%) | 1 (17%) | 9 (41%)
Mean ICER, all | £34,207 | Dominant | Dominated | £5,760 | -
Mean ICER, yes | £17,450 | Dominant | - | £5,544 | -
Mean ICER, no | £68,952 | - | Dominated | £6,839 | -
 Average ICER is >3 times higher for ‘no’ decisions than ‘yes’
 Dominance perfectly predicts NICE recommendations
 Subsequent analyses focus on the NE quadrant decisions

Impact of ICER ranking on
recommendations
 Decisions with high ICERs are more likely to be rejected, but
there are many exceptions
[Figure: decisions ranked by ICER on an axis running from £0 to £500,000; blue = recommended, red = rejected]

Proportion of decisions below different
thresholds that are rejected
 50% of decisions with ICERs >£20,000 are rejected

Sensitivity and specificity at different
thresholds
 ROC analysis suggests ICER strongly predicts decisions (AUC 0.85)
 Sensitivity and specificity both equal 77% if the threshold (Rc) is ~£30,000/QALY
 % correctly classified plateaus at 81-82% for Rc between £36,000 and
£54,000/QALY, peaking at £47,743/QALY
Specificity, sensitivity and classification calculated using roctab
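
To show what is calculated at each candidate threshold, the sketch below predicts 'yes' whenever the ICER is below the threshold and reports sensitivity, specificity and % correct. It stands in for the Stata roctab output rather than reproducing it. The function name and the toy data are hypothetical, and both 'yes' and 'no' decisions are assumed to be present.

```python
def sens_spec_at_threshold(icers, recommended, threshold):
    """Return (sensitivity, specificity, % correct) for one threshold Rc.

    A decision is predicted 'yes' when its ICER is below the threshold.
    Sensitivity is the share of actual 'yes' decisions predicted 'yes';
    specificity is the share of actual 'no' decisions predicted 'no'.
    """
    tp = fp = tn = fn = 0
    for icer, actual_yes in zip(icers, recommended):
        predicted_yes = icer < threshold
        if predicted_yes and actual_yes:
            tp += 1
        elif predicted_yes:
            fp += 1
        elif actual_yes:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    pct_correct = 100 * (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, pct_correct

# Toy ICERs in £000s per QALY and outcomes (not the study data)
toy_icers = [10, 15, 25, 28, 35, 50, 60, 80]
toy_yes = [True, True, True, False, True, False, False, False]
at_30 = sens_spec_at_threshold(toy_icers, toy_yes, 30)
at_40 = sens_spec_at_threshold(toy_icers, toy_yes, 40)
```

Scanning a grid of thresholds and keeping the one with the highest % correct gives the kind of peak reported on the slide.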

Plotting probability of rejection against
ICER
Inflection points at ~£20,000 and
~£50,000/QALY?
Rawlins & Culyer estimated
that the relationship was of
this shape and that 'inflexion A
occurs at around £5,000-
£15,000/QALY and inflexion B
at around £25,000-
£35,000/QALY'
Decisions are grouped into categories with similar ICER;
proportion of decisions in each category that were recommended
plotted against mid-point of each category
Curved line shows approximate best fit by eye

Results of the basic model (1)
Variable | Definition | Odds ratio (95% CI)
ICER | Cost-effectiveness ratio (£’000s) | 0.938 (0.915, 0.962)*
Total Pts in RCTs | Total number of patients randomised in all RCTs for this decision | 0.99999 (0.99996, 1.00002)
Only treatment | =1 if there are no alternative treatments | 2.263 (0.448, 11.448)
Children | =1 if concerns treatments for children | 3.774 (0.274, 52.026)
Patient group submission | =1 if ≥1 patient group(s) made a submission | 0.929 (0.097, 8.912)
Publication date | Years since first NICE appraisal was published | 1.061 (0.947, 1.188)
Severity | Mean DALY weight for conditions in this disease category | 0.435 (0.031, 6.022)
* p<0.05
 Every £1000 increase in ICER reduces odds of yes by 6.2%
 No other variables are statistically significant
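
The 6.2% figure follows directly from the odds ratio of 0.938 per £1,000. Because odds ratios multiply, larger ICER differences compound rather than add, as this short sketch shows (the function names are illustrative only):

```python
OR_PER_1000 = 0.938  # odds ratio per £1,000 increase in ICER, from the basic model

def odds_ratio_for_increase(increase_in_thousands, or_per_1000=OR_PER_1000):
    """Odds ratio for an ICER increase given in £000s (odds ratios multiply)."""
    return or_per_1000 ** increase_in_thousands

def percent_change_in_odds(increase_in_thousands, or_per_1000=OR_PER_1000):
    """Percentage change in the odds of 'yes' for that ICER increase."""
    return 100 * (odds_ratio_for_increase(increase_in_thousands, or_per_1000) - 1)

change_per_1000 = percent_change_in_odds(1)    # about -6.2
ratio_per_10000 = odds_ratio_for_increase(10)  # about 0.53, i.e. odds roughly halve
```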

 At average levels for all covariates, a decision would have a
50% chance of rejection if its ICER were £45,118/QALY
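
In a logistic model the 50% point is where the predicted log-odds equal zero, so the threshold ICER is minus the rest of the linear predictor divided by the ICER coefficient. The sketch below assumes this standard relationship. The intercept term is not reported on the slides, so it is back-calculated here from the rounded odds ratio and the reported threshold, purely for illustration.

```python
import math

def threshold_icer(linear_predictor_at_zero_icer, beta_icer):
    """ICER (in £000s) at which predicted log-odds are 0, i.e. P(yes) = 50%."""
    return -linear_predictor_at_zero_icer / beta_icer

def probability_yes(linear_predictor_at_zero_icer, beta_icer, icer_thousands):
    """Predicted probability of 'yes' at a given ICER (in £000s)."""
    log_odds = linear_predictor_at_zero_icer + beta_icer * icer_thousands
    return 1 / (1 + math.exp(-log_odds))

beta_icer = math.log(0.938)        # log odds ratio per £1,000 (rounded value from the table)
# Assumption: intercept plus covariates at their means, implied by the
# reported £45,118 threshold rather than taken from the slides
implied_lp0 = -beta_icer * 45.118

check_threshold = threshold_icer(implied_lp0, beta_icer)
p_at_threshold = probability_yes(implied_lp0, beta_icer, 45.118)
p_at_20k = probability_yes(implied_lp0, beta_icer, 20.0)
```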

 Based on a 50% cut-off, 81.63% of decisions are correctly
classified
• However, at this cut-off, sensitivity (94% of ‘yes’ decisions correctly
predicted) is higher than specificity (56% of ‘no’ decisions correct)
Predicted recommendation | Actual: No | Actual: Yes | Total
No | 79 (56%) | 17 (6%) | 96
Yes | 61 (44%) | 271 (94%) | 332
Total | 140 | 288 | 428
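
As a quick arithmetic check, sensitivity and specificity can be recomputed from the counts in the table. The rounded counts give an accuracy of about 81.8%, slightly different from the 81.63% quoted. One possible explanation, which is an assumption here, is that the table shows rounded averages across the 100 datasets.

```python
# Counts read from the classification table (predicted vs actual recommendation)
true_neg, false_neg = 79, 17    # predicted 'no': actually no, actually yes
false_pos, true_pos = 61, 271   # predicted 'yes': actually no, actually yes

sensitivity = true_pos / (true_pos + false_neg)   # share of 'yes' decisions predicted 'yes'
specificity = true_neg / (true_neg + false_pos)   # share of 'no' decisions predicted 'no'
accuracy = (true_pos + true_neg) / (true_pos + true_neg + false_pos + false_neg)

print(f"sensitivity {sensitivity:.1%}, specificity {specificity:.1%}, accuracy {accuracy:.1%}")
```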

Impact of removing variables from
basic model
 Removing any variable except severity leaves prediction accuracy
unchanged or worse compared with the basic model (81.63%)
Variable removed | % correctly classified after removing variable
Total Patients in RCTs | 81.53%
Only treatment | 81.25%
Children | 81.61%
Patient group submission | 81.63%
Publication date | 81.62%
Severity | 81.84%
All variables except ICER | 81.44%

Effect of adding variables to basic model
Variable | Definition | % correct | Odds ratio
Pharmaceutical | = 1 for drugs | 81.62% | 0.903 (p=0.81)
PSA | = 1 if the model has PSA | 81.78% | 0.672 (p=0.45)
Broader perspective | = 1 if non-NHS/PSS costs were analysed or discussed | 81.57% | 0.666 (p=0.46)
STA | = 1 if the STA process was used | 81.74% | 0.659 (p=0.23)
Orphan | = 1 if Tx has EMEA orphan status | 81.94% | 1.415 (p=0.55)
Range of ICERs | Difference between the lowest and highest NE quadrant ICER for this decision in £’000s | 82.56% | 0.987 (p=0.15)
Cancer | Dummy =1 if decision is for this disease | 82.29% | 2.025 (p=0.10)
Cardiovascular | Dummy =1 if decision is for this disease | 81.70% | 0.869 (p=0.75)
Central nervous system | Dummy =1 if decision is for this disease | 81.41% | 0.433 (p=0.30)
Endocrine | Dummy =1 if decision is for this disease | 81.58% | 0.648 (p=0.45)
Infectious | Dummy =1 if decision is for this disease | 81.69% | 1.122 (p=0.89)
Mental health | Dummy =1 if decision is for this disease | 81.50% | 0.411 (p=0.49)
Musculoskeletal | Dummy =1 if decision is for this disease | 81.99% | 3.317 (p=0.02)
Respiratory | Dummy =1 if decision is for this disease | 82.55% | 0.855 (p=0.002)

Model including all variables increasing
prediction accuracy
 Omitted severity and added STA, PSA, orphan, ICER range,
cancer, cardiovascular disease, infectious disease,
musculoskeletal & respiratory to basic model as these
improved model fit
 Correctly classified 84.20% of NICE decisions
• Specificity was higher than basic model, but sensitivity was lower
Predicted recommendation | Actual: No | Actual: Yes | Total
No | 25 (66%) | 6 (8%) | 32
Yes | 13 (34%) | 80 (92%) | 93
Total | 39 | 86 | 125

Comparison of model predictions
Allowing for other factors has little impact on curve or threshold
 ICER only: £45,449
 Basic model: £45,118
 Basic model minus severity, plus STA, PSA, orphan, ICER range, cancer,
CVD, infection, musculoskeletal & respiratory: £41,808

Comparing thresholds across diseases
 NICE decisions and thresholds appear to vary substantially
across diseases

End of life
 The impact of end of life was evaluated for the 133 decisions
with draft guidance since Jan 2009
 Adding end-of-life variable to basic model improves prediction
accuracy for these decisions from 84.23% to 85.12%
• More than any other variable explored except ICER range and
respiratory
 Odds of NICE saying ‘yes’ are 3.37 (95% CI: 0.64, 17.86)
times higher if meets end-of-life criteria (p=0.153)
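
The reported p-value is consistent with the odds ratio and its confidence interval. The check below assumes a Wald test with a normal approximation on the log-odds scale; the slides do not state which test was used, and the function name is illustrative.

```python
import math

def p_value_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover the SE, z statistic and two-sided p-value from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    # On the log scale the CI is symmetric: width = 2 * z_crit * SE
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under a standard normal
    return se, z, p

eol_se, eol_z, eol_p = p_value_from_or_ci(3.37, 0.64, 17.86)
print(f"z = {eol_z:.2f}, p = {eol_p:.3f}")
```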

End of life (2)
 Threshold appears to be higher (>£50,000) post-2009
 End-of-life treatments have higher ICER
Not end of life: £53,534
End of life: £67,646

Conclusions (1)
 ICER is by far the strongest predictor of NICE decisions
• Excluding those decisions based on clinical grounds or lack of evidence
 ICER alone explains 82% of NICE decisions
 Other variables significantly affecting NICE decisions include
• Whether for respiratory disorder (less chance of ‘yes’)
• Whether for musculoskeletal disorder (more chance of ‘yes’)
 Variables improving predictions, but not statistically significant
• End of life – matches EoL guidance;
• PSA; orphan; uncertainty
• Cardiovascular disease, cancer, infection
• Committee; innovation

Conclusions (2)
 Odds of a ‘yes’ decrease by ~6% for every £1000 increase in
ICER
 50% of decisions with ICERs >£20,000/QALY are ‘no’
 Specificity and sensitivity are equalised at £30,000/QALY
threshold
 Our ‘best’ model suggests that the average decision with an
ICER of £42,000 has a 50% chance of being rejected

Next steps
 Conduct sensitivity and subgroup analyses to explore how
results vary with alternative specification of models and
variables
• Further exploration of variations over time, e.g. subgrouping appraisals
by time periods
• Cross-validation to be conducted on the best model
 Step function model, assuming that NICE rejects all
treatments above a certain threshold ICER?
• Additional variables increase/decrease the threshold, not the log-odds
 Multi-part models of decision-making

Multi-part models of decision-making
 Current analyses exclude decisions not based on cost-
effectiveness grounds
 Some of the variables (e.g. clinical evidence or only treatment)
may predict these decisions
 Decisions to reject/recommend on other grounds may occur
before ICER evidence is considered
 Could explore these earlier steps in 2- or 3-part models
[Diagram of decision steps: rejected on clinical grounds; recommended on clinical grounds; consider cost-effectiveness; rejected based on lack of clinical evidence]

Discussion points
 Which threshold is correct?
• ICER at which there is a 50% chance of rejection?
• ICER that maximises specificity or sensitivity?
• ICER above which there is a 50% chance of rejection?
 Are the multi-part models realistic? Which is best?
 Is there any way that we could better measure
innovation, severity and/or uncertainty?
 Is the % of decisions correctly classified the best way
to select models?
• Can we pool AIC across datasets?
 Is it reasonable to select models on prediction
accuracy without a validation sample?
 How should we present the data?

Acknowledgments
We would like to thank:
 A consortium of 12 companies that provided a
research grant to facilitate the initial data collection
and modelling
 HTAinSite and (in particular) Carmel Guarnieri and
Zoe Philips for providing the data used in this
analysis
 Members of HERC, ScHARR and HESG for their
comments on our earlier work

To enquire about additional information and analyses, please contact
Nancy Devlin (ndevlin@ohe.org) or Helen Dakin
(helen.dakin@dph.ox.ac.uk)
Office of Health Economics (OHE)
Southside, 7th Floor
105 Victoria Street
London SW1E 6QT
United Kingdom
+44 20 7747 8850
www.ohe.org
OHE’s publications may be downloaded free of charge for registered users of its website.
©2013 OHE
