Using Machine Learning to Automate Clinical Pathways
1. Using Machine Learning to Automate Clinical Pathways
David Sontag, PhD
Department of Computer Science
Courant Institute of Mathematical Sciences, NYU
Joint work with my student Yoni Halpern (NYU) and Steven Horng (Beth Israel Deaconess Medical Center)
2. Health Information Technology is Rapidly Changing
• Aided by the HITECH Act, hospital adoption of EHRs has increased 5-fold since 2008
[Charles et al., ONC Data Brief, May 2014]
3. Health Information Technology is Rapidly Changing
• Over $4 billion of investment in digital health startups in 2014
Categories (chart labels): Analytics / Big Data; Healthcare Consumer Engagement; EHR / Clinical Workflow; Digital Diagnostics; Population Health Management; Digital Medical Device
[Wang et al., “Digital health funding in Q1 2015 over $600M”, Rock Health, April 2015]
4. Wealth of digital health data available
[Weber et al. (2014). Finding the Missing Link for Big Biomedical Data. JAMA.]
5. Research in my clinical ML lab
• Next-generation electronic health records (focus of today’s talk)
• Population-level risk stratification
• Better managing patients with chronic disease
clinicalml.org
7. Next-Generation EHR for the Emergency Department
• Triage information (free text)
• Lab results (continuous valued)
• MD comments (free text)
• Specialist consults
• Physician documentation
• Repeated vital signs (continuous values), measured every 30 s
Timeline (figure labels): T=0, 30 min, 2 hrs, Disposition
11. Built on Top of Real-time Prediction of Clinical State Variables
• All patient observations: MD/nurse documentation, billing codes, vitals, orders, labs, history
• Machine learning and natural language processing
• Clinical state variables: From nursing home? Has altered mental status? Has cardiac etiology? Has infection? Will die in next 30 days?
• Action: alerts/reminders, decision support, cohort selection, QA review, contextual display
• Examples: advise fall precautions, suggested order sets, triggering cellulitis pathway, sepsis alert, panel management
12. Example: Triggering Clinical Pathways
• Clinical Pathways project at Beth Israel Deaconess Medical Center (BIDMC)
• Standardizing care in the Emergency Department
– Reduce possibilities for error
– Enforce established best practices
• Pathways have been shown to reduce in-hospital complications, without increasing costs [Rotter et al 2010]
14. Automating triggers
• Don’t rely on the user’s knowledge that the pathway exists!
16. Current triggering mechanism (Cellulitis pathway)
Trigger if chief complaint contains any of the following:
CELLULITIS, REDDENED HOT LIMB, ERYTHEMA, LEG SWELLING, INFECTION, HAND, LEG, FOOT, TOE, ARM, FACE, FINGER
Expert constructed rule – built for sensitivity
Could we learn a better rule?
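A minimal sketch of the keyword rule above, for illustration only. The keyword list is copied from the slide; the function name and the choice of case-insensitive substring matching are assumptions, not the BIDMC implementation.

```python
# Keywords copied from the slide; everything else here is illustrative.
CELLULITIS_KEYWORDS = [
    "CELLULITIS", "REDDENED HOT LIMB", "ERYTHEMA", "LEG SWELLING",
    "INFECTION", "HAND", "LEG", "FOOT", "TOE", "ARM", "FACE", "FINGER",
]


def triggers_cellulitis_pathway(chief_complaint: str) -> bool:
    """Return True if the chief complaint contains any trigger keyword.

    Assumption: matching is a case-insensitive substring test.
    """
    text = chief_complaint.upper()
    return any(keyword in text for keyword in CELLULITIS_KEYWORDS)


# Hypothetical chief complaints, not real patient data.
for complaint in ["Cellulitis of left foot", "Chest pain", "Leg pain after fall"]:
    print(complaint, "->", triggers_cellulitis_pathway(complaint))
```

The third example fires on LEG alone, which shows why a rule built for sensitivity triggers on many patients who do not have cellulitis.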
17. Supervised learning is a non-starter
• Leverage large clinical databases to learn predictive rules.
• Need labeled data
• Classifiers often don’t generalize across institutions
Data types shown: LOINC, UMLS CUID, RxNorm, ICD9, unstructured data
18. Our contribution: Anchor & Learn Framework
• Use a combination of domain expertise (simple rules) and vast amounts of data (machine learning).
• Method does not require any manual labeling.
• Anchors are highly transferable between institutions.
[Halpern et al., AMIA 2014]
21. What are anchors?
• Rather than provide gold-standard labels, construct a simple rule that can catch some positive cases. Low sensitivity here is ok!
• Examples:
Phenotype      Possible Anchor
Diabetic       gsn:016313 (insulin) in Medications
Cardiac        ICD9:428.X (heart failure) in Diagnoses
Nursing home   “from nursing home” in text
Social work    “social work consulted” in text
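The example anchors above could be written as simple predicates over a patient record. This is a sketch under stated assumptions: the `PatientRecord` fields, the code-string format and the `anchor_flags` helper are hypothetical and are not taken from the authors' software.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical minimal record; all field names are assumptions."""
    medications: set = field(default_factory=set)  # e.g. "gsn:016313"
    diagnoses: set = field(default_factory=set)    # e.g. "ICD9:428.0"
    text: str = ""                                 # free-text notes


# One simple rule per phenotype, mirroring the examples on the slide.
ANCHORS = {
    "diabetic": lambda r: "gsn:016313" in r.medications,
    "cardiac": lambda r: any(c.startswith("ICD9:428.") for c in r.diagnoses),
    "nursing_home": lambda r: "from nursing home" in r.text.lower(),
    "social_work": lambda r: "social work consulted" in r.text.lower(),
}


def anchor_flags(record: PatientRecord) -> dict:
    """Return A=1/0 for every anchor; 0 means unknown, not negative."""
    return {name: int(rule(record)) for name, rule in ANCHORS.items()}


# Made-up record for illustration.
record = PatientRecord(
    medications={"gsn:016313"},
    text="Pt sent from nursing home after fall.",
)
print(anchor_flags(record))
```

A flag of 0 only says the rule did not fire; the patient may still be positive, which is why low sensitivity is acceptable for an anchor.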
27. Generalizability/Portability
New institution
Data may be very different:
• Language
• Representation
• Population
Figure labels: LOINC, UMLS CUID, RxNorm, ICD9 (different data types)
30. Generalizability/Portability
New institution
As long as our anchors appear in the new data as well… Can learn a new model, specific to the new institution.
Only need to share anchor definitions; each site trains models on its own data.
Figure labels: LOINC, UMLS CUID, RxNorm, ICD9 (different data types)
33. Theoretical basis for anchors
• Unobserved variable: Y, Observation: A
• A is an anchor for Y if conditioning on A=1 gives uniform samples from the set of positive cases.
• Alternative formulation – two necessary conditions:
– Positive condition: $P(Y = 1 \mid A = 1) = 1$
  e.g. If patient is taking insulin, the patient is surely diabetic.
AND
– Conditional independence: $A \perp X \mid Y$
  e.g. If we know the patient had heart failure, knowing whether the diagnosis code appears does not inform us about the rest of the record.
X represents all other observations.
34. Theoretical basis for anchors
• Unobserved variable: Y, Observation: A
• A is an anchor for Y if conditioning on A=1 gives uniform samples from the set of positive cases.
• Theorem [Elkan & Noto 2008]: In the above setting, a function to predict A can be transformed to predict Y
• Can also use more recent advances on learning with noisy labels (e.g., Natarajan et al., NIPS ‘13)
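A short derivation sketch of why such a transformation exists, assuming the positive condition $P(Y = 1 \mid A = 1) = 1$ and the conditional independence $A \perp X \mid Y$ hold exactly. The symbol $c = P(A = 1 \mid Y = 1)$ is introduced here for illustration only.

```latex
\begin{align*}
P(A = 1 \mid X)
  &= P(A = 1 \mid Y = 1, X)\, P(Y = 1 \mid X)
   + P(A = 1 \mid Y = 0, X)\, P(Y = 0 \mid X) \\
  &= P(A = 1 \mid Y = 1)\, P(Y = 1 \mid X) + 0 \\
  &= c \, P(Y = 1 \mid X)
\end{align*}
% and therefore
\[
P(Y = 1 \mid X) = \frac{P(A = 1 \mid X)}{c}
\]
```

The second line uses conditional independence for the first term, and the positive condition for the second (A=1 never occurs when Y=0). A classifier for A therefore only needs to be rescaled by a constant to give a classifier for Y.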
35. Learning with anchors
Input: anchor A, unlabeled patients
Output: prediction rule
  1. Learning: learn a calibrated classifier (e.g. logistic regression) to predict $\Pr(A = 1 \mid \tilde{X})$, i.e. learn to predict A from the other variables.
  2. Calibration: using a validation set, let P be the patients with A=1. Compute $C = \frac{1}{|P|} \sum_{k \in P} \Pr(A = 1 \mid \tilde{X}^{(k)})$. C is the average model prediction for patients with anchors.
  3. Transformation: for a previously unseen patient t, predict $\frac{1}{C} \Pr(A = 1 \mid X^{(t)})$ if $A^{(t)} = 0$, and $1$ if $A^{(t)} = 1$. If no anchor is present, predict according to a scaled version of the anchor-prediction model.
[Elkan & Noto 2008]
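The three steps above can be sketched in plain Python. This is an illustrative sketch, not the authors' released software: the hand-written logistic regression, the synthetic data generator, all function names, and the clipping of the scaled score to 1.0 are assumptions made here.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_anchor(model, x):
    """Model estimate of Pr(A = 1 | x)."""
    w, b = model
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(X, a, lr=0.5, epochs=400):
    """Step 1: batch gradient-descent logistic regression predicting A from X."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, target in zip(X, a):
            err = predict_anchor((w, b), x) - target
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        b -= lr * grad_b / len(X)
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
    return w, b


def calibration_constant(model, X_val, a_val):
    """Step 2: average prediction over validation patients with A = 1."""
    preds = [predict_anchor(model, x) for x, a in zip(X_val, a_val) if a == 1]
    return sum(preds) / len(preds)


def predict_state(model, C, x, a):
    """Step 3: 1 if the anchor is present, else the prediction scaled by 1/C."""
    if a == 1:
        return 1.0
    # Clipping to 1.0 is an added safeguard, not part of the slide.
    return min(1.0, predict_anchor(model, x) / C)


def simulate(n):
    """Synthetic patients: hidden state y, features x, anchor a (only if y=1)."""
    X, a = [], []
    for _ in range(n):
        y = 1 if random.random() < 0.3 else 0
        x = [1.0 if random.random() < (0.8 if y else 0.1) else 0.0
             for _ in range(3)]
        X.append(x)
        a.append(1 if (y == 1 and random.random() < 0.4) else 0)
    return X, a


random.seed(0)
X_train, a_train = simulate(1000)
X_val, a_val = simulate(500)

model = fit_logistic(X_train, a_train)
C = calibration_constant(model, X_val, a_val)
print("C =", round(C, 3))
print("score, all features on :", predict_state(model, C, [1.0, 1.0, 1.0], 0))
print("score, all features off:", predict_state(model, C, [0.0, 0.0, 0.0], 0))
```

In the synthetic data the features never include the anchor itself, so X here plays the role of the other observations. An unregularized gradient-descent fit is only roughly calibrated; a real deployment would need to check calibration separately.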
36. User interface to specify anchors
Interface panels: specified anchors, automated suggestions, patient filters, ranked patient list, detailed patient display
Rapid iteration: ~30 min to add a new clinical state variable
Software freely available: clinicalml.org
37. Learned model: Cellulitis
Anchors: ICD9 680-686: Infections of skin and subcutaneous tissue
Highly weighted features (covariates), from Pyxis and unstructured text:
cellulitis, cellulitic, cellulits, paronychia, pilonidal, bite, cyst, boil, abcess, abscess, abcesses, red, redness, reddness, erythema, unasyn, vanco, finger, thumb, rle, lle, gluteal, cephalexin, vancomycin, clindamycin, cephazolin, amoxicillin, sulfameth/trimeth
(using 200K patients’ data, 2008-2013)
39. Learned model: Nursing Home
Figure labels: Anchors; Highly weighted features (covariates); Pyxis; Unstructured text; Medications; Ages
Terms shown: nursing facility, nursing home, nsg facility, nsg home, nsg. home, from, staff, at, resident, sent, reported, baseline, changes, nonverbal, ams, unwitnessed_fall, confusion, senna, colace, trazodone, dnr, full code, g tube, foley, nh, vancomycin, levofloxacin, mirtazapine, maalox, tums
Ages: age=90+, age=80-90, age=70-80
(using 200K patients’ data, 2008-2013)
40. Evaluation: ED red flags
We gathered gold standard labels for these 9 variables by adding questions to EMR at time of ED disposition:
• Active malignancy
• Fall
• Cardiac Etiology
• Infection
• From Nursing Home
• Anticoagulated
• Immunosuppressed
• Septic Shock
• Pneumonia
41. Comparison to Existing Approaches
• (Rules) Predict just according to the anchors.
– 1 if anchor is present, 0 otherwise
• (ML) Machine learning (logistic regression)
– Using up to 3K labels
– Improves with more labels, but labels are expensive!
45. Scaling this up
• Currently making predictions for 40 clinical variables within the BIDMC patient display
– e.g. allergic reaction, motor vehicle accident, hiv+
• Only turned on for a small number of clinicians
Suggested tags: MD can accept/reject
46. Scaling this up
• Currently making predictions for 40 clinical variables within the BIDMC patient display
– e.g. allergic reaction, motor vehicle accident, hiv+
Accepting a tag triggers events (pathway enrollment, specialized order sets, etc)
47. Our next steps
• Shared library of anchored phenotypes
• Real-time estimation of clinical states and actual use for decision support within ED
• Test portability of anchors to other institutions
More info: clinicalml.org