글로벌 디지털 헬스케어 산업 및 규제 동향

글로벌 디지털 헬스케어 산업 및 규제 동향

대표파트너, 디지털 헬스케어 파트너스

소장, 디지털 헬스케어 연구소

최윤섭, Ph.D.

https://rockhealth.com/reports/2018-year-end-funding-report-is-digital-health-in-a-bubble/
•2018년에는 $8.1B 가 투자되며 역대 최대 규모를 또 한 번 갱신 (전년 대비 42.% 증가)

•총 368개의 딜 (전년 359 대비 소폭 증가): 개별 딜의 규모가 커졌음

•전체 딜의 절반이 seed 혹은 series A 투자였음

•‘초기 기업들이 역대 최고로 큰 규모의 투자를’, ‘역대 가장 자주’ 받고 있음

2010 2011 2012 2013 2014 2015 2016 2017 2018
Q1 Q2 Q3 Q4
153
283
476
647
608
568
684
851
765
FUNDING SNAPSHOT: YEAR OVER YEAR
5
Deal Count
$1.4B
$1.7B
$1.7B
$627M
$603M$459M
$8.2B
$6.2B
$7.1B
$2.9B
$2.3B$2.0B
$1.2B
$11.7B
$2.3B
Funding surpassed 2017 numbers by almost $3B, making 2018 the fourth consecutive increase in capital investment and
largest since we began tracking digital health funding in 2010. Deal volume decreased from Q3 to Q4, but deal sizes spiked,
with $3B invested in Q4 alone. Average deal size in 2018 was $21M, a $6M increase from 2017.
$3.0B
$14.6B
DEALS & FUNDING INVESTORS SEGMENT DETAIL
Source: StartUp Health Insights | startuphealth.com/insights Note: Report based on public data through 12/31/18 on seed (incl. accelerator), venture, corporate venture, and private equity funding only. © 2019 StartUp Health LLC
•글로벌 투자 추이를 보더라도, 2018년 역대 최대 규모: $14.6B

•2015년 이후 4년 연속 증가 중
https://hq.startuphealth.com/posts/startup-healths-2018-insights-funding-report-a-record-year-for-digital-health

27
Switzerland
EUROPE
$3.2B
$1.96B $1B
$3.5B
NORTH AMERICA
$12B Valuation
$1.8B
$3.1B$3.2B
$1B
$1B
38 healthcare unicorns valued at $90.7B
Global VC-backed digital health companies with a private market valuation of $1B+ (7/26/19)
UNITED KINGDOM
$1.5B
MIDDLE EAST
$1B Valuation
ISRAEL
$7B
$1B$1.2B
$1B
$1.65B
$1.8B
$1.25B
$2.8B
$1B $1B
$2B Valuation
$1.5B
UNITED STATES
GERMANY
$1.7B
$2.5B
CHINA
ASIA
$3B
$5.5B Valuation
$5B
$2.4B
$2.4B
France
$1.1B $3.5B
$1.6B
$1B
$1B
$1B
$1B
CB Insights, Global Healthcare Reports 2019 2Q
•전 세계적으로 38개의 디지털 헬스케어 유니콘 스타트업 (=기업가치 $1B 이상) 이 있으나,

•국내에는 하나도 없음

헬스케어
넓은 의미의 건강 관리에는 해당되지만,

디지털 기술이 적용되지 않고, 전문 의료 영역도 아닌 것

예) 운동, 영양, 수면
디지털 헬스케어
건강 관리 중에 디지털 기술이 사용되는 것

예) 사물인터넷, 인공지능, 3D 프린터, VR/AR
모바일 헬스케어
디지털 헬스케어 중

모바일 기술이 사용되는 것

예) 스마트폰, 사물인터넷, SNS
개인 유전정보분석
암유전체, 질병위험도,

보인자, 약물 민감도
예) 웰니스, 조상 분석
헬스케어 관련 분야 구성도(ver 0.6)
의료
질병 예방, 치료, 처방, 관리

등 전문 의료 영역
원격의료
원격 환자 모니터링
원격진료
전화, 화상, 판독
디지털 치료제
당뇨 예방 앱
중독 치료 앱

ADHD 치료게임

EDITORIAL OPEN
Digital medicine, on its way to being just plain medicine
npj Digital Medicine (2018)1:20175 ; doi:10.1038/
s41746-017-0005-1
There are already nearly 30,000 peer-reviewed English-language
scientific journals, producing an estimated 2.5 million articles a year.1
So why another, and why one focused specifically on digital
medicine?
To answer that question, we need to begin by defining what
“digital medicine” means: using digital tools to upgrade the
practice of medicine to one that is high-definition and far more
individualized. It encompasses our ability to digitize human beings
using biosensors that track our complex physiologic systems, but
also the means to process the vast data generated via algorithms,
cloud computing, and artificial intelligence. It has the potential to
democratize medicine, with smartphones as the hub, enabling
each individual to generate their own real world data and being
far more engaged with their health. Add to this new imaging
tools, mobile device laboratory capabilities, end-to-end digital
clinical trials, telemedicine, and one can see there is a remarkable
array of transformative technology which lays the groundwork for
a new form of healthcare.
As is obvious by its definition, the far-reaching scope of digital
medicine straddles many and widely varied expertise. Computer
scientists, healthcare providers, engineers, behavioral scientists,
ethicists, clinical researchers, and epidemiologists are just some of
the backgrounds necessary to move the field forward. But to truly
accelerate the development of digital medicine solutions in health
requires the collaborative and thoughtful interaction between
individuals from several, if not most of these specialties. That is the
primary goal of npj Digital Medicine: to serve as a cross-cutting
resource for everyone interested in this area, fostering collabora-
tions and accelerating its advancement.
Current systems of healthcare face multiple insurmountable
challenges. Patients are not receiving the kind of care they want
and need, caregivers are dissatisfied with their role, and in most
countries, especially the United States, the cost of care is
unsustainable. We are confident that the development of new
systems of care that take full advantage of the many capabilities
that digital innovations bring can address all of these major issues.
Researchers too, can take advantage of these leading-edge
technologies as they enable clinical research to break free of the
confines of the academic medical center and be brought into the
real world of participants’ lives. The continuous capture of multiple
interconnected streams of data will allow for a much deeper
refinement of our understanding and definition of most pheno-
types, with the discovery of novel signals in these enormous data
sets made possible only through the use of machine learning.
Our enthusiasm for the future of digital medicine is tempered by
the recognition that presently too much of the publicized work in
this field is characterized by irrational exuberance and excessive
hype. Many technologies have yet to be formally studied in a
clinical setting, and for those that have, too many began and
ended with an under-powered pilot program. In addition, there are
more than a few examples of digital “snake oil” with substantial
uptake prior to their eventual discrediting.2
Both of these practices
are barriers to advancing the field of digital medicine.
Our vision for npj Digital Medicine is to provide a reliable,
evidence-based forum for all clinicians, researchers, and even
patients, curious about how digital technologies can transform
every aspect of health management and care. Being open source,
as all medical research should be, allows for the broadest possible
dissemination, which we will strongly encourage, including
through advocating for the publication of preprints
And finally, quite paradoxically, we hope that npj Digital
Medicine is so successful that in the coming years there will no
longer be a need for this journal, or any journal specifically
focused on digital medicine. Because if we are able to meet our
primary goal of accelerating the advancement of digital medicine,
then soon, we will just be calling it medicine. And there are
already several excellent journals for that.
ACKNOWLEDGEMENTS
Supported by the National Institutes of Health (NIH)/National Center for Advancing
Translational Sciences grant UL1TR001114 and a grant from the Qualcomm Foundation.
ADDITIONAL INFORMATION
Competing interests:The authors declare no competing financial interests.
Publisher's note:Springer Nature remains neutral with regard to jurisdictional claims
in published maps and institutional affiliations.
Change history:The original version of this Article had an incorrect Article number
of 5 and an incorrect Publication year of 2017. These errors have now been corrected
in the PDF and HTML versions of the Article.
Steven R. Steinhubl1
and Eric J. Topol1
1
Scripps Translational Science Institute, 3344 North Torrey Pines
Court, Suite 300, La Jolla, CA 92037, USA
Correspondence: Steven R. Steinhubl (steinhub@scripps.edu) or
Eric J. Topol (etopol@scripps.edu)
REFERENCES
1. Ware, M. & Mabe, M. The STM report: an overview of scientific and scholarly journal
publishing 2015 [updated March]. http://digitalcommons.unl.edu/scholcom/92017
(2015).
2. Plante, T. B., Urrea, B. & MacFarlane, Z. T. et al. Validation of the instant blood
pressure smartphone App. JAMA Intern. Med. 176, 700–702 (2016).
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons license, and indicate if changes were made. The images or other third party
material in this article are included in the article’s Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the
article’s Creative Commons license and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly
from the copyright holder. To view a copy of this license, visit http://creativecommons.
org/licenses/by/4.0/.
© The Author(s) 2018
Received: 19 October 2017 Accepted: 25 October 2017
www.nature.com/npjdigitalmed
Published in partnership with the Scripps Translational Science Institute
디지털 의료의 미래는?

일상적인 의료가 되는 것

SaMD, SaMD, SaMD
쌤디 혹은 쌈디

SaMD
의료기기
엑스레이 기기

혈압계
디지털 헬스케어
핏빗

런키퍼

슬립싸이클
디지털

치료제
페어 알킬리
엠페티카
얼라이브코어
프로테우스
*SaMD: Software as a Medical Device
시그널
영상의학
병리
안과
피부
인공지능
SaMD*
의료  
영상
복잡한
데이터

No choice but to bring AI into the medicine

LETTER https://doi.org/10.1038/s41586-019-1390-1
A clinically applicable approach to continuous
prediction of future acute kidney injury
Nenad Tomašev1
*, Xavier Glorot1
, Jack W. Rae1,2
, Michal Zielinski1
, Harry Askham1
, Andre Saraiva1
, Anne Mottram1
,
Clemens Meyer1
, Suman Ravuri1
, Ivan Protsyuk1
, Alistair Connell1
, Cían O. Hughes1
, Alan Karthikesalingam1
,
Julien Cornebise1,12
, Hugh Montgomery3
, Geraint Rees4
, Chris Laing5
, Clifton R. Baker6
, Kelly Peterson7,8
, Ruth Reeves9
,
Demis Hassabis1
, Dominic King1
, Mustafa Suleyman1
, Trevor Back1,13
, Christopher Nielson10,11,13
, Joseph R. Ledsam1,13
* &
Shakir Mohamed1,13
The early prediction of deterioration could have an important role
in supporting healthcare professionals, as an estimated 11% of
deaths in hospital follow a failure to promptly recognize and treat
deteriorating patients1
. To achieve this goal requires predictions
of patient risk that are continuously updated and accurate, and
delivered at an individual level with sufficient context and enough
time to act. Here we develop a deep learning approach for the
continuous risk prediction of future deterioration in patients,
building on recent work that models adverse events from electronic
health records2–17
and using acute kidney injury—a common and
potentially life-threatening condition18
—as an exemplar. Our
model was developed on a large, longitudinal dataset of electronic
health records that cover diverse clinical environments, comprising
703,782 adult patients across 172 inpatient and 1,062 outpatient
sites. Our model predicts 55.8% of all inpatient episodes of acute
kidney injury, and 90.2% of all acute kidney injuries that required
subsequent administration of dialysis, with a lead time of up to
48 h and a ratio of 2 false alerts for every true alert. In addition
to predicting future acute kidney injury, our model provides
confidence assessments and a list of the clinical features that are most
salient to each prediction, alongside predicted future trajectories
for clinically relevant blood tests9
. Although the recognition and
prompt treatment of acute kidney injury is known to be challenging,
our approach may offer opportunities for identifying patients at risk
within a time window that enables early treatment.
Adverse events and clinical complications are a major cause of mor-
tality and poor outcomes in patients, and substantial effort has been
made to improve their recognition18,19
. Few predictors have found their
way into routine clinical practice, because they either lack effective
sensitivity and specificity or report damage that already exists20
. One
example relates to acute kidney injury (AKI), a potentially life-threat-
ening condition that affects approximately one in five inpatient admis-
sions in the United States21
. Although a substantial proportion of cases
of AKI are thought to be preventable with early treatment22
, current
algorithms for detecting AKI depend on changes in serum creatinine
as a marker of acute decline in renal function. Increases in serum cre-
atinine lag behind renal injury by a considerable period, which results
in delayed access to treatment. This supports a case for preventative
‘screening’-type alerts but there is no evidence that current rule-based
alerts improve outcomes23
. For predictive alerts to be effective, they
must empower clinicians to act before a major clinical decline has
occurred by: (i) delivering actionable insights on preventable condi-
tions; (ii) being personalized for specific patients; (iii) offering suffi-
cient contextual information to inform clinical decision-making; and
(iv) being generally applicable across populations of patients24
.
Promising recent work on modelling adverse events from electronic
health records2–17
suggests that the incorporation of machine learning
may enable the early prediction of AKI. Existing examples of sequential
AKI risk models have either not demonstrated a clinically applicable
level of predictive performance25
or have focused on predictions across
a short time horizon that leaves little time for clinical assessment and
intervention26
.
Our proposed system is a recurrent neural network that operates
sequentially over individual electronic health records, processing the
data one step at a time and building an internal memory that keeps
track of relevant information seen up to that point. At each time point,
the model outputs a probability of AKI occurring at any stage of sever-
ity within the next 48 h (although our approach can be extended to
other time windows or severities of AKI; see Extended Data Table 1).
When the predicted probability exceeds a specified operating-point
threshold, the prediction is considered positive. This model was trained
using data that were curated from a multi-site retrospective dataset of
703,782 adult patients from all available sites at the US Department of
Veterans Affairs—the largest integrated healthcare system in the United
States. The dataset consisted of information that was available from
hospital electronic health records in digital format. The total number of
independent entries in the dataset was approximately 6 billion, includ-
ing 620,000 features. Patients were randomized across training (80%),
validation (5%), calibration (5%) and test (10%) sets. A ground-truth
label for the presence of AKI at any given point in time was added
using the internationally accepted ‘Kidney Disease: Improving Global
Outcomes’ (KDIGO) criteria18
; the incidence of KDIGO AKI was
13.4% of admissions. Detailed descriptions of the model and dataset
are provided in the Methods and Extended Data Figs. 1–3.
Figure 1 shows the use of our model. At every point throughout an
admission, the model provides updated estimates of future AKI risk
along with an associated degree of uncertainty. Providing the uncer-
tainty associated with a prediction may help clinicians to distinguish
ambiguous cases from those predictions that are fully supported by the
available data. Identifying an increased risk of future AKI sufficiently
far in advance is critical, as longer lead times may enable preventative
action to be taken. This is possible even when clinicians may not be
actively intervening with, or monitoring, a patient. Supplementary
Information section A provides more examples of the use of the model.
With our approach, 55.8% of inpatient AKI events of any severity
were predicted early, within a window of up to 48 h in advance and with
a ratio of 2 false predictions for every true positive. This corresponds
to an area under the receiver operating characteristic curve of 92.1%,
and an area under the precision–recall curve of 29.7%. When set at this
threshold, our predictive model would—if operationalized—trigger a
1
DeepMind, London, UK. 2
CoMPLEX, Computer Science, University College London, London, UK. 3
Institute for Human Health and Performance, University College London, London, UK. 4
Institute of
Cognitive Neuroscience, University College London, London, UK. 5
University College London Hospitals, London, UK. 6
Department of Veterans Affairs, Denver, CO, USA. 7
VA Salt Lake City Healthcare
System, Salt Lake City, UT, USA. 8
Division of Epidemiology, University of Utah, Salt Lake City, UT, USA. 9
Department of Veterans Affairs, Nashville, TN, USA. 10
University of Nevada School of
Medicine, Reno, NV, USA. 11
Department of Veterans Affairs, Salt Lake City, UT, USA. 12
Present address: University College London, London, UK. 13
These authors contributed equally: Trevor Back,
Christopher Nielson, Joseph R. Ledsam, Shakir Mohamed. *e-mail: nenadt@google.com; jledsam@google.com
1 1 6 | N A T U R E | V O L 5 7 2 | 1 A U G U S T 2 0 1 9
Copyright 2016 American Medical Association. All rights reserved.
Development and Validation of a Deep Learning Algorithm
for Detection of Diabetic Retinopathy
in Retinal Fundus Photographs
Varun Gulshan, PhD; Lily Peng, MD, PhD; Marc Coram, PhD; Martin C. Stumpe, PhD; Derek Wu, BS; Arunachalam Narayanaswamy, PhD;
Subhashini Venugopalan, MS; Kasumi Widner, MS; Tom Madams, MEng; Jorge Cuadros, OD, PhD; Ramasamy Kim, OD, DNB;
Rajiv Raman, MS, DNB; Philip C. Nelson, BS; Jessica L. Mega, MD, MPH; Dale R. Webster, PhD
IMPORTANCE Deep learning is a family of computational methods that allow an algorithm to
program itself by learning from a large set of examples that demonstrate the desired
behavior, removing the need to specify rules explicitly. Application of these methods to
medical imaging requires further assessment and validation.
OBJECTIVE To apply deep learning to create an algorithm for automated detection of diabetic
retinopathy and diabetic macular edema in retinal fundus photographs.
DESIGN AND SETTING A specific type of neural network optimized for image classification
called a deep convolutional neural network was trained using a retrospective development
data set of 128 175 retinal images, which were graded 3 to 7 times for diabetic retinopathy,
diabetic macular edema, and image gradability by a panel of 54 US licensed ophthalmologists
and ophthalmology senior residents between May and December 2015. The resultant
algorithm was validated in January and February 2016 using 2 separate data sets, both
graded by at least 7 US board-certified ophthalmologists with high intragrader consistency.
EXPOSURE Deep learning–trained algorithm.
MAIN OUTCOMES AND MEASURES The sensitivity and specificity of the algorithm for detecting
referable diabetic retinopathy (RDR), defined as moderate and worse diabetic retinopathy,
referable diabetic macular edema, or both, were generated based on the reference standard
of the majority decision of the ophthalmologist panel. The algorithm was evaluated at 2
operating points selected from the development set, one selected for high specificity and
another for high sensitivity.
RESULTS TheEyePACS-1datasetconsistedof9963imagesfrom4997patients(meanage,54.4
years;62.2%women;prevalenceofRDR,683/8878fullygradableimages[7.8%]);the
Messidor-2datasethad1748imagesfrom874patients(meanage,57.6years;42.6%women;
prevalenceofRDR,254/1745fullygradableimages[14.6%]).FordetectingRDR,thealgorithm
hadanareaunderthereceiveroperatingcurveof0.991(95%CI,0.988-0.993)forEyePACS-1and
0.990(95%CI,0.986-0.995)forMessidor-2.Usingthefirstoperatingcutpointwithhigh
specificity,forEyePACS-1,thesensitivitywas90.3%(95%CI,87.5%-92.7%)andthespecificity
was98.1%(95%CI,97.8%-98.5%).ForMessidor-2,thesensitivitywas87.0%(95%CI,81.1%-
91.0%)andthespecificitywas98.5%(95%CI,97.7%-99.1%).Usingasecondoperatingpoint
withhighsensitivityinthedevelopmentset,forEyePACS-1thesensitivitywas97.5%and
specificitywas93.4%andforMessidor-2thesensitivitywas96.1%andspecificitywas93.9%.
CONCLUSIONS AND RELEVANCE In this evaluation of retinal fundus photographs from adults
with diabetes, an algorithm based on deep machine learning had high sensitivity and
specificity for detecting referable diabetic retinopathy. Further research is necessary to
determine the feasibility of applying this algorithm in the clinical setting and to determine
whether use of the algorithm could lead to improved care and outcomes compared with
current ophthalmologic assessment.
JAMA. doi:10.1001/jama.2016.17216
Published online November 29, 2016.
Editorial
Supplemental content
Author Affiliations: Google Inc,
Mountain View, California (Gulshan,
Peng, Coram, Stumpe, Wu,
Narayanaswamy, Venugopalan,
Widner, Madams, Nelson, Webster);
Department of Computer Science,
University of Texas, Austin
(Venugopalan); EyePACS LLC,
San Jose, California (Cuadros); School
of Optometry, Vision Science
Graduate Group, University of
California, Berkeley (Cuadros);
Aravind Medical Research
Foundation, Aravind Eye Care
System, Madurai, India (Kim); Shri
Bhagwan Mahavir Vitreoretinal
Services, Sankara Nethralaya,
Chennai, Tamil Nadu, India (Raman);
Verily Life Sciences, Mountain View,
California (Mega); Cardiovascular
Division, Department of Medicine,
Brigham and Women’s Hospital and
Harvard Medical School, Boston,
Massachusetts (Mega).
Corresponding Author: Lily Peng,
MD, PhD, Google Research, 1600
Amphitheatre Way, Mountain View,
CA 94043 (lhpeng@google.com).
Research
JAMA | Original Investigation | INNOVATIONS IN HEALTH CARE DELIVERY
(Reprinted) E1
Copyright 2016 American Medical Association. All rights reserved.
Downloaded From: http://jamanetwork.com/ on 12/02/2016
안과
LETTERS
https://doi.org/10.1038/s41591-018-0335-9
1
Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, China. 2
Institute for Genomic Medicine, Institute of
Engineering in Medicine, and Shiley Eye Institute, University of California, San Diego, La Jolla, CA, USA. 3
Hangzhou YITU Healthcare Technology Co. Ltd,
Hangzhou, China. 4
Department of Thoracic Surgery/Oncology, First Affiliated Hospital of Guangzhou Medical University, China State Key Laboratory and
National Clinical Research Center for Respiratory Disease, Guangzhou, China. 5
Guangzhou Kangrui Co. Ltd, Guangzhou, China. 6
Guangzhou Regenerative
Medicine and Health Guangdong Laboratory, Guangzhou, China. 7
Veterans Administration Healthcare System, San Diego, CA, USA. 8
These authors contributed
equally: Huiying Liang, Brian Tsui, Hao Ni, Carolina C. S. Valentim, Sally L. Baxter, Guangjian Liu. *e-mail: kang.zhang@gmail.com; xiahumin@hotmail.com
Artificial intelligence (AI)-based methods have emerged as
powerful tools to transform medical care. Although machine
learning classifiers (MLCs) have already demonstrated strong
performance in image-based diagnoses, analysis of diverse
and massive electronic health record (EHR) data remains chal-
lenging. Here, we show that MLCs can query EHRs in a manner
similar to the hypothetico-deductive reasoning used by physi-
cians and unearth associations that previous statistical meth-
ods have not found. Our model applies an automated natural
language processing system using deep learning techniques
to extract clinically relevant information from EHRs. In total,
101.6 million data points from 1,362,559 pediatric patient
visits presenting to a major referral center were analyzed to
train and validate the framework. Our model demonstrates
high diagnostic accuracy across multiple organ systems and is
comparable to experienced pediatricians in diagnosing com-
mon childhood diseases. Our study provides a proof of con-
cept for implementing an AI-based system as a means to aid
physicians in tackling large amounts of data, augmenting diag-
nostic evaluations, and to provide clinical decision support in
cases of diagnostic uncertainty or complexity. Although this
impact may be most evident in areas where healthcare provid-
ers are in relative shortage, the benefits of such an AI system
are likely to be universal.
Medical information has become increasingly complex over
time. The range of disease entities, diagnostic testing and biomark-
ers, and treatment modalities has increased exponentially in recent
years. Subsequently, clinical decision-making has also become more
complex and demands the synthesis of decisions from assessment
of large volumes of data representing clinical information. In the
current digital age, the electronic health record (EHR) represents a
massive repository of electronic data points representing a diverse
array of clinical information1–3
. Artificial intelligence (AI) methods
have emerged as potentially powerful tools to mine EHR data to aid
in disease diagnosis and management, mimicking and perhaps even
augmenting the clinical decision-making of human physicians1
.
To formulate a diagnosis for any given patient, physicians fre-
quently use hypotheticodeductive reasoning. Starting with the chief
complaint, the physician then asks appropriately targeted questions
relating to that complaint. From this initial small feature set, the
physician forms a differential diagnosis and decides what features
(historical questions, physical exam findings, laboratory testing,
and/or imaging studies) to obtain next in order to rule in or rule
out the diagnoses in the differential diagnosis set. The most use-
ful features are identified, such that when the probability of one of
the diagnoses reaches a predetermined level of acceptability, the
process is stopped, and the diagnosis is accepted. It may be pos-
sible to achieve an acceptable level of certainty of the diagnosis with
only a few features without having to process the entire feature set.
Therefore, the physician can be considered a classifier of sorts.
In this study, we designed an AI-based system using machine
learning to extract clinically relevant features from EHR notes to
mimic the clinical reasoning of human physicians. In medicine,
machine learning methods have already demonstrated strong per-
formance in image-based diagnoses, notably in radiology2
, derma-
tology4
, and ophthalmology5–8
, but analysis of EHR data presents
a number of difficult challenges. These challenges include the vast
quantity of data, high dimensionality, data sparsity, and deviations
Evaluation and accurate diagnoses of pediatric
diseases using artificial intelligence
Huiying Liang1,8
, Brian Y. Tsui 2,8
, Hao Ni3,8
, Carolina C. S. Valentim4,8
, Sally L. Baxter 2,8
,
Guangjian Liu1,8
, Wenjia Cai 2
, Daniel S. Kermany1,2
, Xin Sun1
, Jiancong Chen2
, Liya He1
, Jie Zhu1
,
Pin Tian2
, Hua Shao2
, Lianghong Zheng5,6
, Rui Hou5,6
, Sierra Hewett1,2
, Gen Li1,2
, Ping Liang3
,
Xuan Zang3
, Zhiqi Zhang3
, Liyan Pan1
, Huimin Cai5,6
, Rujuan Ling1
, Shuhua Li1
, Yongwang Cui1
,
Shusheng Tang1
, Hong Ye1
, Xiaoyan Huang1
, Waner He1
, Wenqing Liang1
, Qing Zhang1
, Jianmin Jiang1
,
Wei Yu1
, Jianqun Gao1
, Wanxing Ou1
, Yingmin Deng1
, Qiaozhen Hou1
, Bei Wang1
, Cuichan Yao1
,
Yan Liang1
, Shu Zhang1
, Yaou Duan2
, Runze Zhang2
, Sarah Gibson2
, Charlotte L. Zhang2
, Oulan Li2
,
Edward D. Zhang2
, Gabriel Karin2
, Nathan Nguyen2
, Xiaokang Wu1,2
, Cindy Wen2
, Jie Xu2
, Wenqin Xu2
,
Bochu Wang2
, Winston Wang2
, Jing Li1,2
, Bianca Pizzato2
, Caroline Bao2
, Daoman Xiang1
, Wanting He1,2
,
Suiqin He2
, Yugui Zhou1,2
, Weldon Haw2,7
, Michael Goldbaum2
, Adriana Tremoulet2
, Chun-Nan Hsu 2
,
Hannah Carter2
, Long Zhu3
, Kang Zhang 1,2,7
* and Huimin Xia 1
*
NATURE MEDICINE | www.nature.com/naturemedicine
소아청소년
ARTICLES
https://doi.org/10.1038/s41591-018-0177-5
1
Applied Bioinformatics Laboratories, New York University School of Medicine, New York, NY, USA. 2
Skirball Institute, Department of Cell Biology,
New York University School of Medicine, New York, NY, USA. 3
Department of Pathology, New York University School of Medicine, New York, NY, USA.
4
School of Mechanical Engineering, National Technical University of Athens, Zografou, Greece. 5
Institute for Systems Genetics, New York University School
of Medicine, New York, NY, USA. 6
Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY,
USA. 7
Center for Biospecimen Research and Development, New York University, New York, NY, USA. 8
Department of Population Health and the Center for
Healthcare Innovation and Delivery Science, New York University School of Medicine, New York, NY, USA. 9
These authors contributed equally to this work:
Nicolas Coudray, Paolo Santiago Ocampo. *e-mail: narges.razavian@nyumc.org; aristotelis.tsirigos@nyumc.org
A
ccording to the American Cancer Society and the Cancer
Statistics Center (see URLs), over 150,000 patients with lung
cancer succumb to the disease each year (154,050 expected
for 2018), while another 200,000 new cases are diagnosed on a
yearly basis (234,030 expected for 2018). It is one of the most widely
spread cancers in the world because of not only smoking, but also
exposure to toxic chemicals like radon, asbestos and arsenic. LUAD
and LUSC are the two most prevalent types of non–small cell lung
cancer1
, and each is associated with discrete treatment guidelines. In
the absence of definitive histologic features, this important distinc-
tion can be challenging and time-consuming, and requires confir-
matory immunohistochemical stains.
Classification of lung cancer type is a key diagnostic process
because the available treatment options, including conventional
chemotherapy and, more recently, targeted therapies, differ for
LUAD and LUSC2
. Also, a LUAD diagnosis will prompt the search
for molecular biomarkers and sensitizing mutations and thus has
a great impact on treatment options3,4
. For example, epidermal
growth factor receptor (EGFR) mutations, present in about 20% of
LUAD, and anaplastic lymphoma receptor tyrosine kinase (ALK)
rearrangements, present in<5% of LUAD5
, currently have tar-
geted therapies approved by the Food and Drug Administration
(FDA)6,7
. Mutations in other genes, such as KRAS and tumor pro-
tein P53 (TP53) are very common (about 25% and 50%, respec-
tively) but have proven to be particularly challenging drug targets
so far5,8
. Lung biopsies are typically used to diagnose lung cancer
type and stage. Virtual microscopy of stained images of tissues is
typically acquired at magnifications of 20×to 40×, generating very
large two-dimensional images (10,000 to>100,000 pixels in each
dimension) that are oftentimes challenging to visually inspect in
an exhaustive manner. Furthermore, accurate interpretation can be
difficult, and the distinction between LUAD and LUSC is not always
clear, particularly in poorly differentiated tumors; in this case, ancil-
lary studies are recommended for accurate classification9,10
. To assist
experts, automatic analysis of lung cancer whole-slide images has
been recently studied to predict survival outcomes11
and classifica-
tion12
. For the latter, Yu et al.12
combined conventional thresholding
and image processing techniques with machine-learning methods,
such as random forest classifiers, support vector machines (SVM) or
Naive Bayes classifiers, achieving an AUC of ~0.85 in distinguishing
normal from tumor slides, and ~0.75 in distinguishing LUAD from
LUSC slides. More recently, deep learning was used for the classi-
fication of breast, bladder and lung tumors, achieving an AUC of
0.83 in classification of lung tumor types on tumor slides from The
Cancer Genome Atlas (TCGA)13
. Analysis of plasma DNA values
was also shown to be a good predictor of the presence of non–small
cell cancer, with an AUC of ~0.94 (ref. 14
) in distinguishing LUAD
from LUSC, whereas the use of immunochemical markers yields an
AUC of ~0.94115
.
Here, we demonstrate how the field can further benefit from deep
learning by presenting a strategy based on convolutional neural
networks (CNNs) that not only outperforms methods in previously
Classification and mutation prediction from
non–small cell lung cancer histopathology
images using deep learning
Nicolas Coudray 1,2,9
, Paolo Santiago Ocampo3,9
, Theodore Sakellaropoulos4
, Navneet Narula3
,
Matija Snuderl3
, David Fenyö5,6
, Andre L. Moreira3,7
, Narges Razavian 8
* and Aristotelis Tsirigos 1,3
*
Visual inspection of histopathology slides is one of the main methods used by pathologists to assess the stage, type and sub-
type of lung tumors. Adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) are the most prevalent subtypes of lung
cancer, and their distinction requires visual inspection by an experienced pathologist. In this study, we trained a deep con-
volutional neural network (inception v3) on whole-slide images obtained from The Cancer Genome Atlas to accurately and
automatically classify them into LUAD, LUSC or normal lung tissue. The performance of our method is comparable to that of
pathologists, with an average area under the curve (AUC) of 0.97. Our model was validated on independent datasets of frozen
tissues, formalin-fixed paraffin-embedded tissues and biopsies. Furthermore, we trained the network to predict the ten most
commonly mutated genes in LUAD. We found that six of them—STK11, EGFR, FAT1, SETBP1, KRAS and TP53—can be pre-
dicted from pathology images, with AUCs from 0.733 to 0.856 as measured on a held-out population. These findings suggest
that deep-learning models can assist pathologists in the detection of cancer subtype or gene mutations. Our approach can be
applied to any cancer type, and the code is available at https://github.com/ncoudray/DeepPATH.
병리과병리과병리과병리과병리과병리과병리과
ARTICLES
https://doi.org/10.1038/s41551-018-0301-3
1
Sichuan Academy of Medical Sciences & Sichuan Provincial People’s Hospital, Chengdu, China. 2
Shanghai Wision AI Co., Ltd, Shanghai, China. 3
Beth Israel
Deaconess Medical Center and Harvard Medical School, Center for Advanced Endoscopy, Boston , MA, USA. *e-mail: gary.samsph@gmail.com
C
olonoscopy is the gold-standard screening test for colorectal
cancer1–3
, one of the leading causes of cancer death in both the
United States4,5
and China6
. Colonoscopy can reduce the risk
of death from colorectal cancer through the detection of tumours
at an earlier, more treatable stage as well as through the removal of
precancerous adenomas3,7
. Conversely, failure to detect adenomas
may lead to the development of interval cancer. Evidence has shown
that each 1.0% increase in adenoma detection rate (ADR) leads to a
3.0% decrease in the risk of interval colorectal cancer8
.
Although more than 14million colonoscopies are performed
in the United States annually2
, the adenoma miss rate (AMR) is
estimated to be 6–27%9
. Certain polyps may be missed more fre-
quently, including smaller polyps10,11
, flat polyps12
and polyps in the
left colon13
. There are two independent reasons why a polyp may
be missed during colonoscopy: (i) it was never in the visual field or
(ii) it was in the visual field but not recognized. Several hardware
innovations have sought to address the first problem by improv-
ing visualization of the colonic lumen, for instance by providing a
larger, panoramic camera view, or by flattening colonic folds using a
distal-cap attachment. The problem of unrecognized polyps within
the visual field has been more difficult to address14
. Several studies
have shown that observation of the video monitor by either nurses
or gastroenterology trainees may increase polyp detection by up
to 30%15–17
. Ideally, a real-time automatic polyp-detection system
could serve as a similarly effective second observer that could draw
the endoscopist’s eye, in real time, to concerning lesions, effec-
tively creating an ‘extra set of eyes’ on all aspects of the video data
with fidelity. Although automatic polyp detection in colonoscopy
videos has been an active research topic for the past 20 years, per-
formance levels close to that of the expert endoscopist18–20
have not
been achieved. Early work in automatic polyp detection has focused
on applying deep-learning techniques to polyp detection, but most
published works are small in scale, with small development and/or
training validation sets19,20
.
Here, we report the development and validation of a deep-learn-
ing algorithm, integrated with a multi-threaded processing system,
for the automatic detection of polyps during colonoscopy. We vali-
dated the system in two image studies and two video studies. Each
study contained two independent validation datasets.
Results
We developed a deep-learning algorithm using 5,545colonoscopy
images from colonoscopy reports of 1,290patients that underwent
a colonoscopy examination in the Endoscopy Center of Sichuan
Provincial People’s Hospital between January 2007 and December
2015. Out of the 5,545images used, 3,634images contained polyps
(65.54%) and 1,911 images did not contain polyps (34.46%). For
algorithm training, experienced endoscopists annotated the pres-
ence of each polyp in all of the images in the development data-
set. We validated the algorithm on four independent datasets.
DatasetsA and B were used for image analysis, and datasetsC and D
were used for video analysis.
DatasetA contained 27,113colonoscopy images from colo-
noscopy reports of 1,138consecutive patients who underwent a
colonoscopy examination in the Endoscopy Center of Sichuan
Provincial People’s Hospital between January and December 2016
and who were found to have at least one polyp. Out of the 27,113
images, 5,541images contained polyps (20.44%) and 21,572images
did not contain polyps (79.56%). All polyps were confirmed histo-
logically after biopsy. DatasetB is a public database (CVC-ClinicDB;
Development and validation of a deep-learning
algorithm for the detection of polyps during
colonoscopy
Pu Wang1
, Xiao Xiao2
, Jeremy R. Glissen Brown3
, Tyler M. Berzin 3
, Mengtian Tu1
, Fei Xiong1
,
Xiao Hu1
, Peixi Liu1
, Yan Song1
, Di Zhang1
, Xue Yang1
, Liangping Li1
, Jiong He2
, Xin Yi2
, Jingjia Liu2
and
Xiaogang Liu 1
*
The detection and removal of precancerous polyps via colonoscopy is the gold standard for the prevention of colon cancer.
However, the detection rate of adenomatous polyps can vary significantly among endoscopists. Here, we show that a machine-
learningalgorithmcandetectpolypsinclinicalcolonoscopies,inrealtimeandwithhighsensitivityandspecificity.Wedeveloped
the deep-learning algorithm by using data from 1,290 patients, and validated it on newly collected 27,113 colonoscopy images
from 1,138 patients with at least one detected polyp (per-image-sensitivity, 94.38%; per-image-specificity, 95.92%; area under
the receiver operating characteristic curve, 0.984), on a public database of 612 polyp-containing images (per-image-sensitiv-
ity, 88.24%), on 138 colonoscopy videos with histologically confirmed polyps (per-image-sensitivity of 91.64%; per-polyp-sen-
sitivity, 100%), and on 54 unaltered full-range colonoscopy videos without polyps (per-image-specificity, 95.40%). By using a
multi-threaded processing system, the algorithm can process at least 25 frames per second with a latency of 76.80±5.60ms
in real-time video analysis. The software may aid endoscopists while performing colonoscopies, and help assess differences in
polyp and adenoma detection performance among endoscopists.
NATURE BIOMEDICA L ENGINEERING | VOL 2 | OCTOBER 2018 | 741–748 | www.nature.com/natbiomedeng 741
소화기내과
1Wang P, et al. Gut 2019;0:1–7. doi:10.1136/gutjnl-2018-317500
Endoscopy
ORIGINAL ARTICLE
Real-time automatic detection system increases
colonoscopic polyp and adenoma detection rates: a
prospective randomised controlled study
Pu Wang, 1
Tyler M Berzin, 2
Jeremy Romek Glissen Brown, 2
Shishira Bharadwaj,2
Aymeric Becq,2
Xun Xiao,1
Peixi Liu,1
Liangping Li,1
Yan Song,1
Di Zhang,1
Yi Li,1
Guangre Xu,1
Mengtian Tu,1
Xiaogang Liu 1
To cite: Wang P, Berzin TM,
Glissen Brown JR, et al. Gut
Epub ahead of print: [please
include Day Month Year].
doi:10.1136/
gutjnl-2018-317500
► Additional material is
published online only.To view
please visit the journal online
(http://dx.doi.org/10.1136/
gutjnl-2018-317500).
1
Department of
Gastroenterology, Sichuan
Academy of Medical Sciences
& Sichuan Provincial People’s
Hospital, Chengdu, China
2
Center for Advanced
Endoscopy, Beth Israel
Deaconess Medical Center and
Massachusetts, USA
Correspondence to
Xiaogang Liu, Department
of Gastroenterology Sichuan
and Sichuan Provincial People’s
Hospital, Chengdu, China;
Gary.samsph@gmail.com
Received 30 August 2018
Revised 4 February 2019
Accepted 13 February 2019
© Author(s) (or their
employer(s)) 2019. Re-use
permitted under CC BY-NC. No
commercial re-use. See rights
and permissions. Published
by BMJ.
ABSTRACT
Objective The effect of colonoscopy on colorectal
cancer mortality is limited by several factors, among them
a certain miss rate, leading to limited adenoma detection
rates (ADRs).We investigated the effect of an automatic
polyp detection system based on deep learning on polyp
detection rate and ADR.
Design In an open, non-blinded trial, consecutive
patients were prospectively randomised to undergo
diagnostic colonoscopy with or without assistance of a
real-time automatic polyp detection system providing
a simultaneous visual notice and sound alarm on polyp
detection.The primary outcome was ADR.
Results Of 1058 patients included, 536 were
randomised to standard colonoscopy, and 522 were
randomised to colonoscopy with computer-aided
diagnosis.The artificial intelligence (AI) system
significantly increased ADR (29.1%vs20.3%, p<0.001)
and the mean number of adenomas per patient
(0.53vs0.31, p<0.001).This was due to a higher number
of diminutive adenomas found (185vs102; p<0.001),
while there was no statistical difference in larger
adenomas (77vs58, p=0.075). In addition, the number
of hyperplastic polyps was also significantly increased
(114vs52, p<0.001).
Conclusions In a low prevalent ADR population, an
automatic polyp detection system during colonoscopy
resulted in a significant increase in the number of
diminutive adenomas detected, as well as an increase in
the rate of hyperplastic polyps.The cost–benefit ratio of
such effects has to be determined further.
Trial registration number ChiCTR-DDD-17012221;
Results.
INTRODUCTION
Colorectal cancer (CRC) is the second and third-
leading causes of cancer-related deaths in men and
women respectively.1
Colonoscopy is the gold stan-
dard for screening CRC.2 3
Screening colonoscopy
has allowed for a reduction in the incidence and
mortality of CRC via the detection and removal
of adenomatous polyps.4–8
Additionally, there is
evidence that with each 1.0% increase in adenoma
detection rate (ADR), there is an associated 3.0%
decrease in the risk of interval CRC.9 10
However,
polyps can be missed, with reported miss rates of
up to 27% due to both polyp and operator charac-
teristics.11 12
Unrecognised polyps within the visual field is
an important problem to address.11
Several studies
have shown that assistance by a second observer
increases the polyp detection rate (PDR), but such a
strategy remains controversial in terms of increasing
the ADR.13–15
Ideally, a real-time automatic polyp detec-
tion system, with performance close to that of
expert endoscopists, could assist the endosco-
pist in detecting lesions that might correspond to
adenomas in a more consistent and reliable way
Significance of this study
What is already known on this subject?
► Colorectal adenoma detection rate (ADR)
is regarded as a main quality indicator of
(screening) colonoscopy and has been shown
to correlate with interval cancers. Reducing
adenoma miss rates by increasing ADR has
been a goal of many studies focused on
imaging techniques and mechanical methods.
► Artificial intelligence has been recently
introduced for polyp and adenoma detection
as well as differentiation and has shown
promising results in preliminary studies.
What are the new findings?
► This represents the first prospective randomised
controlled trial examining an automatic polyp
detection during colonoscopy and shows an
increase of ADR by 50%, from 20% to 30%.
► This effect was mainly due to a higher rate of
small adenomas found.
► The detection rate of hyperplastic polyps was
also significantly increased.
How might it impact on clinical practice in the
foreseeable future?
► Automatic polyp and adenoma detection could
be the future of diagnostic colonoscopy in order
to achieve stable high adenoma detection rates.
► However, the effect on ultimate outcome is
still unclear, and further improvements such as
polyp differentiation have to be implemented.
on17March2019byguest.Protectedbycopyright.http://gut.bmj.com/Gut:firstpublishedas10.1136/gutjnl-2018-317500on27February2019.Downloadedfrom
소화기내과
Downloadedfromhttps://journals.lww.com/ajspbyBhDMf5ePHKav1zEoum1tQfN4a+kJLhEZgbsIHo4XMi0hCywCX1AWnYQp/IlQrHD3MyLIZIvnCFZVJ56DGsD590P5lh5KqE20T/dBX3x9CoM=on10/14/2018
Downloadedfromhttps://journals.lww.com/ajspbyBhDMf5ePHKav1zEoum1tQfN4a+kJLhEZgbsIHo4XMi0hCywCX1AWnYQp/IlQrHD3MyLIZIvnCFZVJ56DGsD590P5lh5KqE20T/dBX3x9CoM=on10/14/2018
Impact of Deep Learning Assistance on the
Histopathologic Review of Lymph Nodes for Metastatic
Breast Cancer
David F. Steiner, MD, PhD,* Robert MacDonald, PhD,* Yun Liu, PhD,* Peter Truszkowski, MD,*
Jason D. Hipp, MD, PhD, FCAP,* Christopher Gammage, MS,* Florence Thng, MS,†
Lily Peng, MD, PhD,* and Martin C. Stumpe, PhD*
Abstract: Advances in the quality of whole-slide images have set the
stage for the clinical use of digital images in anatomic pathology.
Along with advances in computer image analysis, this raises the
possibility for computer-assisted diagnostics in pathology to improve
histopathologic interpretation and clinical care. To evaluate the
potential impact of digital assistance on interpretation of digitized
slides, we conducted a multireader multicase study utilizing our deep
learning algorithm for the detection of breast cancer metastasis in
lymph nodes. Six pathologists reviewed 70 digitized slides from lymph
node sections in 2 reader modes, unassisted and assisted, with a wash-
out period between sessions. In the assisted mode, the deep learning
algorithm was used to identify and outline regions with high like-
lihood of containing tumor. Algorithm-assisted pathologists demon-
strated higher accuracy than either the algorithm or the pathologist
alone. In particular, algorithm assistance significantly increased the
sensitivity of detection for micrometastases (91% vs. 83%, P=0.02).
In addition, average review time per image was significantly shorter
with assistance than without assistance for both micrometastases (61
vs. 116 s, P=0.002) and negative images (111 vs. 137 s, P=0.018).
Lastly, pathologists were asked to provide a numeric score regarding
the difficulty of each image classification. On the basis of this score,
pathologists considered the image review of micrometastases to be
significantly easier when interpreted with assistance (P=0.0005).
Utilizing a proof of concept assistant tool, this study demonstrates the
potential of a deep learning algorithm to improve pathologist accu-
racy and efficiency in a digital pathology workflow.
Key Words: artificial intelligence, machine learning, digital pathology,
breast cancer, computer aided detection
(Am J Surg Pathol 2018;00:000–000)
The regulatory approval and gradual implementation of
whole-slide scanners has enabled the digitization of glass
slides for remote consults and archival purposes.1 Digitiza-
tion alone, however, does not necessarily improve the con-
sistency or efficiency of a pathologist’s primary workflow. In
fact, image review on a digital medium can be slightly
slower than on glass, especially for pathologists with limited
digital pathology experience.2 However, digital pathology
and image analysis tools have already demonstrated po-
tential benefits, including the potential to reduce inter-reader
variability in the evaluation of breast cancer HER2 status.3,4
Digitization also opens the door for assistive tools based on
Artificial Intelligence (AI) to improve efficiency and con-
sistency, decrease fatigue, and increase accuracy.5
Among AI technologies, deep learning has demon-
strated strong performance in many automated image-rec-
ognition applications.6–8 Recently, several deep learning–
based algorithms have been developed for the detection of
breast cancer metastases in lymph nodes as well as for other
applications in pathology.9,10 Initial findings suggest that
some algorithms can even exceed a pathologist’s sensitivity
for detecting individual cancer foci in digital images. How-
ever, this sensitivity gain comes at the cost of increased false
positives, potentially limiting the utility of such algorithms for
automated clinical use.11 In addition, deep learning algo-
rithms are inherently limited to the task for which they have
been specifically trained. While we have begun to understand
the strengths of these algorithms (such as exhaustive search)
and their weaknesses (sensitivity to poor optical focus, tumor
mimics; manuscript under review), the potential clinical util-
ity of such algorithms has not been thoroughly examined.
While an accurate algorithm alone will not necessarily aid
pathologists or improve clinical interpretation, these benefits
may be achieved through thoughtful and appropriate in-
tegration of algorithm predictions into the clinical workflow.8
From the *Google AI Healthcare; and †Verily Life Sciences, Mountain
View, CA.
D.F.S., R.M., and Y.L. are co-first authors (equal contribution).
Work done as part of the Google Brain Healthcare Technology Fellowship
(D.F.S. and P.T.).
Conflicts of Interest and Source of Funding: D.F.S., R.M., Y.L., P.T.,
J.D.H., C.G., F.T., L.P., M.C.S. are employees of Alphabet and have
Alphabet stock.
Correspondence: David F. Steiner, MD, PhD, Google AI Healthcare,
1600 Amphitheatre Way, Mountain View, CA 94043
(e-mail: davesteiner@google.com).
Supplemental Digital Content is available for this article. Direct URL citations
appear in the printed text and are provided in the HTML and PDF
versions of this article on the journal’s website, www.ajsp.com.
Copyright © 2018 The Author(s). Published by Wolters Kluwer Health,
Inc. This is an open-access article distributed under the terms of the
Creative Commons Attribution-Non Commercial-No Derivatives
License 4.0 (CCBY-NC-ND), where it is permissible to download and
share the work provided it is properly cited. The work cannot be
changed in any way or used commercially without permission from
the journal.
ORIGINAL ARTICLE
Am J Surg Pathol Volume 00, Number 00, ’’ 2018 www.ajsp.com | 1
병리과
S E P S I S
A targeted real-time early warning score (TREWScore)
for septic shock
Katharine E. Henry,1
David N. Hager,2
Peter J. Pronovost,3,4,5
Suchi Saria1,3,5,6
*
Sepsis is a leading cause of death in the United States, with mortality highest among patients who develop septic
shock. Early aggressive treatment decreases morbidity and mortality. Although automated screening tools can detect
patients currently experiencing severe sepsis and septic shock, none predict those at greatest risk of developing
shock. We analyzed routinely available physiological and laboratory data from intensive care unit patients and devel-
oped “TREWScore,” a targeted real-time early warning score that predicts which patients will develop septic shock.
TREWScore identified patients before the onset of septic shock with an area under the ROC (receiver operating
characteristic) curve (AUC) of 0.83 [95% confidence interval (CI), 0.81 to 0.85]. At a specificity of 0.67, TREWScore
achieved a sensitivity of 0.85 and identified patients a median of 28.2 [interquartile range (IQR), 10.6 to 94.2] hours
before onset. Of those identified, two-thirds were identified before any sepsis-related organ dysfunction. In compar-
ison, the Modified Early Warning Score, which has been used clinically for septic shock prediction, achieved a lower
AUC of 0.73 (95% CI, 0.71 to 0.76). A routine screening protocol based on the presence of two of the systemic inflam-
matory response syndrome criteria, suspicion of infection, and either hypotension or hyperlactatemia achieved a low-
er sensitivity of 0.74 at a comparable specificity of 0.64. Continuous sampling of data from the electronic health
records and calculation of TREWScore may allow clinicians to identify patients at risk for septic shock and provide
earlier interventions that would prevent or mitigate the associated morbidity and mortality.
INTRODUCTION
Seven hundred fifty thousand patients develop severe sepsis and septic
shock in the United States each year. More than half of them are
admitted to an intensive care unit (ICU), accounting for 10% of all
ICU admissions, 20 to 30% of hospital deaths, and $15.4 billion in an-
nual health care costs (1–3). Several studies have demonstrated that
morbidity, mortality, and length of stay are decreased when severe sep-
sis and septic shock are identified and treated early (4–8). In particular,
one study showed that mortality from septic shock increased by 7.6%
with every hour that treatment was delayed after the onset of hypo-
tension (9).
More recent studies comparing protocolized care, usual care, and
early goal-directed therapy (EGDT) for patients with septic shock sug-
gest that usual care is as effective as EGDT (10–12). Some have inter-
preted this to mean that usual care has improved over time and reflects
important aspects of EGDT, such as early antibiotics and early ag-
gressive fluid resuscitation (13). It is likely that continued early identi-
fication and treatment will further improve outcomes. However, the
best approach to managing patients at high risk of developing septic
shock before the onset of severe sepsis or shock has not been studied.
Methods that can identify ahead of time which patients will later expe-
rience septic shock are needed to further understand, study, and im-
prove outcomes in this population.
General-purpose illness severity scoring systems such as the Acute
Physiology and Chronic Health Evaluation (APACHE II), Simplified
Acute Physiology Score (SAPS II), SequentialOrgan Failure Assessment
(SOFA) scores, Modified Early Warning Score (MEWS), and Simple
Clinical Score (SCS) have been validated to assess illness severity and
risk of death among septic patients (14–17). Although these scores
are useful for predicting general deterioration or mortality, they typical-
ly cannot distinguish with high sensitivity and specificity which patients
are at highest risk of developing a specific acute condition.
The increased use of electronic health records (EHRs), which can be
queried in real time, has generated interest in automating tools that
identify patients at risk for septic shock (18–20). A number of “early
warning systems,” “track and trigger” initiatives, “listening applica-
tions,” and “sniffers” have been implemented to improve detection
andtimelinessof therapy forpatients with severe sepsis andseptic shock
(18, 20–23). Although these tools have been successful at detecting pa-
tients currently experiencing severe sepsis or septic shock, none predict
which patients are at highest risk of developing septic shock.
The adoption of the Affordable Care Act has added to the growing
excitement around predictive models derived from electronic health
data in a variety of applications (24), including discharge planning
(25), risk stratification (26, 27), and identification of acute adverse
events (28, 29). For septic shock in particular, promising work includes
that of predicting septic shock using high-fidelity physiological signals
collected directly from bedside monitors (30, 31), inferring relationships
between predictors of septic shock using Bayesian networks (32), and
using routine measurements for septic shock prediction (33–35). No
current prediction models that use only data routinely stored in the
EHR predict septic shock with high sensitivity and specificity many
hours before onset. Moreover, when learning predictive risk scores, cur-
rent methods (34, 36, 37) often have not accounted for the censoring
effects of clinical interventions on patient outcomes (38). For instance,
a patient with severe sepsis who received fluids and never developed
septic shock would be treated as a negative case, despite the possibility
that he or she might have developed septic shock in the absence of such
treatment and therefore could be considered a positive case up until the
1
Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA.
2
Division of Pulmonary and Critical Care Medicine, Department of Medicine, School of
Medicine, Johns Hopkins University, Baltimore, MD 21205, USA. 3
Armstrong Institute for
Patient Safety and Quality, Johns Hopkins University, Baltimore, MD 21202, USA. 4
Department
of Anesthesiology and Critical Care Medicine, School of Medicine, Johns Hopkins University,
Baltimore, MD 21202, USA. 5
Department of Health Policy and Management, Bloomberg
School of Public Health, Johns Hopkins University, Baltimore, MD 21205, USA. 6
Department
of Applied Math and Statistics, Johns Hopkins University, Baltimore, MD 21218, USA.
*Corresponding author. E-mail: ssaria@cs.jhu.edu
R E S E A R C H A R T I C L E
www.ScienceTranslationalMedicine.org 5 August 2015 Vol 7 Issue 299 299ra122 1
onNovember3,2016http://stm.sciencemag.org/Downloadedfrom
감염내과
BRIEF COMMUNICATION OPEN
Digital biomarkers of cognitive function
Paul Dagum1
To identify digital biomarkers associated with cognitive function, we analyzed human–computer interaction from 7 days of
smartphone use in 27 subjects (ages 18–34) who received a gold standard neuropsychological assessment. For several
neuropsychological constructs (working memory, memory, executive function, language, and intelligence), we found a family of
digital biomarkers that predicted test scores with high correlations (p 10−4
). These preliminary results suggest that passive
measures from smartphone use could be a continuous ecological surrogate for laboratory-based neuropsychological assessment.
npj Digital Medicine (2018)1:10 ; doi:10.1038/s41746-018-0018-4
INTRODUCTION
By comparison to the functional metrics available in other
disciplines, conventional measures of neuropsychiatric disorders
have several challenges. First, they are obtrusive, requiring a
subject to break from their normal routine, dedicating time and
often travel. Second, they are not ecological and require subjects
to perform a task outside of the context of everyday behavior.
Third, they are episodic and provide sparse snapshots of a patient
only at the time of the assessment. Lastly, they are poorly scalable,
taxing limited resources including space and trained staff.
In seeking objective and ecological measures of cognition, we
attempted to develop a method to measure memory and
executive function not in the laboratory but in the moment,
day-to-day. We used human–computer interaction on smart-
phones to identify digital biomarkers that were correlated with
neuropsychological performance.
RESULTS
In 2014, 27 participants (ages 27.1 ± 4.4 years, education
14.1 ± 2.3 years, M:F 8:19) volunteered for neuropsychological
assessment and a test of the smartphone app. Smartphone
human–computer interaction data from the 7 days following
the neuropsychological assessment showed a range of correla-
tions with the cognitive scores. Table 1 shows the correlation
between each neurocognitive test and the cross-validated
predictions of the supervised kernel PCA constructed from
the biomarkers for that test. Figure 1 shows each participant
test score and the digital biomarker prediction for (a) digits
backward, (b) symbol digit modality, (c) animal fluency,
(d) Wechsler Memory Scale-3rd Edition (WMS-III) logical
memory (delayed free recall), (e) brief visuospatial memory test
(delayed free recall), and (f) Wechsler Adult Intelligence Scale-
4th Edition (WAIS-IV) block design. Construct validity of the
predictions was determined using pattern matching that
computed a correlation of 0.87 with p 10−59
between the
covariance matrix of the predictions and the covariance matrix
of the tests.
Table 1. Fourteen neurocognitive assessments covering five cognitive
domains and dexterity were performed by a neuropsychologist.
Shown are the group mean and standard deviation, range of score,
and the correlation between each test and the cross-validated
prediction constructed from the digital biomarkers for that test
Cognitive predictions
Mean (SD) Range R (predicted),
p-value
Working memory
Digits forward 10.9 (2.7) 7–15 0.71 ± 0.10, 10−4
Digits backward 8.3 (2.7) 4–14 0.75 ± 0.08, 10−5
Executive function
Trail A 23.0 (7.6) 12–39 0.70 ± 0.10, 10−4
Trail B 53.3 (13.1) 37–88 0.82 ± 0.06, 10−6
Symbol digit modality 55.8 (7.7) 43–67 0.70 ± 0.10, 10−4
Language
Animal fluency 22.5 (3.8) 15–30 0.67 ± 0.11, 10−4
FAS phonemic fluency 42 (7.1) 27–52 0.63 ± 0.12, 10−3
Dexterity
Grooved pegboard test
(dominant hand)
62.7 (6.7) 51–75 0.73 ± 0.09, 10−4
Memory
California verbal learning test
(delayed free recall)
14.1 (1.9) 9–16 0.62 ± 0.12, 10−3
WMS-III logical memory
29.4 (6.2) 18–42 0.81 ± 0.07, 10−6
Brief visuospatial memory test
10.2 (1.8) 5–12 0.77 ± 0.08, 10−5
Intelligence scale
WAIS-IV block design 46.1(12.8) 12–61 0.83 ± 0.06, 10−6
WAIS-IV matrix reasoning 22.1(3.3) 12–26 0.80 ± 0.07, 10−6
WAIS-IV vocabulary 40.6(4.0) 31–50 0.67 ± 0.11, 10−4
Received: 5 October 2017 Revised: 3 February 2018 Accepted: 7 February 2018
1
Mindstrong Health, 248 Homer Street, Palo Alto, CA 94301, USA
Correspondence: Paul Dagum (paul@mindstronghealth.com)
정신의학과
P R E C I S I O N M E D I C I N E
Identification of type 2 diabetes subgroups through
topological analysis of patient similarity
Li Li,1
Wei-Yi Cheng,1
Benjamin S. Glicksberg,1
Omri Gottesman,2
Ronald Tamler,3
Rong Chen,1
Erwin P. Bottinger,2
Joel T. Dudley1,4
*
Type 2 diabetes (T2D) is a heterogeneous complex disease affecting more than 29 million Americans alone with a
rising prevalence trending toward steady increases in the coming decades. Thus, there is a pressing clinical need to
improve early prevention and clinical management of T2D and its complications. Clinicians have understood that
patients who carry the T2D diagnosis have a variety of phenotypes and susceptibilities to diabetes-related compli-
cations. We used a precision medicine approach to characterize the complexity of T2D patient populations based
on high-dimensional electronic medical records (EMRs) and genotype data from 11,210 individuals. We successfully
identified three distinct subgroups of T2D from topology-based patient-patient networks. Subtype 1 was character-
ized by T2D complications diabetic nephropathy and diabetic retinopathy; subtype 2 was enriched for cancer ma-
lignancy and cardiovascular diseases; and subtype 3 was associated most strongly with cardiovascular diseases,
neurological diseases, allergies, and HIV infections. We performed a genetic association analysis of the emergent
T2D subtypes to identify subtype-specific genetic m
내분비내과
LETTER
Derma o og - eve c a ca on o k n cancer
w h deep neura ne work
피부과
FOCUS LETTERS
W
W
W
W
W
Ca d o og s eve a hy hm a de ec on and
c ass ca on n ambu a o y e ec oca d og ams
us ng a deep neu a ne wo k
M m
M
FOCUS LETTERS
심장내과
D p a n ng nab obu a m n and on o
human b a o y a n v o a on
산부인과
O G NA A
W on o On o og nd b e n e e men
e ommend on g eemen w h n e pe
mu d p n umo bo d
종양내과
D m
m
B D m OHCA
m Kw MD K H MD M H M K m MD
M M K m MD M M L m MD M K H K m
MD D MD D MD D R K C
MD D B H O MD D
D m Em M M H
K
D C C C M H
K
T w
A D C D m
M C C M H
G m w G R K
Tw w
C A K H MD D C
D m M C C M
H K G m w G
R K T E m
m @ m m
A
A m O OHCA m
m m w w
T m
m DCA
M T w m K OHCA w
A
CCEPTED
M
A
N
U
SCRIPT
응급의학과
신장내과

•복잡한 의료 데이터의 분석 및 insight 도출

•영상 의료/병리 데이터의 분석/판독

•연속 데이터의 모니터링 및 예방/예측
의료 인공지능의 세 유형

ARTICLE OPEN
Scalable and accurate deep learning with electronic health
records
Alvin Rajkomar 1,2
, Eyal Oren1
, Kai Chen1
, Andrew M. Dai1
, Nissan Hajaj1
, Michaela Hardt1
, Peter J. Liu1
, Xiaobing Liu1
, Jake Marcus1
,
Mimi Sun1
, Patrik Sundberg1
, Hector Yee1
, Kun Zhang1
, Yi Zhang1
, Gerardo Flores1
, Gavin E. Duggan1
, Jamie Irvine1
, Quoc Le1
,
Kurt Litsch1
, Alexander Mossin1
, Justin Tansuwan1
, De Wang1
, James Wexler1
, Jimbo Wilson1
, Dana Ludwig2
, Samuel L. Volchenboum3
,
Katherine Chou1
, Michael Pearson1
, Srinivasan Madabushi1
, Nigam H. Shah4
, Atul J. Butte2
, Michael D. Howell1
, Claire Cui1
,
Greg S. Corrado1
and Jeffrey Dean1
Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare
quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR
data, a labor-intensive process that discards the vast majority of information in each patient’s record. We propose a representation
of patients’ entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that
deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple
centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic
medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR
data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for
tasks such as predicting: in-hospital mortality (area under the receiver operator curve [AUROC] across sites 0.93–0.94), 30-day
unplanned readmission (AUROC 0.75–0.76), prolonged length of stay (AUROC 0.85–0.86), and all of a patient’s final discharge
diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically-used predictive models in all cases.
We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case
study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the
patient’s chart.
INTRODUCTION
The promise of digital medicine stems in part from the hope that,
by digitizing health data, we might more easily leverage computer
information systems to understand and improve care. In fact,
routinely collected patient healthcare data are now approaching
the genomic scale in volume and complexity.1
Unfortunately,
most of this information is not yet used in the sorts of predictive
statistical models clinicians might use to improve care delivery. It
is widely suspected that use of such efforts, if successful, could
provide major benefits not only for patient safety and quality but
also in reducing healthcare costs.2–6
In spite of the richness and potential of available data, scaling
the development of predictive models is difficult because, for
traditional predictive modeling techniques, each outcome to be
predicted requires the creation of a custom dataset with specific
variables.7
It is widely held that 80% of the effort in an analytic
model is preprocessing, merging, customizing, and cleaning
nurses, and other providers are included. Traditional modeling
approaches have dealt with this complexity simply by choosing a
very limited number of commonly collected variables to consider.7
This is problematic because the resulting models may produce
imprecise predictions: false-positive predictions can overwhelm
physicians, nurses, and other providers with false alarms and
concomitant alert fatigue,10
which the Joint Commission identified
as a national patient safety priority in 2014.11
False-negative
predictions can miss significant numbers of clinically important
events, leading to poor clinical outcomes.11,12
Incorporating the
entire EHR, including clinicians’ free-text notes, offers some hope
of overcoming these shortcomings but is unwieldy for most
predictive modeling techniques.
Recent developments in deep learning and artificial neural
networks may allow us to address many of these challenges and
unlock the information in the EHR. Deep learning emerged as the
preferred machine learning approach in machine perception
•2018년 1월 구글이 전자의무기록(EMR)을 분석하여, 환자 치료 결과를 예측하는 인공지능 발표

•환자가 입원 중에 사망할 것인지

•장기간 입원할 것인지

•퇴원 후에 30일 내에 재입원할 것인지

•퇴원 시의 진단명 
•이번 연구의 특징: 확장성

•과거 다른 연구와 달리 EMR의 일부 데이터를 pre-processing 하지 않고,

•전체 EMR 를 통채로 모두 분석하였음: UCSF, UCM (시카고 대학병원)

•특히, 비정형 데이터인 의사의 진료 노트도 분석

LETTERS
https://doi.org/10.1038/s41591-018-0335-9
1
Guangzhou Women and Children’s Medical Center, Guangzhou Medical University, Guangzhou, China. 2
Institute for Genomic Medicine, Institute of
Engineering in Medicine, and Shiley Eye Institute, University of California, San Diego, La Jolla, CA, USA. 3
Hangzhou YITU Healthcare Technology Co. Ltd,
Hangzhou, China. 4
Department of Thoracic Surgery/Oncology, First Affiliated Hospital of Guangzhou Medical University, China State Key Laboratory and
National Clinical Research Center for Respiratory Disease, Guangzhou, China. 5
Guangzhou Kangrui Co. Ltd, Guangzhou, China. 6
Guangzhou Regenerative
Medicine and Health Guangdong Laboratory, Guangzhou, China. 7
Veterans Administration Healthcare System, San Diego, CA, USA. 8
These authors contributed
equally: Huiying Liang, Brian Tsui, Hao Ni, Carolina C. S. Valentim, Sally L. Baxter, Guangjian Liu. *e-mail: kang.zhang@gmail.com; xiahumin@hotmail.com
Artificial intelligence (AI)-based methods have emerged as
powerful tools to transform medical care. Although machine
learning classifiers (MLCs) have already demonstrated strong
performance in image-based diagnoses, analysis of diverse
and massive electronic health record (EHR) data remains chal-
lenging. Here, we show that MLCs can query EHRs in a manner
similar to the hypothetico-deductive reasoning used by physi-
cians and unearth associations that previous statistical meth-
ods have not found. Our model applies an automated natural
language processing system using deep learning techniques
to extract clinically relevant information from EHRs. In total,
101.6 million data points from 1,362,559 pediatric patient
visits presenting to a major referral center were analyzed to
train and validate the framework. Our model demonstrates
high diagnostic accuracy across multiple organ systems and is
comparable to experienced pediatricians in diagnosing com-
mon childhood diseases. Our study provides a proof of con-
cept for implementing an AI-based system as a means to aid
physiciansintacklinglargeamountsofdata,augmentingdiag-
nostic evaluations, and to provide clinical decision support in
cases of diagnostic uncertainty or complexity. Although this
impact may be most evident in areas where healthcare provid-
ers are in relative shortage, the benefits of such an AI system
are likely to be universal.
Medical information has become increasingly complex over
time. The range of disease entities, diagnostic testing and biomark-
ers, and treatment modalities has increased exponentially in recent
years. Subsequently, clinical decision-making has also become more
complex and demands the synthesis of decisions from assessment
of large volumes of data representing clinical information. In the
current digital age, the electronic health record (EHR) represents a
massive repository of electronic data points representing a diverse
array of clinical information1–3
. Artificial intelligence (AI) methods
have emerged as potentially powerful tools to mine EHR data to aid
in disease diagnosis and management, mimicking and perhaps even
augmenting the clinical decision-making of human physicians1
.
To formulate a diagnosis for any given patient, physicians fre-
quently use hypotheticodeductive reasoning. Starting with the chief
complaint, the physician then asks appropriately targeted questions
relating to that complaint. From this initial small feature set, the
physician forms a differential diagnosis and decides what features
(historical questions, physical exam findings, laboratory testing,
and/or imaging studies) to obtain next in order to rule in or rule
out the diagnoses in the differential diagnosis set. The most use-
ful features are identified, such that when the probability of one of
the diagnoses reaches a predetermined level of acceptability, the
process is stopped, and the diagnosis is accepted. It may be pos-
sible to achieve an acceptable level of certainty of the diagnosis with
only a few features without having to process the entire feature set.
Therefore, the physician can be considered a classifier of sorts.
In this study, we designed an AI-based system using machine
learning to extract clinically relevant features from EHR notes to
mimic the clinical reasoning of human physicians. In medicine,
machine learning methods have already demonstrated strong per-
formance in image-based diagnoses, notably in radiology2
, derma-
tology4
, and ophthalmology5–8
, but analysis of EHR data presents
a number of difficult challenges. These challenges include the vast
quantity of data, high dimensionality, data sparsity, and deviations
Evaluation and accurate diagnoses of pediatric
diseases using artificial intelligence
Huiying Liang1,8
, Brian Y. Tsui 2,8
, Hao Ni3,8
, Carolina C. S. Valentim4,8
, Sally L. Baxter 2,8
,
Guangjian Liu1,8
, Wenjia Cai 2
, Daniel S. Kermany1,2
, Xin Sun1
, Jiancong Chen2
, Liya He1
, Jie Zhu1
,
Pin Tian2
, Hua Shao2
, Lianghong Zheng5,6
, Rui Hou5,6
, Sierra Hewett1,2
, Gen Li1,2
, Ping Liang3
,
Xuan Zang3
, Zhiqi Zhang3
, Liyan Pan1
, Huimin Cai5,6
, Rujuan Ling1
, Shuhua Li1
, Yongwang Cui1
,
Shusheng Tang1
, Hong Ye1
, Xiaoyan Huang1
, Waner He1
, Wenqing Liang1
, Qing Zhang1
, Jianmin Jiang1
,
Wei Yu1
, Jianqun Gao1
, Wanxing Ou1
, Yingmin Deng1
, Qiaozhen Hou1
, Bei Wang1
, Cuichan Yao1
,
Yan Liang1
, Shu Zhang1
, Yaou Duan2
, Runze Zhang2
, Sarah Gibson2
, Charlotte L. Zhang2
, Oulan Li2
,
Edward D. Zhang2
, Gabriel Karin2
, Nathan Nguyen2
, Xiaokang Wu1,2
, Cindy Wen2
, Jie Xu2
, Wenqin Xu2
,
Bochu Wang2
, Winston Wang2
, Jing Li1,2
, Bianca Pizzato2
, Caroline Bao2
, Daoman Xiang1
, Wanting He1,2
,
Suiqin He2
, Yugui Zhou1,2
, Weldon Haw2,7
, Michael Goldbaum2
, Adriana Tremoulet2
, Chun-Nan Hsu 2
,
Hannah Carter2
, Long Zhu3
, Kang Zhang 1,2,7
* and Huimin Xia 1
*
LETTERSNATURE MEDICINE
examination, laboratory testing, and PACS (picture archiving and
communication systems) reports), the F1 scores exceeded 90%
except in one instance, which was for categorical variables detected
tree, similar to how a human physician might evaluate a patient’s
features to achieve a diagnosis based on the same clinical data
incorporated into the information model. Encounters labeled by
Systemic generalized diseases
Varicella without complication
Influenza
Infectious mononucleosis
Sepsis
Exanthema subitum
Neuropsychiatric diseases
Tic disorder
Attention-deficit hyperactivity disorders
Bacterial meningitis
Encephalitis
Convulsions
Genitourinary diseases
Respiratory diseases
Upper respiratory
diseases
Acute upper respiratory infection
Sinusitis
Acute sinusitis
Acute recurrent sinusitis
Acute laryngitis
Acute pharyngitis
Lower respiratory
diseases
Bronchitis
Acute bronchitis
Bronchiolitis
Acute bronchitis due to Mycoplasma pneumoniae
Pneumonia
Bacterial pneumonia
Bronchopneumonia
Bacterial pneumonia elsewhere
Mycoplasma infection
Asthma
Asthma uncomplicated
Cough variant asthma
Asthma with acute exacerbation
Acute tracheitis
Gastrointestinal diseases
Diarrhea
Mouth-related diseases
Enteroviral vesicular stomatitis
with exanthem
Fig. 2 | Hierarchy of the diagnostic framework in a large pediatric cohort. A hierarchical logistic regression classifier was used to establish a diagnostic
system based on anatomic divisions. An organ-based approach was used, wherein diagnoses were first separated into broad organ systems, then
subsequently divided into organ subsystems and/or into more specific diagnosis groups.
•소아 환자 130만 명의 EMR 데이터 101.6 million 개 분석

•딥러닝 기반의 자연어 처리 기술

•의사의 hypothetico-deductive reasoning 모방

•소아 환자의 common disease를 진단하는 인공지능
Nat Med 2019 Feb

GP at Hand
•영국 바빌론헬스의 GP at Hand

•챗봇 기반의 질병 진단 + 원격 진료

•영국 NHS 에서 활용 중

디지털 표현형

Digital Phenotype

Digital Phenotype:
Your smartphone knows if you are depressed
Ginger.io

Digital Phenotype:
Your smartphone knows if you are depressed
J Med Internet Res. 2015 Jul 15;17(7):e175.
The correlation analysis between the features and the PHQ-9 scores revealed that 6 of the 10
features were signiﬁcantly correlated to the scores:
• strong correlation: circadian movement, normalized entropy, location variance
• correlation: phone usage features, usage duration and usage frequency

Mindstrong Health
• 스마트폰 사용 패턴을 바탕으로

• 인지능력, 우울증, 조현병, 양극성 장애, PTSD 등을 측정

• 미국 국립정신건강연구소 소장인 Tomas Insel 이 공동 설립

• 아마존의 제프 베조스 투자

Paul Dagum1
INTRODUCTION
RESULTS
between the
of the tests.
p-value
Working memory
Digits forward 10.9 (2.7) 7–15 0.71 ± 0.10, 10−4
Digits backward 8.3 (2.7) 4–14 0.75 ± 0.08, 10−5
Executive function
Trail A 23.0 (7.6) 12–39 0.70 ± 0.10, 10−4
Trail B 53.3 (13.1) 37–88 0.82 ± 0.06, 10−6
Language
Animal ﬂuency 22.5 (3.8) 15–30 0.67 ± 0.11, 10−4
Dexterity
(dominant hand)
62.7 (6.7) 51–75 0.73 ± 0.09, 10−4
Memory
14.1 (1.9) 9–16 0.62 ± 0.12, 10−3
29.4 (6.2) 18–42 0.81 ± 0.07, 10−6
10.2 (1.8) 5–12 0.77 ± 0.08, 10−5
Intelligence scale
1
• 총 45가지 스마트폰 사용 패턴: 타이핑, 스크롤, 화면 터치

• 스페이스바 누른 후, 다음 문자 타이핑하는 행동

• 백스페이스를 누른 후, 그 다음 백스페이스

• 주소록에서 사람을 찾는 행동 양식 
• 스마트폰 사용 패턴과 인지 능력의 상관 관계

• 20-30대 피험자 27명

• Working Memory, Language, Dexterity etc

Paul Dagum1
INTRODUCTION
RESULTS
between the
of the tests.
p-value
Working memory
Digits forward 10.9 (2.7) 7–15 0.71 ± 0.10, 10−4
Digits backward 8.3 (2.7) 4–14 0.75 ± 0.08, 10−5
Executive function
Trail A 23.0 (7.6) 12–39 0.70 ± 0.10, 10−4
Trail B 53.3 (13.1) 37–88 0.82 ± 0.06, 10−6
Language
Animal ﬂuency 22.5 (3.8) 15–30 0.67 ± 0.11, 10−4
Dexterity
(dominant hand)
62.7 (6.7) 51–75 0.73 ± 0.09, 10−4
Memory
14.1 (1.9) 9–16 0.62 ± 0.12, 10−3
29.4 (6.2) 18–42 0.81 ± 0.07, 10−6
10.2 (1.8) 5–12 0.77 ± 0.08, 10−5
Intelligence scale
1
Fig. 1 A blue square represents a participant test Z-score normed to the 27 participant scores and a red circle represents the digital biomarker
prediction Z-score normed to the 27 predictions. Test scores and predictions shown are a digits backward, b symbol digit modality, c animal
ﬂuency, d Wechsler memory Scale-3rd Edition (WMS-III) logical memory (delayed free recall), e brief visuospatial memory test (delayed free
recall), and f Wechsler adult intelligence scale-4th Edition (WAIS-IV) block design
P Dagum
2
1234567890():,;
• 스마트폰 사용 패턴과 인지 능력의 높은 상관 관계

• 파란색: 표준 인지 능력 테스트 결과

• 붉은색: 마인드 스트롱의 스마트폰 사용 패턴

NATURE MEDICINE
and the algorithm led to the best accuracy, and the algorithm mark-
edly sped up the review of slides35
. This study is particularly notable,
41
Table 2 | FDA AI approvals are accelerating
Company FDA Approval Indication
Apple September 2018 Atrial fibrillation detection
Aidoc August 2018 CT brain bleed diagnosis
iCAD August 2018 Breast density via
mammography
Zebra Medical July 2018 Coronary calcium scoring
Bay Labs June 2018 Echocardiogram EF
determination
Neural Analytics May 2018 Device for paramedic stroke
diagnosis
IDx April 2018 Diabetic retinopathy diagnosis
Icometrix April 2018 MRI brain interpretation
Imagen March 2018 X-ray wrist fracture diagnosis
Viz.ai February 2018 CT stroke diagnosis
Arterys February 2018 Liver and lung cancer (MRI, CT)
diagnosis
MaxQ-AI January 2018 CT brain bleed diagnosis
Alivecor November 2017 Atrial fibrillation detection via
Apple Watch
Arterys January 2017 MRI heart interpretation
NATURE MEDICINE
인공지능 기반 의료기기  
FDA 인허가 현황
Nature Medicine 2019
• Zebra Medical Vision

• 2019년 5월: 흉부 엑스레이에서 기흉 판독

• 2019년 6월: head CT 에서 뇌출혈 판독

• Aidoc

• 2019년 5월: CT에서 폐색전증 판독

• 2019년 6월: CT에서 경추골절 판독
+

인공지능 기반 의료기기  
국내 인허가 현황
• 1. 뷰노 본에이지 (2등급 허가)

• 2. 루닛 인사이트 폐결절 (2등급 허가)

• 3. JLK 뇌경색 (3등급 허가)

• 4. 인포메디텍 뉴로아이 (2등급 인증): MRI 기반 치매 진단 보조 
• 5. 삼성전자 폐결절 (2등급 허가)

• 6. 뷰노 딥브레인 (2등급 인증)

• 7. 루닛 인사이트 MMG (3등급 허가)

• 8. JLK인스펙션 ATROSCAN (2등급 인증) 건강검진용 뇌 노화 측정

• 9. 뷰노 체스트엑스레이 (2등급 허가)

• 10. 딥노이드 딥스파인 (2등급 허가): X-ray 요추 압박골절 검출보조
2018년
2019년

•Some polyps were detected with only partial appearance.

•detected in both normal and insufficient light condition.

•detected under both qualified and suboptimal bowel preparations.
ARTICLESNATURE BIOMEDICAL ENGINEERING
from patients who underwent colonoscopy examinations up to 2
years later.
Also, we demonstrated high per-image-sensitivity (94.38%
and 91.64%) in both the image (datasetA) and video (datasetC)
analyses. DatasetsA and C included large variations of polyp mor-
phology and image quality (Fig. 3, Supplementary Figs. 2–5 and
Supplementary Videos 3 and 4). For images with only flat and iso-
datasets are often small and do not represent the full range of colon
conditions encountered in the clinical setting, and there are often
discrepancies in the reporting of clinical metrics of success such as
sensitivity and specificity19,20,26
. Compared with other metrics such
as precision, we believe that sensitivity and specificity are the most
appropriate metrics for the evaluation of algorithm performance
because of their independence on the ratio of positive to negative
Fig. 3 | Examples of polyp detection for datasetsA and C. Polyps of different morphology, including flat isochromatic polyps (left), dome-shaped polyps
(second from left, middle), pedunculated polyps (second from right) and sessile serrated adenomatous polyps (right), were detected by the algorithm
(as indicated by the green tags in the bottom set of images) in both normal and insufficient light conditions, under both qualified and suboptimal bowel
preparations. Some polyps were detected with only partial appearance (middle, second from right). See Supplementary Figs 2–6 for additional examples.
ﬂat isochromatic polyps dome-shaped polyps sessile serrated adenomatous polypspedunculated polyps
대장내시경에서의 용종 발견 보조 인공지능

Endoscopy
Prosp
The
trial
syste
ADR
the S
patie
to Fe
bowe
given
defin
high
infla
Figure 1 Deep learning architecture.The detection algorithm is a deep
convolutional neural network (CNN) based on SegNet architecture. Data
flow is from left to right: a colonoscopy image is sequentially warped
into a binary image, with 1 representing polyp pixels and 0 representing
no polyp in probability a map.This is then displayed, as showed in the
output, with a hollow tracing box on the CADe monitor.
1Wang P, et al. Gut 2019;0:1–7. doi:10.1136/gutjnl-2018-317500
Endoscopy
ORIGINAL ARTICLE
Real-time automatic detection system increases
colonoscopic polyp and adenoma detection rates: a
prospective randomised controlled study
Pu Wang, 1
Tyler M Berzin, 2
Jeremy Romek Glissen Brown, 2
Shishira Bharadwaj,2
Aymeric Becq,2
Xun Xiao,1
Peixi Liu,1
Liangping Li,1
Yan Song,1
Di Zhang,1
Yi Li,1
Guangre Xu,1
Mengtian Tu,1
Xiaogang Liu 1
To cite: Wang P, Berzin TM,
Glissen Brown JR, et al. Gut
Epub ahead of print: [please
include Day Month Year].
doi:10.1136/
gutjnl-2018-317500
► Additional material is
published online only.To view
please visit the journal online
(http://dx.doi.org/10.1136/
gutjnl-2018-317500).
1
Department of
Gastroenterology, Sichuan
Sichuan Provincial People’s
Hospital, Chengdu, China
2
Center for Advanced
Endoscopy, Beth Israel
Deaconess Medical Center and
Massachusetts, USA
Correspondence to
Xiaogang Liu, Department
of Gastroenterology Sichuan
and Sichuan Provincial People’s
Hospital, Chengdu, China;
Gary.samsph@gmail.com
Received 30 August 2018
Revised 4 February 2019
Accepted 13 February 2019
© Author(s) (or their
employer(s)) 2019. Re-use
permitted under CC BY-NC. No
commercial re-use. See rights
and permissions. Published
by BMJ.
ABSTRACT
Objective The effect of colonoscopy on colorectal
cancer mortality is limited by several factors, among them
a certain miss rate, leading to limited adenoma detection
rates (ADRs).We investigated the effect of an automatic
polyp detection system based on deep learning on polyp
detection rate and ADR.
Design In an open, non-blinded trial, consecutive
patients were prospectively randomised to undergo
diagnostic colonoscopy with or without assistance of a
real-time automatic polyp detection system providing
a simultaneous visual notice and sound alarm on polyp
detection.The primary outcome was ADR.
Results Of 1058 patients included, 536 were
randomised to standard colonoscopy, and 522 were
randomised to colonoscopy with computer-aided
diagnosis.The artificial intelligence (AI) system
significantly increased ADR (29.1%vs20.3%, p0.001)
and the mean number of adenomas per patient
(0.53vs0.31, p0.001).This was due to a higher number
of diminutive adenomas found (185vs102; p0.001),
while there was no statistical difference in larger
adenomas (77vs58, p=0.075). In addition, the number
of hyperplastic polyps was also significantly increased
(114vs52, p0.001).
Conclusions In a low prevalent ADR population, an
automatic polyp detection system during colonoscopy
resulted in a significant increase in the number of
diminutive adenomas detected, as well as an increase in
the rate of hyperplastic polyps.The cost–benefit ratio of
such effects has to be determined further.
Trial registration number ChiCTR-DDD-17012221;
Results.
INTRODUCTION
Colorectal cancer (CRC) is the second and third-
leading causes of cancer-related deaths in men and
women respectively.1
Colonoscopy is the gold stan-
dard for screening CRC.2 3
Screening colonoscopy
has allowed for a reduction in the incidence and
mortality of CRC via the detection and removal
of adenomatous polyps.4–8
Additionally, there is
evidence that with each 1.0% increase in adenoma
detection rate (ADR), there is an associated 3.0%
decrease in the risk of interval CRC.9 10
However,
polyps can be missed, with reported miss rates of
up to 27% due to both polyp and operator charac-
teristics.11 12
Unrecognised polyps within the visual field is
an important problem to address.11
Several studies
have shown that assistance by a second observer
increases the polyp detection rate (PDR), but such a
strategy remains controversial in terms of increasing
the ADR.13–15
Ideally, a real-time automatic polyp detec-
tion system, with performance close to that of
expert endoscopists, could assist the endosco-
pist in detecting lesions that might correspond to
adenomas in a more consistent and reliable way
Significance of this study
What is already known on this subject?
► Colorectal adenoma detection rate (ADR)
is regarded as a main quality indicator of
(screening) colonoscopy and has been shown
to correlate with interval cancers. Reducing
adenoma miss rates by increasing ADR has
been a goal of many studies focused on
imaging techniques and mechanical methods.
► Artificial intelligence has been recently
introduced for polyp and adenoma detection
as well as differentiation and has shown
promising results in preliminary studies.
What are the new findings?
► This represents the first prospective randomised
controlled trial examining an automatic polyp
detection during colonoscopy and shows an
increase of ADR by 50%, from 20% to 30%.
► This effect was mainly due to a higher rate of
small adenomas found.
► The detection rate of hyperplastic polyps was
also significantly increased.
How might it impact on clinical practice in the
foreseeable future?
► Automatic polyp and adenoma detection could
be the future of diagnostic colonoscopy in order
to achieve stable high adenoma detection rates.
► However, the effect on ultimate outcome is
still unclear, and further improvements such as
polyp differentiation have to be implemented.
on17March2019byguest.Protectedbycopyright.http://gut.bmj.com/Gut:firstpublishedas10.1136/gutjnl-2018-317500on27February2019.Downloadedfrom
• 이 인공지능이 polyp detection과 adenoma detection에 실제 도움을 줌

• Prospective RCT로 증명 (n=1058; standard=536, CAD=522)

• 인공지능의 도움으로

• adenoma detection rate 증가: 29.1% vs 20.3% (p0.001)

• 환자당 detection 된 adenoma 수의 증가: 0.53 vs 0.31 (p0.001)

• 이는 주로 diminutive adenomas 를 더 많이 찾았기 때문: 185 vs 102 (p0.001)

• hyperplastic polyps의 수 증가: 114 vs 52 (p0.001)

글로벌 디지털 헬스케어 산업 및 규제 동향

글로벌 디지털 헬스케어 산업 및 규제 동향

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à 글로벌 디지털 헬스케어 산업 및 규제 동향

Similaire à 글로벌 디지털 헬스케어 산업 및 규제 동향 (20)

Plus de Yoon Sup Choi

Plus de Yoon Sup Choi (12)

Dernier

Dernier (20)

글로벌 디지털 헬스케어 산업 및 규제 동향