Knowledge Engineering from Big Data in Oncology
2. 2
© MAASTRO 2015
Disclosures
Research collaborations incl. funding / honoraria etc.
– Varian (VATE, chinaCAT, euroCAT), Siemens (euroCAT), Sohard (SeDI, CloudAtlas), Mirada Medical (CloudAtlas), Philips (EURECA, TraIT, SWIFT-RT), Xerox (EURECA), De Praktijkindex (DLRA)
Public research funding
– Radiomics (USA-NIH/U01CA143062), euroCAT (EU-Interreg), duCAT (NL-STW), EURECA (EU-FP7), SeDI & CloudAtlas (EU-EUREKA), TraIT (NL-CTMM), DLRA (NL-NVRO)
Spin-offs and commercial ventures
– MAASTRO Innovations B.V. (CSO)
– Various patents on medical machine learning
6. 6
Testing predictions by MDs
Lung cancer
2 year survival
158 patients
5 MDs
Prospective
AUC: 0.56
Oberije et al.
Kruger & Dunning 1999
Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. J Pers Soc Psychol
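The AUC quoted above is the area under the ROC curve: the chance that a randomly chosen patient who survived was given a higher predicted survival probability than a randomly chosen patient who did not, with 0.5 meaning no better than guessing. A minimal sketch of how such a number can be computed, using made-up predictions and outcomes (not the data of Oberije et al.); the function name and the values are assumptions for illustration only:

```python
def auc(scores, outcomes):
    """Rank-based AUC: share of (survivor, non-survivor) pairs ordered correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Hypothetical predicted 2-year survival probabilities and outcomes (1 = survived).
predicted = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
survived = [1, 0, 1, 1, 0, 0]
example_auc = auc(predicted, survived)
```

In this toy example 7 of the 9 survivor/non-survivor pairs are ordered correctly, so the AUC is about 0.78.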
7. 7
The doctor is drowning
• Explosion of data
• Explosion of decisions
• Explosion of ‘evidence’*
• 3 % in trials, bias
• Sharp knife
*2010: 1574 & 1354 articles on lung cancer & radiotherapy = 7.5 per day
Half-life of knowledge estimated at 7 years (in young students)
Source: J Clin Oncol 2010;28:4268
Source: JMI 2012 Friedman, Rigby
8. 8
Main Opportunity of Big Data Driven Medicine: Rapid Learning Health Care / Precision Medicine / Predict outcome in an individual
"In [..] rapid-learning [..] data routinely generated through patient care and clinical research feed into an ever-growing [..] set of coordinated databases."
J Clin Oncol 2010;28:4268
"[..] rapid learning [..] where we can learn from each patient to guide practice, is [..] crucial to guide rational health policy and to contain costs [..]."
Lancet Oncol 2011;12:933
Examples:
Radiotherapy CAT (www.eurocat.info)
ASCO’s CancerLinQ
9. 9
Why would we want to predict outcome in an individual patient?
If you can’t predict outcomes
Doctor/Patient perspective
• you can’t inform and involve your patient properly
• you might not make the right choice between treatment A and treatment B
Quality perspective
• you can’t know if your treatments are giving the predicted outcome
Innovation perspective
• you can’t determine for which patient (group) we need to innovate
Source: www.predictcancer.org (MAASTRO)
Source: www.lifemath.net (MGH)
10. 10
Big data in Oncology
Source: Cancer Research UK
Source: Institute for Health Technology Transformation
11. 11
Main challenge of using Big Data and Outcomes Research in Oncology
• You need to learn from other patients to predict the outcome of a new patient
• These data are spread out over 100k hospitals
• So we need to share. Challenges:
– Administrative (I don’t have the time)
– Political (I don’t want to)
– Ethical (I am not allowed)
– Technical (I can’t)
"[..] the problem is not really technical […]. Rather, the problems are ethical, political, and administrative."
Lancet Oncol 2011;12:933
12. 12
The ‘standard’ approach
• Sharing standardized, highly curated data from clinical research programs
• Very useful, but only 3% of patients (if that)
• Worries about privacy, loss of control, limited number of features, limited reusability, a lot of work
13. 13
A different approach
If sharing is the problem: Don’t share the data
If you can’t bring the data to the learning application
You have to bring the learning application to the data
Consequences
• The learning application has to be distributed
• The data has to be readable by an application (i.e. not a human)
• Solution: Sharing standardized highly curated research data
• Solution: Not-sharing non-standardized non-curated clinical data
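The idea of bringing the learning application to the data can be sketched as follows. This is an illustrative toy under stated assumptions, not the euroCAT implementation: each hypothetical site computes the gradient of a logistic-regression loss on its own patients and returns only that aggregate, so patient rows never leave the site. The site names, data, and function names are invented for the example.

```python
import math

def local_gradient(weights, rows):
    """Runs at a site: summed logistic-loss gradient and number of rows.

    Only these aggregates are sent out, never the patient rows themselves.
    """
    grad = [0.0] * len(weights)
    for features, outcome in rows:
        z = sum(w * x for w, x in zip(weights, features))
        p = 1.0 / (1.0 + math.exp(-z))
        for j, x in enumerate(features):
            grad[j] += (p - outcome) * x
    return grad, len(rows)

def train(sites, n_features, steps=2000, lr=0.5):
    """Runs at the coordinator: combine site gradients and update the model."""
    weights = [0.0] * n_features
    for _ in range(steps):
        total = [0.0] * n_features
        n = 0
        for rows in sites.values():
            grad, count = local_gradient(weights, rows)
            total = [t + g for t, g in zip(total, grad)]
            n += count
        weights = [w - lr * t / n for w, t in zip(weights, total)]
    return weights

# Hypothetical data: ([intercept, one scaled covariate], outcome).
sites = {
    "site_a": [([1.0, 0.2], 0), ([1.0, 0.9], 1), ([1.0, 0.4], 1)],
    "site_b": [([1.0, 0.1], 0), ([1.0, 0.8], 1), ([1.0, 0.7], 0), ([1.0, 0.3], 0)],
}
distributed = train(sites, n_features=2)

# For comparison only: the same training with all rows pooled in one place.
pooled = train({"all": [r for rows in sites.values() for r in rows]}, n_features=2)
```

Because the summed gradient over sites equals the gradient over the pooled rows, `distributed` and `pooled` agree up to floating-point rounding. A real deployment would also need the data at each site to be machine readable in a common format, as the consequences above note.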
15. 15
euroCAT, duCAT, chinaCAT, ozCAT, VATE, ukCAT, dkCAT, worldCAT, BIONIC Network
[Figure: world map of CAT network partners (map from cgadvertising.com). Legend: active or funded CAT partners (19); prospective centers; industry partners; clinical / academic partners.]
16. 16
Does it work? euroCAT’s example
• Distributed learning gives the same model as centralized learning (ADMM method, Boyd, Stanford)
• Distributed learning is better than learning on single-center data
• Distributed: 550 iterations, two hours (centralized: < 1 min)
Learn in                  Validate in      AUC
Aachen (n=7)              Liège (n=186)    0.61
Eindhoven (n=32)          Liège (n=186)    0.72
Hasselt (n=45)            Liège (n=186)    0.68
Maastricht (n=52)         Liège (n=186)    0.75
All 4 together (n=136)    Liège (n=186)    0.77
All 5 together (n=322)    World (n=inf)    ?
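The slide names ADMM as the method behind "distributed = centralized". As a hedged illustration of why that can hold, here is a toy consensus ADMM in the scaled form described by Boyd et al. It deliberately swaps in a one-parameter least-squares problem so that each local step has a closed form; it is not the model or code used in euroCAT, and the data, `rho`, and function name are assumptions.

```python
def admm_consensus(sites, rho=10.0, iterations=200):
    """Consensus ADMM for scalar least squares.

    Minimises sum over sites of 0.5 * sum_k (a_k * x - b_k)^2.
    Each site keeps a local estimate x_i and a scaled dual u_i; only
    x_i + u_i is sent to the coordinator, which holds the consensus z.
    """
    # Sufficient statistics, computed once and kept locally at each site.
    stats = [(sum(a * a for a, _ in rows), sum(a * b for a, b in rows))
             for rows in sites]
    z = 0.0
    u = [0.0] * len(sites)
    for _ in range(iterations):
        # Local step: closed-form minimiser of f_i(x) + (rho/2) * (x - z + u_i)^2.
        x = [(sab + rho * (z - ui)) / (saa + rho)
             for (saa, sab), ui in zip(stats, u)]
        # Coordinator step: average the messages x_i + u_i.
        z = sum(xi + ui for xi, ui in zip(x, u)) / len(sites)
        # Dual step: accumulate each site's disagreement with the consensus.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z

# Hypothetical (a, b) observations held at three sites.
sites = [
    [(1.0, 2.1), (2.0, 3.9)],
    [(1.0, 1.8), (3.0, 6.3), (2.0, 4.2)],
    [(4.0, 7.6)],
]
distributed = admm_consensus(sites)

# Centralized reference: ordinary least squares on the pooled rows.
pooled_rows = [row for rows in sites for row in rows]
centralized = (sum(a * b for a, b in pooled_rows)
               / sum(a * a for a, _ in pooled_rows))
```

In this toy the consensus value converges to the pooled least-squares solution. The number of iterations needed depends on the problem and on `rho`, which is consistent with the slide's point that the distributed run takes many more iterations than the centralized fit.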
17. 17
Summary
Knowledge Engineering from Big Data in Oncology
• The challenge of Big Data in oncology
– Is not the size but the distribution
– Is imaging and not genomics (for now)
• The aim of Knowledge Engineering is
– To predict outcomes better via prediction models
– To update these models continuously in rapid learning
18. 18
Acknowledgements
• Varian, Palo Alto, CA, USA
• Siemens, Malvern, PA, USA
• RTOG, Philadelphia, PA, USA
• MAASTRO, Maastricht, Netherlands
• Policlinico Gemelli, Roma, Italy
• UH Ghent, Belgium
• Catharina Zkh Eindhoven, Netherlands
• UZ Leuven, Belgium
• Radboud, Nijmegen, Netherlands
• University of Sydney, Australia
• Liverpool and Macarthur CC, Australia
• CHU Liege, Belgium
• Uniklinikum Aachen, Germany
• LOC Genk/Hasselt, Belgium
• Princess Margaret Hospital, Canada
• The Christie, Manchester, UK
• UH Leuven, Belgium
• State Hospital, Rovigo, Italy
• Illawarra Shoalhaven CC, Australia
• Fudan Cancer Center, Shanghai, China
More info on: www.predictcancer.org www.cancerdata.org
www.eurocat.info www.mistir.info
19. Andre Dekker, PhD
Medical Physicist
MAASTRO Clinic
Thank you for your attention
More info on:
www.eurocat.info
www.predictcancer.org
www.cancerdata.org
www.mistir.info
www.maastro.nl