AI FOR PRECISION MEDICINE
PRAGMATIC PRECLINICAL DATA SCIENCE
Paul Agapow <p.agapow@imperial.ac.uk>

Data Science Institute, Imperial College London
Pharma AI & IoT (London, July 2018)
MLMH2018 - KDD Workshop on Machine Learning for Medicine and Healthcare
August 20, 2018, London, UK
Topics of interest:
• Data Standards for Translational Medicine Informatics
• Analysis of large scale electronic health records or patient-generated health data records
• Visualisation of complex and dynamic biomedical networks
• Disease Subtype Discovery for Precision Medicine
• Interpretable Machine Learning for biomedicine and healthcare
• Deep learning for biomedicine
Important Dates
• Submission deadline: May 25, 2018
• Notification of acceptance: June 8, 2018
• Workshop date: August 8, 2018
Meet our Panel!
T. Roy (PhD), University of Southampton, UK
A. Teredesai (PhD), University of Washington, Tacoma
S. Wagers (MD), CEO/Founder BioSci Consulting, Belgium
Join us during the KDD Health Day!
Win IBM $1,000 travel grant for best selected student paper!
Follow us!
https://mlmhworkshop.github.io/mlmh-2018
Twitter:
Contact us:
mlmhworkshop@googlegroups.com
Organizers:
M. Saqi, Imperial College London, UK
P. Chakraborty, IBM Research, USA
I. Balaur, EISBM, Lyon, France
P. Agapow, Imperial College London, UK
S. Wagers, BioSci Consulting, Belgium
P.Y. S. Hsueh, IBM Research, USA
F. Rahmanian, Geneia, USA
M.A. Ahmad, Kensci Inc. and University of Washington - Tacoma, USA
BACKGROUND & DISCLOSURE
➤ Data Science Institute (Imperial
College London)
➤ Novel & advanced computation over
large rich biomedical datasets for
translational research & precision
medicine
➤ Patient subtype discovery &
mechanistic insight
➤ Scientific Advisor to PangaeaData.ai
THE “AI WINTER”
THE DATA PROBLEM
“Nice training set. Where’s your data?”
- An Analyst
BIG BIOMEDICAL DATA USUALLY ISN’T
➤ Average trial size on
ClinicalTrials.gov < 100
➤ Average #samples per GEO
dataset < 100
➤ Average GWAS cohort size
~9000 (median ~2500)
➤ 1,064 ICU admissions for flu in
UK 2016/2017 season
VS
➤ Curse of dimensionality
➤ Deep learning requires
“thousands” of samples for
training (at least p²?)
➤ GWAS needs 3K+ for large
effects, 10K or more for small
effects …
➤ Sub-populations & rare diseases
will be smaller
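The curse of dimensionality mentioned above can be illustrated with a small simulation. This is a minimal sketch, not taken from the talk: it assumes uniformly random data, and the function name `distance_contrast` is hypothetical. It shows that with a fixed, small number of samples, the nearest and farthest neighbours of a point become almost equally distant as the number of features grows, which is one reason small cohorts with many measured variables are hard to learn from.

```python
import math
import random


def distance_contrast(n_samples: int, n_features: int, seed: int = 0) -> float:
    """Relative gap between the farthest and nearest neighbour of one sample.

    Illustrative assumption: every feature is drawn uniformly from [0, 1).
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(n_features)] for _ in range(n_samples)]
    query, rest = points[0], points[1:]
    dists = [math.dist(query, p) for p in rest]
    return (max(dists) - min(dists)) / min(dists)


# Same cohort size (100 samples), increasing number of features.
for n_features in (2, 20, 200, 2000):
    print(n_features, round(distance_contrast(100, n_features), 3))
```

The printed contrast shrinks as the feature count rises, so distance-based methods lose their ability to tell "similar" from "dissimilar" samples unless the sample size grows as well.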
MAKE BIGGER DATASETS
➤ “Allow” reuse & combining not “build”
➤ FAIR
➤ Use standards like CDISC, HPO …
➤ eTRIKS
➤ Data intensive translational research
➤ Sharing data (standards, starter kit)
➤ Data catalog of ~70 studies
➤ EHDN / EHDEN
➤ European Health Data and Evidence
Network
➤ Harmonised model for accessing health data
WE NEED MORE ETL
➤ Too damn slow and expensive
➤ Tools are poor
➤ Humans are inconsistent
➤ Standards are complex
➤ Harmonisation by ML is the only
answer
➤ Learn from data examples
➤ Corrected by humans
➤ “Discover” schema if need be
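The "learn from examples, corrected by humans" loop above can be sketched in a few lines. This is an illustration only: it swaps plain string similarity (Python's `difflib`) in for a trained model, and the target vocabulary `STANDARD_FIELDS` and the function `suggest_mapping` are hypothetical names, not part of any standard or of the tooling described in the talk.

```python
from difflib import get_close_matches

# Hypothetical target vocabulary; a real project would take field names
# from a standard such as CDISC rather than this hand-written list.
STANDARD_FIELDS = ["subject_id", "sex", "age_years", "height_cm", "weight_kg"]


def suggest_mapping(source_columns, corrections=None, cutoff=0.5):
    """Propose source-column -> standard-field mappings.

    Human corrections always take precedence over automatic suggestions.
    A value of None means no candidate passed the cutoff and a person
    needs to decide.
    """
    corrections = corrections or {}
    mapping = {}
    for column in source_columns:
        if column in corrections:
            mapping[column] = corrections[column]
            continue
        normalised = column.strip().lower().replace(" ", "_")
        matches = get_close_matches(normalised, STANDARD_FIELDS, n=1, cutoff=cutoff)
        mapping[column] = matches[0] if matches else None
    return mapping


# First pass is automatic; a curator then fixes what the matcher missed.
first_pass = suggest_mapping(["Subject ID", "Sex", "wt"])
second_pass = suggest_mapping(["Subject ID", "Sex", "wt"], corrections={"wt": "weight_kg"})
print(first_pass)
print(second_pass)
```

In a fuller system the accumulated corrections would become training examples for the matcher, so each curated study makes the next one cheaper.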
[Figure: pipeline diagram with numbered steps 1-4 for text data and tabular data; labels: data extractor, pre-classified data and master data mappings. From PangaeaData.AI]
§ Frequent Pattern Mining-Growth Algorithms to determine schema association rules
§ Word2Vec to condense information of text sequence and context
§ Graph-Theoretical Algorithms to determine logical sequences, followers, associations, matchings
§ Decision Trees, Neural Nets and Support Vector Machines for training the model
§ Custom Algorithms to prepare data and check data quality
EXAMPLE: U-BIOPRED
➤ Unbiased BIOmarkers in PREDiction of
respiratory disease outcomes
➤ 900+ patients, 16 clinical centres +
other studies combined via standards
➤ Outputs:
➤ Analyses largely on small subsets
(~100)
➤ Subtyping of asthmatics
➤ 40+ academic publications
THE METHOD PROBLEM
THE REALITY OF DEEP LEARNING
➤ Deep learning is still in progress
➤ Usually insufficient (good labelled)
data
➤ Interpretability issues
➤ Legal & ethical issues, federated
analysis
➤ Tells you what you’ve told it
➤ Bias towards images
➤ For now …
DEEP LEARNING WITH LESS DATA
➤ Pre-training (data without labels)
➤ Initial training with mediocre data
➤ Adapt
➤ Transfer learning (labels / output changes)
➤ Domain adaptation (data / input changes)
➤ Data augmentation
➤ Interpretability coming slowly (LIME)
Dieleman 2015
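Data augmentation, listed above, can be sketched without any deep learning library. This is a minimal illustration under one explicit assumption: the label of an image does not change when the image is rotated or mirrored. That holds for some imaging tasks and not for others. The function names are hypothetical.

```python
def rotate90(image):
    """Rotate a 2D list-of-lists clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror a 2D list-of-lists left to right."""
    return [row[::-1] for row in image]


def augment(image):
    """Return the 8 rotation/flip variants of one image.

    Assumes the training label is invariant to these transformations,
    so every variant can reuse the original label.
    """
    variants = []
    current = image
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


tiny_image = [[1, 2], [3, 4]]
for variant in augment(tiny_image):
    print(variant)
```

One labelled sample becomes eight, which helps a model learn the invariance but does not add new biological information.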
“80% of the time, you can get 80% of the way
with a simple decision tree.”
- Doug McIlwraith (paraphrased)
EXAMPLE: TEXT CLASSIFICATION FOR SYSTEMATIC REVIEWS
➤ Aim: find similar or related
publications within corpus
➤ Actual aim: find which
method of text
classification is
“best” (Validation)
➤ Data: 15 Drug Control
Reviews & Neuropathic
Pain dataset
➤ Classify with random forest,
naive Bayes, SVM & CNNs
Conclusion
Dataset                      WSS    Classifier
ACE Inhibitors               0.26   SVM
ADHD                         0.35   MNB
Antihistamines               0.19   MNB
Atypical Antipsychotics      0.12   SVM
Beta Blockers                0.13   SVM
CCB                          0.21   SVM
Estrogen                     0.25   SVM
Neuropathic Pain             0.61   CNN
NSAIDS                       0.14   SVM
Opioids                      0.23   SVM
Oral Hypoglycemics           0.21   SVM
PPI                          0.17   SVM
Skeletal Muscle Relaxants    0.21   SVM
Statins                      0.19   SVM
Triptans                     0.22   SVM
Urinary Incontinence         0.25   SVM
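The slides do not define WSS. Assuming it stands for "work saved over sampling", the usual screening metric in systematic-review classification, it can be computed from a confusion matrix as below. The function name and the example counts are hypothetical and are not results from this study.

```python
def work_saved_over_sampling(tn: int, fn: int, tp: int, fp: int) -> float:
    """Work saved over sampling at the recall actually achieved.

    WSS = (TN + FN) / N - (1 - recall)

    (TN + FN) / N is the fraction of documents a reviewer no longer has to
    read; (1 - recall) is the saving random sampling would give at the same
    recall level.
    """
    n = tn + fn + tp + fp
    recall = tp / (tp + fn)
    return (tn + fn) / n - (1.0 - recall)


# Hypothetical screening run: 1000 abstracts, 100 truly relevant.
print(round(work_saved_over_sampling(tn=600, fn=5, tp=95, fp=300), 3))
```

Read this way, a WSS of 0.25 would mean a reviewer skips roughly a quarter more of the corpus than random sampling would allow at the same recall.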
THE CONTEXT PROBLEM
OMICS IS ONLY ONE TYPE OF INFORMATION
➤ We don’t have enough data
➤ Methods may not work
➤ Results may be artefactual
➤ But there is other information …
EHR
interactome
devices
RWE
social media
chemistry
evolution / phylogeny
etc.
MULTI-OMICS OR INTEGRATED ANALYSIS
➤ Why?
➤ One way to get more data
➤ Statistical power
➤ Multiple defects required to drive
endogenous disease
➤ Multiple “views” on condition
➤ How?
➤ Cluster / network individual data
layers
➤ Fuse together for consensus
Nemutlu 2012
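The "cluster each layer, then fuse for consensus" idea can be sketched with a co-association matrix. This is a deliberately simple stand-in, not SNF, iCluster or any method named in the talk: each data layer votes on whether two patients belong together, and patients linked by a majority of layers form a consensus cluster. All names and the example labels are hypothetical.

```python
def co_association(labelings):
    """Fraction of layers in which each pair of samples shares a cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Group samples connected by co-association above the threshold."""
    n = len(matrix)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Hypothetical cluster labels for 5 patients from three separate data layers.
layers = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
print(consensus_clusters(co_association(layers)))
```

Cluster label values differ between layers, which is why the fusion works on pairwise agreement rather than on the labels themselves.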
EXAMPLE: ASTHMA ENDOTYPING
➤ Asthma is highly heterogeneous
➤ Symptoms
➤ Response to interventions
➤ Multiple mechanisms
➤ 3 or 4 or 7 clusters …
➤ Carefully curated data from U-BIOPRED (~100)
➤ Multi-method, multi-data analysis
Wiki Commons
ASTHMA ENDOTYPES
➤ Use a variety of clustering approaches
over asthma cohort ‘omics data
(Bayesian, spectral, iCluster)
➤ Use multi-omics approaches (SNF,
NNMF)
➤ Assess agreement / coherence
➤ Validate in pathways, in other cohorts
and in other data types
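One simple way to "assess agreement" between two clustering results is the Rand index: the fraction of patient pairs that both methods treat the same way, either together or apart. This is a sketch of one possible measure, not necessarily the one used in the study, and the function name is hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two clusterings agree.

    A pair counts as agreement when both clusterings put the two samples
    in the same cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Identical partitions with swapped label names agree fully.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
# Crossed partitions agree on only a minority of pairs.
print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

The unadjusted index does not correct for chance agreement, so with cohorts of around 100 patients a chance-corrected variant is usually the safer choice for reporting.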
CONCLUSIONS
➤ Big biomedical data is often not big, but we can make it bigger
➤ Sometimes [Big | Deep | Advanced] approaches are useful, sometimes not: choose
wisely
➤ Contextual information is vital, both for primary analysis and for validation
THANKS
➤ Data Science Institute, ICL
➤ Fayzal Ghantiwala (Bloomberg)
➤ Nazanin Zounemat Kermani (ICL)
➤ Mansoor Saqi (ICL / KCL)
➤ Romain Guédon (Nantes)
➤ Yike Guo (ICL)
➤ eTRIKS consortium
➤ U-BIOPRED consortium

College Call Girls Dehradun Kavya 🔝 7001305949 🔝 📍 Independent Escort Service...
 
Call Girl Gurgaon Saloni 9711199012 Independent Escort Service Gurgaon
Call Girl Gurgaon Saloni 9711199012 Independent Escort Service GurgaonCall Girl Gurgaon Saloni 9711199012 Independent Escort Service Gurgaon
Call Girl Gurgaon Saloni 9711199012 Independent Escort Service Gurgaon
 
Call Girls Hyderabad Kirti 9907093804 Independent Escort Service Hyderabad
Call Girls Hyderabad Kirti 9907093804 Independent Escort Service HyderabadCall Girls Hyderabad Kirti 9907093804 Independent Escort Service Hyderabad
Call Girls Hyderabad Kirti 9907093804 Independent Escort Service Hyderabad
 
Call Girls Service Bommasandra - Call 7001305949 Rs-3500 with A/C Room Cash o...
Call Girls Service Bommasandra - Call 7001305949 Rs-3500 with A/C Room Cash o...Call Girls Service Bommasandra - Call 7001305949 Rs-3500 with A/C Room Cash o...
Call Girls Service Bommasandra - Call 7001305949 Rs-3500 with A/C Room Cash o...
 
Call Girl Lucknow Gauri 🔝 8923113531 🔝 🎶 Independent Escort Service Lucknow
Call Girl Lucknow Gauri 🔝 8923113531  🔝 🎶 Independent Escort Service LucknowCall Girl Lucknow Gauri 🔝 8923113531  🔝 🎶 Independent Escort Service Lucknow
Call Girl Lucknow Gauri 🔝 8923113531 🔝 🎶 Independent Escort Service Lucknow
 
Call Girls Service Chandigarh Grishma ❤️🍑 9907093804 👄🫦 Independent Escort Se...
Call Girls Service Chandigarh Grishma ❤️🍑 9907093804 👄🫦 Independent Escort Se...Call Girls Service Chandigarh Grishma ❤️🍑 9907093804 👄🫦 Independent Escort Se...
Call Girls Service Chandigarh Grishma ❤️🍑 9907093804 👄🫦 Independent Escort Se...
 
Russian Call Girls South Delhi 9711199171 discount on your booking
Russian Call Girls South Delhi 9711199171 discount on your bookingRussian Call Girls South Delhi 9711199171 discount on your booking
Russian Call Girls South Delhi 9711199171 discount on your booking
 
Russian Call Girls in Chandigarh Ojaswi ❤️🍑 9907093804 👄🫦 Independent Escort ...
Russian Call Girls in Chandigarh Ojaswi ❤️🍑 9907093804 👄🫦 Independent Escort ...Russian Call Girls in Chandigarh Ojaswi ❤️🍑 9907093804 👄🫦 Independent Escort ...
Russian Call Girls in Chandigarh Ojaswi ❤️🍑 9907093804 👄🫦 Independent Escort ...
 
Housewife Call Girls Nandini Layout - Phone No 7001305949 For Ultimate Sexual...
Housewife Call Girls Nandini Layout - Phone No 7001305949 For Ultimate Sexual...Housewife Call Girls Nandini Layout - Phone No 7001305949 For Ultimate Sexual...
Housewife Call Girls Nandini Layout - Phone No 7001305949 For Ultimate Sexual...
 
Call Girls LB Nagar 7001305949 all area service COD available Any Time
Call Girls LB Nagar 7001305949 all area service COD available Any TimeCall Girls LB Nagar 7001305949 all area service COD available Any Time
Call Girls LB Nagar 7001305949 all area service COD available Any Time
 
Experience learning - lessons from 25 years of ATACC - Mark Forrest and Halde...
Experience learning - lessons from 25 years of ATACC - Mark Forrest and Halde...Experience learning - lessons from 25 years of ATACC - Mark Forrest and Halde...
Experience learning - lessons from 25 years of ATACC - Mark Forrest and Halde...
 
Call Girls Guwahati Aaradhya 👉 7001305949👈 🎶 Independent Escort Service Guwahati
Call Girls Guwahati Aaradhya 👉 7001305949👈 🎶 Independent Escort Service GuwahatiCall Girls Guwahati Aaradhya 👉 7001305949👈 🎶 Independent Escort Service Guwahati
Call Girls Guwahati Aaradhya 👉 7001305949👈 🎶 Independent Escort Service Guwahati
 
Hi,Fi Call Girl In Whitefield - [ Cash on Delivery ] Contact 7001305949 Escor...
Hi,Fi Call Girl In Whitefield - [ Cash on Delivery ] Contact 7001305949 Escor...Hi,Fi Call Girl In Whitefield - [ Cash on Delivery ] Contact 7001305949 Escor...
Hi,Fi Call Girl In Whitefield - [ Cash on Delivery ] Contact 7001305949 Escor...
 
Call Girls Hyderabad Krisha 9907093804 Independent Escort Service Hyderabad
Call Girls Hyderabad Krisha 9907093804 Independent Escort Service HyderabadCall Girls Hyderabad Krisha 9907093804 Independent Escort Service Hyderabad
Call Girls Hyderabad Krisha 9907093804 Independent Escort Service Hyderabad
 
Basics of Anatomy- Language of Anatomy.pptx
Basics of Anatomy- Language of Anatomy.pptxBasics of Anatomy- Language of Anatomy.pptx
Basics of Anatomy- Language of Anatomy.pptx
 
Gurgaon iffco chowk 🔝 Call Girls Service 🔝 ( 8264348440 ) unlimited hard sex ...
Gurgaon iffco chowk 🔝 Call Girls Service 🔝 ( 8264348440 ) unlimited hard sex ...Gurgaon iffco chowk 🔝 Call Girls Service 🔝 ( 8264348440 ) unlimited hard sex ...
Gurgaon iffco chowk 🔝 Call Girls Service 🔝 ( 8264348440 ) unlimited hard sex ...
 
Model Call Girl in Subhash Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Subhash Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Subhash Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Subhash Nagar Delhi reach out to us at 🔝9953056974🔝
 

AI for Precision Medicine (Pragmatic preclinical data science)

  • 1. AI FOR PRECISION MEDICINE PRAGMATIC PRECLINICAL DATA SCIENCE Paul Agapow <p.agapow@imperial.ac.uk>
 Data Science Institute, Imperial College London Pharma AI & IoT (London, July 2018)
  • 2. MLMH2018 - KDD Workshop on Machine Learning for Medicine and Healthcare August 20, 2018, London, UK Topics of interest: •  Data Standards for Translational Medicine Informatics •  Analysis of large scale electronic health records or patient-generated health data records •  Visualisation of complex and dynamic biomedical networks •  Disease Subtype Discovery for Precision Medicine •  Interpretable Machine Learning for biomedicine and healthcare •  Deep learning for biomedicine Important Dates •  Submission deadline: May 25, 2018 •  Notification accept: June 8, 2018 •  Workshop date: August 8, 2018 Meet our Panel! T. Roy (PhD), University of Southampton, UK A. Teredesai (PhD), University of Washington, Tacoma S. Wagers (MD), CEO/Founder BioSci Consulting, Belgium Join us during the KDD Health Day! Win IBM $1,000 travel grant for best selected student paper! Follow us! https://mlmhworkshop.github.io/mlmh-2018 Contact us: mlmhworkshop@googlegroups.com Organizers: M. Saqi, Imperial College London, UK P. Chakraborty, IBM Research, USA I. Balaur, EISBM, Lyon, France P. Agapow, Imperial College London, UK S. Wagers, BioSci Consulting, Belgium P.Y. S. Hsueh, IBM Research, USA F. Rahmanian, Geneia, USA M.A. Ahmad, Kensci Inc. and University of Washington - Tacoma, USA
  • 3. BACKGROUND & DISCLOSURE ➤ Data Science Institute (Imperial College London) ➤ Novel & advanced computation over large rich biomedical datasets for translational research & precision medicine ➤ Patient subtype discovery & mechanistic insight ➤ Scientific Advisor to PangaeaData.ai
  • 6. “Nice training set. Where’s your data?” - An Analyst
  • 7. BIG BIOMEDICAL DATA USUALLY ISN’T ➤ Average trial size on ClinicalTrials.gov < 100 ➤ Average #samples per GEO dataset < 100 ➤ Average GWAS cohort size ~9000 (median ~2500) ➤ 1,064 ICU admissions for flu in UK 2016/2017 season ➤ Curse of dimensionality ➤ Deep learning requires “thousands” of samples for training (at least p²?) ➤ GWAS needs 3K+ for large effects, 10K or more for small effects … ➤ Sub-populations & rare diseases will be smaller VS
  • 8. MAKE BIGGER DATASETS ➤ “Allow” reuse & combining not “build” ➤ FAIR ➤ Use standards like CDISC, HPO … ➤ eTRIKS ➤ Data intensive translational research ➤ Sharing data (standards, starter kit) ➤ Data catalog of ~70 studies ➤ EHDN / EHDEN ➤ European Health Data and Evidence Network ➤ Harmonised model for accessing health data
  • 9. WE NEED MORE ETL ➤ Too damn slow and expensive ➤ Tools are poor ➤ Humans are inconsistent ➤ Standards are complex ➤ Harmonisation by ML is the only answer ➤ Learn from data examples ➤ Corrected by humans ➤ “Discover” schema if need be [Figure from PangaeaData.AI. Labels: text data, tabular data, data extractors, pre-classified data and master data mappings. Methods listed: frequent pattern mining (growth algorithms) to determine schema association rules; Word2Vec to condense information of text sequence and context; graph-theoretical algorithms to determine logical sequences, followers, associations, matchings; decision trees, neural nets and support vector machines for training the model; custom algorithms to prepare data and check data quality]
  • 10. EXAMPLE: U-BIOPRED ➤ Unbiased BIOmarkers in PREDiction of respiratory disease outcomes ➤ 900+ patients, 16 clinical centres + other studies combined via standards ➤ Outputs: ➤ Analyses largely on small subsets (~100) ➤ Subtyping of asthmatics ➤ 40+ academic publications
  • 13. THE REALITY OF DEEP LEARNING ➤ Deep learning is still in progress ➤ Usually insufficient (good labelled) data ➤ Interpretability issues ➤ Legal & ethical issues, federated analysis ➤ Tells you what you’ve told it ➤ Bias towards images ➤ For now …
  • 14. DEEP LEARNING WITH LESS DATA ➤ Pre-training (data without labels) ➤ Initial training with mediocre data ➤ Adapt ➤ Transfer learning (labels / output changes) ➤ Domain adaptation (data / input changes) ➤ Data augmentation ➤ Interpretability coming slowly (LIME) Dielman 2015
  • 15. “80% of the time, you can get 80% of the way with a simple decision tree.” - Doug Mcilwraith (paraphrased)
  • 16. EXAMPLE: TEXT CLASSIFICATION FOR SYSTEMATIC REVIEWS ➤ Aim: find similar or related publications within corpus ➤ Actual aim: find which method of text classification is “best” (Validation) ➤ Data: 15 Drug Control Reviews & Neuropathic Pain dataset ➤ Classify with random forest, naive Bayes, SVM & CNNs ➤ Conclusion (dataset: WSS, classifier): ACE Inhibitors: 0.26, SVM; ADHD: 0.35, MNB; Antihistamines: 0.19, MNB; Atypical Antipsychotics: 0.12, SVM; Beta Blockers: 0.13, SVM; CCB: 0.21, SVM; Estrogen: 0.25, SVM; Neuropathic Pain: 0.61, CNN; NSAIDS: 0.14, SVM; Opioids: 0.23, SVM; Oral Hypoglycemics: 0.21, SVM; PPI: 0.17, SVM; Skeletal Muscle Relaxants: 0.21, SVM; Statins: 0.19, SVM; Triptans: 0.22, SVM; Urinary Incontinence: 0.25, SVM
  • 18. OMICS IS ONLY ONE TYPE OF INFORMATION ➤ We don’t have enough data ➤ Methods may not work ➤ Results may be artefactual ➤ But there is other information … EHR interactome devices RWE social media chemistry evolution / phylogeny etc.
  • 19. MULTI-OMICS OR INTEGRATED ANALYSIS ➤ Why? ➤ One way to get more data ➤ Statistical power ➤ Multiple defects required to drive endogenous disease ➤ Multiple “views” on condition ➤ How? ➤ Cluster / network individual data layers ➤ Fuse together for consensus Nemutlu 2012
  • 20. EXAMPLE: ASTHMA ENDOTYPING ➤ Asthma is highly heterogeneous ➤ Symptoms ➤ Response to interventions ➤ Multiple mechanisms ➤ 3 or 4 or 7 clusters … ➤ Carefully curated data from U-BIOPRED (~100) ➤ Multi-method, multi-data analysis Wiki Commons
  • 21. ASTHMA ENDOTYPES ➤ Use a variety of clustering approaches over asthma cohort ‘omics data (Bayesian, spectral, iCluster) ➤ Use multi-omics approaches (SNF, NNMF) ➤ Assess agreement / coherence ➤ Validate in pathways, in other cohorts and in other data types
  • 22. CONCLUSIONS ➤ Big biomedical data is often not big, but we can make it bigger ➤ Sometimes [Big | Deep | Advanced] approaches are useful, sometimes not: choose wisely ➤ Contextual information is vital, both for primary analysis and for validation
  • 23. THANKS ➤ Data Science Institute, ICL ➤ Fayzal Ghantiwala (Bloomberg) ➤ Nazanin Zounemat Kermani (ICL) ➤ Mansoor Saqi (ICL / KCL) ➤ Romain Guédon (Nantes) ➤ Yike Guo (ICL) ➤ eTRIKS consortium ➤ U-BIOPRED consortium
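
The WSS column on slide 16 is not defined in the slides. Assuming it stands for Work Saved over Sampling, the usual screening metric for systematic reviews, a minimal sketch of the calculation follows. The function name, the input format, and the 95% recall default are illustrative assumptions, not taken from the talk.

```python
import math


def work_saved_over_sampling(ranked_labels, recall=0.95):
    """Work Saved over Sampling (WSS) at a target recall level.

    ranked_labels: sequence of 1 (relevant) / 0 (irrelevant), ordered by
    descending classifier score, i.e. the order a reviewer would screen in.
    recall: target fraction of relevant documents to find (assumed 0.95).
    """
    total = len(ranked_labels)
    relevant = sum(ranked_labels)
    if total == 0 or relevant == 0:
        raise ValueError("need at least one document and one relevant document")

    # Number of relevant documents that must be found to hit the target recall.
    # The small epsilon guards against floating-point overshoot before ceil().
    needed = math.ceil(recall * relevant - 1e-9)

    # Walk down the ranking until enough relevant documents have been seen.
    found = 0
    screened = 0
    for label in ranked_labels:
        screened += 1
        found += label
        if found >= needed:
            break

    # Fraction of the corpus left unread, minus what random sampling
    # would leave unread at the same recall level.
    unscreened = total - screened
    return unscreened / total - (1.0 - recall)


# Hypothetical ranking of 10 documents, 3 of them relevant.
example = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(round(work_saved_over_sampling(example, recall=1.0), 2))
```

Under this definition a value of 0 means the ranking saves no screening effort over reading documents in random order, so higher is better.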