21 June, 2019
Big Data Mining and AI for Drug Repurposing
Pistoia Alliance Centre of Excellence for AI in Life Sciences and Elsevier Datathon Report
Panelists:
Aleksandar Poleksic, Professor, University of Northern Iowa
Bruce Aronow, Co-director of the Computational Medicine Center at Cincinnati Children’s Hospital Medical Center
Finlay Maclean, Elsevier, London UK
Jabe Wilson, Elsevier
Moderator: Vladimir Makarov
This webinar is being recorded
©PistoiaAlliance
Introduction to Today’s Speakers
Aleksandar Poleksic, Professor, University of Northern Iowa
Finlay Maclean, Elsevier, London UK
Bruce Aronow, Co-director, Computational Medicine Center at Cincinnati Children’s Hospital Medical Center
Jabe Wilson, Elsevier
Predictive Analytics for Drug Repurposing
• Collaboration across Pharma, Academic and Non-Profit
• Data from both Elsevier and 3rd Party sources
• Machine Learning and other Analytics methods used to predict Drugs to be repurposed for disease treatment
• Results validated by leading experts in the disease (Chronic Pancreatitis)
• Our partner Mission-Cure is planning to take drugs to patient trials by January 2020
• “The datathon exceeded our expectations, producing 5 repurposing candidates to address multiple chronic pancreatitis targets” (Megan Golden, CEO, Mission Cure)
Predictive Analytics for Drug Repurposing
1. March - July 2019: Finish identifying the most promising candidates, identify which ones need additional preclinical work
2. July - December 2019: Fund and conduct preclinical work; plan pilots/trials for safest, most promising candidates
3. July 24-26, 2019: PancreasFest meeting in Pittsburgh: coordinate preclinical work and plan clinical trials with PIs
4. January - June 2020: Conduct small open-label pilots with safest, most promising candidates and informed patient volunteers
5. July 2020 - June 2022: Conduct repurposing clinical trials using efficient trial designs (e.g. aggregated n of 1 trials); develop master trial protocol
6. July 2022 - June 2024: Implement master trial to test multiple promising therapeutic candidates alone and in combination
7. July 2024 - June 2027: Continue master trial until therapies identified
Predictive Analytics for Drug Repurposing
Thank you
Modeling a Disease: Identifying Attributes, Causes, Effects, Modifiers, and Treatments
Elements of any Disease (diagram labels):
• Disease entity: disease names; disease-specific concepts
• Phenotypes of Disease: pathologic attributes and associations
• Disease Causes/Factors: Genetics; Infectious; Immuno/Allergic; Environmental; Drugs
• Pathway-Network Target Associations: gene functions/annotations, gene interactions; regulators of genes, pathways, cells, tissues; phenotypes
• Effects and Causal Relationships
• Drug associations: Adverse Events, other indications, e.g. AERsMine; https://research.cchmc.org/aers/
Information Sources:
• OMIM
• HPO Human Phenotype Ontology
• ICD
• UMLS
• Wikipedia
• ClinVar; ClinGen; MP
• Human Cell Atlas: cell type transcriptome, tissue cell map; http://toppcell.cchmc.org
ToppCell database: using single cell gene expression data to understand gene networks responsible for organ health and disease
[Workflow diagram. Inputs: single cell dataset(s) with learned or user-defined cell annotation. Processing: normalization; clustering; differential analysis; … Machine Learning-based Analysis. Outputs: interactive heatmaps with re-analysis, searching, clustering, grouping and enrichment. Gene lists: user-defined, biological pathway-based, cell type specific.]
Eric Bardes
ToppCell: Leveraging the Human Cell Atlas
Data Mining by Organ/Cell Type: Search / Cluster / Enrich / Net
Derive models for
° Differentiation
° Organogenesis
° Pathways / Networks
° Cell-cell Interactions
° Physiology
° Pathology
[Figure: pancreas tissue resolved into individual single cells; cell-type labels acinar, ductal, St, alpha, delta, PP, beta, grouped as exocrine and endocrine; marker genes in pancreatic cell types]
Portal Views For Data Mining/Systems Biology-Driven Analyses
(1) find/select cell clusters/gene modules and anatomical contexts
(2) carry out enrichment analyses and machine learning prioritization of genes, pathways, interactions
(3) assemble/save/share/export integrated systems biological network models
Tissue/Sample-Associated Cell Population Gene Modules: cell type-centered signatures allow for the analysis of cell class and subclass similarities and differences.
Use Case: compare/combine alveolar epithelial cell subtypes
Single Cell Atlas(es) per protocol, source, cell types, subtypes, and developmental stages (example: mouse Fluidigm LungMAP, all distal, all stages, by cell type)
[Heatmap. Axes: single cells (1,004 shown) and 16,400 genes (redundancy OK). Annotations: anatomic regions; cell types; subtypes; developmental stages. Gene signatures per region, per stage, per cell type/subtype. User selects cell types/gene modules for biological network analyses.]
Systems Biology via the LungMAP Portal: Note the profound functional association differences between AT1 and AT2 subtype signatures. However, it is precisely through the combinations of their specialized biological functions that alveolar structure and physiological function achieve highly efficient air-blood gas exchange. This illuminates the utility of providing users with subtype and stage-specific gene modules for multimodule and multimodal/technology-based biological network analyses.
AT1: cell junctions; cell projections; cytoskeleton; angiogenesis; vascular morphogen
AT2: surfactant biology; lipid biosynth; vesicles; lamellar body; secretion
https://research.cchmc.org/aers/
Drugs with high risk / elevated safety signal for pancreatitis
Drugs Associated with (Unexpectedly) HIGH risk of Pancreatitis
Drugs Associated with (Unexpectedly) LOW risk of Pancreatitis
Using Heterogeneous Ensemble Classifiers for Drug-Target Interaction Prediction
Finlay MacLean
The Problem
- Search space is huge
  - Chemogenomic
  - Pharmacologic
- Known information is sparse and heavily biased
  - Only positive measurements
- Possible data sources are huge
  - Multidomain multilevel information
Yella J, Yaddanapudi S, Wang Y, Jegga A. Changing trends in computational drug repositioning. Pharmaceuticals. 2018 Jun;11(2):57.
The Data
- 765 disease-associated targets
- 119401 positive interactions
- 203 targets with known bioactivities
- 44161 unique substances
- 2766 possible repurposable drugs
- 15 main genetic drivers
[Figure: cumulative bioactivities for disease-associated targets; axis: number of targets (cumulative)]
[Figure: binding affinity for disease-implicated substances]
Kernels and Similarity Metrics
Substances
- Morgan Fingerprint radius 3 to encode substructures
- Tanimoto Distance to determine substructure similarity
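The substructure comparison named above can be illustrated with a minimal sketch. It assumes each fingerprint is represented as the set of its "on" bit indices (in practice these would come from a cheminformatics toolkit; the bit sets below are made up), and the function names are illustrative rather than taken from the talk. Tanimoto similarity is the size of the intersection divided by the size of the union, and the distance is one minus that value.

```python
def tanimoto_similarity(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union


def tanimoto_distance(bits_a, bits_b):
    """Tanimoto distance, defined here as 1 - similarity."""
    return 1.0 - tanimoto_similarity(bits_a, bits_b)


# Hypothetical fingerprints: indices of bits set for two substances.
fp_x = {1, 2, 3}
fp_y = {2, 3, 4}
print(tanimoto_similarity(fp_x, fp_y), tanimoto_distance(fp_x, fp_y))
```

Filling a substance-by-substance matrix with these similarities gives the kind of substance kernel the slide refers to.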
Targets
- Local Smith Waterman Alignment
Harish Kandan, Understanding the kernel trick. https://towardsdatascience.com/understanding-the-kernel-trick-e0bc6112ef78
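For the target side, a local Smith-Waterman alignment score can be sketched as follows. This is a simplified illustration, not the scoring used in the datathon: it assumes a flat match/mismatch score and a linear gap penalty, whereas protein alignments normally use a substitution matrix and affine gaps. The normalisation by self-alignment scores is one common way to turn raw scores into a similarity in [0, 1] and is an assumption here.

```python
import math


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def normalized_similarity(a, b):
    """Score scaled by the geometric mean of the two self-alignment scores."""
    denom = math.sqrt(smith_waterman_score(a, a) * smith_waterman_score(b, b))
    return smith_waterman_score(a, b) / denom if denom > 0 else 0.0


# Toy sequences, not real protein targets.
print(smith_waterman_score("XXACDEYY", "ACDE"))
print(normalized_similarity("ACDE", "ACDF"))
```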
Kernel Explosion
- Apply Kronecker multiplication to the drug and target kernels
- Train a Support Vector Machine on the Kronecker kernel.
- Training kernel:
203 targets with known bioactivities (203 x 203 = 41 209 target-kernel entries)
44161 bioactive substances (44161 x 44161 = 1 950 193 921 substance-kernel entries)
41 209 x 1 950 193 921 = roughly 80 trillion Kronecker-kernel entries!
~ 500TB!
- I wish I had a cluster that big..
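The size estimate on this slide follows from squaring the two counts and multiplying them, which a few lines of arithmetic make explicit. The sketch below also builds a toy Kronecker kernel with `numpy.kron`; the small matrices are invented for illustration, and the 8-byte element size is an assumption (the storage figure scales directly with whatever element size is used).

```python
import numpy as np

n_targets = 203
n_substances = 44_161

target_kernel_entries = n_targets ** 2          # 203 x 203
substance_kernel_entries = n_substances ** 2    # 44161 x 44161
kron_entries = target_kernel_entries * substance_kernel_entries

bytes_per_entry = 8  # assumption: float64 storage
print(f"{kron_entries:.3e} entries, "
      f"{kron_entries * bytes_per_entry / 1e12:.0f} TB at {bytes_per_entry} bytes each")

# Toy kernels (made-up values): 2 targets and 3 substances.
Kt = np.array([[1.0, 0.2],
               [0.2, 1.0]])
Kd = np.array([[1.0, 0.5, 0.1],
               [0.5, 1.0, 0.3],
               [0.1, 0.3, 1.0]])

# Each entry of K is a product of one target similarity and one substance similarity.
K = np.kron(Kt, Kd)
print(K.shape)
```

Even this toy case grows from 4 + 9 stored similarities to 36, which is why the full kernel is impractical to materialise.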
Ensemble Learning
- Train multiple models!
- 1. Each takes subset of data
- 2. Each self-evaluates
- 3. Evaluate meta-learner
- 4. Feed genetic driver of CP
- 5. Predict on repurposable drugs
- 6. Weighted average of results
- Reaches optimization limit around 0.94 AUROC (for kernels of 30 substances and 30 targets).
- Largest kernels still only around 1000.
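The subset / self-evaluate / weighted-average steps listed above can be sketched with a small bagging ensemble. This is an illustration under stated assumptions, not the datathon code: the base learner is a plain ridge regressor on synthetic features rather than a kernel model, each model is weighted by the AUROC it achieves on the samples it did not train on, and all names and data are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)


def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def train_bagging_ensemble(X, y, n_models=30, subset_size=50, seed=0):
    """Train each model on a random subset and score it on the held-out rest."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(y), size=subset_size, replace=False)
        held_out = np.setdiff1d(np.arange(len(y)), idx)
        w = fit_ridge(X[idx], y[idx].astype(float))
        models.append((w, auroc(y[held_out], X[held_out] @ w)))
    return models


def predict(models, X_new):
    """Average of model outputs, weighted by each model's self-evaluated AUROC."""
    preds = np.stack([X_new @ w for w, _ in models])
    weights = np.array([score for _, score in models])
    return np.average(preds, axis=0, weights=weights)


# Synthetic data standing in for (substance, target) pair features.
rng = np.random.default_rng(42)
X = rng.standard_normal((200, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
y = (X @ true_w + 0.5 * rng.standard_normal(200) > 0).astype(int)

models = train_bagging_ensemble(X, y)
ensemble_scores = predict(models, X)
print(round(auroc(y, ensemble_scores), 3))
```

Swapping the weighted average for a trained meta-learner would change only the `predict` step.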
Kronecker-RLS
Pahikkala T, Airola A. RLScore: regularized least-squares learners. The Journal of Machine Learning Research. 2016 Jan 1;17(1):7803-7.
Nascimento AC, Prudêncio RB, Costa IG. A multiple kernel learning algorithm for drug-target interaction prediction. BMC bioinformatics. 2016 Dec;17(1):46.
Kashima H, Oyama S, Yamanishi Y, Tsuda K. On pairwise kernels: an efficient alternative and generalization analysis. Adv Data Min Knowl Disc. 2009; 5476:1030–7.
- Take advantage of inherent symmetry
- Eigendecompose similarity kernels
- Take advantage of kernel ‘trick’
- Employ regularised least squares
- Feed into ensemble!
- Homogeneous bagging ensemble performed best
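The eigendecomposition idea in the list above can be sketched as follows. The sketch assumes complete label data (a label for every substance-target pair) and symmetric positive semi-definite kernels; function names and the toy matrices are illustrative, not the speaker's implementation. It solves the regularised least-squares system for the Kronecker kernel without ever forming that kernel, and then checks the result against a direct solve on a problem small enough to build explicitly.

```python
import numpy as np


def kron_rls_fit(Kd, Kt, Y, lam=1.0):
    """Dual coefficients A solving Kd @ A @ Kt + lam * A = Y via two eigendecompositions."""
    evals_d, Qd = np.linalg.eigh(Kd)
    evals_t, Qt = np.linalg.eigh(Kt)
    # In the eigenbasis the system is diagonal: divide entrywise.
    C = (Qd.T @ Y @ Qt) / (np.outer(evals_d, evals_t) + lam)
    return Qd @ C @ Qt.T


def kron_rls_predict(Kd, Kt, A):
    """Predicted interaction scores for all substance-target pairs."""
    return Kd @ A @ Kt


# Toy problem: 4 substances, 3 targets, random PSD kernels and labels.
rng = np.random.default_rng(0)
Bd, Bt = rng.standard_normal((4, 4)), rng.standard_normal((3, 3))
Kd, Kt = Bd @ Bd.T, Bt @ Bt.T
Y = rng.integers(0, 2, size=(4, 3)).astype(float)
lam = 1.0

A = kron_rls_fit(Kd, Kt, Y, lam)

# Direct solve with the explicit 12 x 12 Kronecker kernel, for comparison only.
K_full = np.kron(Kt, Kd)
a_direct = np.linalg.solve(K_full + lam * np.eye(12), Y.flatten(order="F"))
A_direct = a_direct.reshape((4, 3), order="F")
print(np.allclose(A, A_direct))
```

The fast path costs two eigendecompositions of the small kernels, while the direct path needs the full Kronecker matrix in memory.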
Final ensemble:
30 models, each:
- Trained and optimized on 500 substances
and 200 most bioactive targets
- Evaluated (model-level)
- Evaluated (ensemble-level)
- Predict!
Improvements
- Sparse data
- CGKronRLS (Semi-supervised learning)
- Other pairwise relationships can be used
- KronRLS-MKL (Multiple kernel learning)
- Use of Gaussian Interaction Profiles
- Sequential model execution and storage
- Boosting instead of bagging (sample level optimization)
- Making numpy/BLAS work on distributed GPUs
- Employ a meta-learner rather than a voting classifier
Pahikkala T. Fast gradient computation for learning with tensor product kernels and sparse training labels. Structural, Syntactic, and Statistical Pattern Recognition (S+SSPR). Volume 8621 of Lecture Notes in Computer Science, pages 123–132. 2014.
Nascimento AC, Prudêncio RB, Costa IG. A multiple kernel learning algorithm for drug-target interaction prediction. BMC bioinformatics. 2016 Dec;17(1):46.
Pahikkala T, Airola A. RLScore: regularized least-squares learners. The Journal of Machine Learning Research. 2016 Jan 1;17(1):7803-7.
Kashima H, Oyama S, Yamanishi Y, Tsuda K. On pairwise kernels: an efficient alternative and generalization analysis. Adv Data Min Knowl Disc. 2009; 5476:1030–7.
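The Gaussian Interaction Profile kernel mentioned among the improvements can be sketched in a few lines. The sketch uses the usual definition, in which each substance (or target) is described by its row of the binary interaction matrix and the bandwidth is normalised by the mean squared profile norm; the interaction matrix and the `gamma_prime` default are made up for illustration.

```python
import numpy as np


def gip_kernel(Y, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel over the rows of interaction matrix Y."""
    sq_norms = np.sum(Y ** 2, axis=1)
    # Bandwidth normalised by the average squared norm of the profiles.
    gamma = gamma_prime / np.mean(sq_norms)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Y @ Y.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


# Hypothetical interaction profiles: 3 substances x 3 targets.
Y_profiles = np.array([[1.0, 0.0, 1.0],
                       [1.0, 0.0, 1.0],
                       [0.0, 1.0, 0.0]])
K_gip = gip_kernel(Y_profiles)
print(np.round(K_gip, 3))
```

Substances with identical interaction profiles get similarity 1, so this kernel adds information that structure-based fingerprints do not capture.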
Using “compressed sensing” to
support drug repurposing for
chronic pancreatitis
Prof. Aleksandar Poleksić
Department of Computer Science
University of Northern Iowa
Compressed sensing for ADR prediction
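• Idea: Factor $R_{m \times n}$ into the product of two lower-dimensional matrices
$$R = FG'$$
Logistic matrix factorization
Loss function:
$$\sum_{i,j} w_{i,j}\left\{\ln\left(1 + e^{f_i g_j'}\right) - (r_{i,j} + q_{i,j})\, f_i g_j'\right\} + \lambda_F \|F\|_2^2 + \lambda_G \|G\|_2^2 + \lambda_M \sum_{i,j} m_{i,j}\, \|F(i,:) - F(j,:)\|_2^2 + \lambda_N \sum_{i,j} n_{i,j}\, \|G(i,:) - G(j,:)\|_2^2$$
M, N – similarity matrices
Q – impute matrix
W – weight matrix
lambdas – tunable parameters
P – output probabilities, $p_{i,j} = 1 / (1 + e^{-f_i g_j'})$
Optimization
Loss function (graph terms in trace form, with $D_M$, $D_N$ the diagonal row-sum matrices of M, N; the pairwise sums above equal these trace terms up to a constant factor that can be absorbed into $\lambda_M$, $\lambda_N$):
$$\sum_{i,j} w_{i,j}\left\{\ln\left(1 + e^{f_i g_j'}\right) - (r_{i,j} + q_{i,j})\, f_i g_j'\right\} + \lambda_F \|F\|_2^2 + \lambda_G \|G\|_2^2 + \lambda_M \operatorname{tr}\left(F'(D_M - M)F\right) + \lambda_N \operatorname{tr}\left(G'(D_N - N)G\right)$$
Partial derivatives:
$$\partial/\partial F = \{W \odot (P - (R + Q))\}\, G + 2\lambda_F F + 2\lambda_M (D_M - M) F$$
$$\partial/\partial G = \{W^T \odot (P^T - (R^T + Q^T))\}\, F + 2\lambda_G G + 2\lambda_N (D_N - N) G$$
Minimization algorithm: Gradient descent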
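A minimal NumPy sketch of the gradient descent described on this slide follows. It uses the trace form of the loss, treats the squared norms of F and G as sums of squared entries, and takes the data-term gradient as W ⊙ (P − (R + Q)), which is what differentiating the loss gives. The matrix sizes, weights, regularisation values, step size and function names are all assumptions made for illustration, not values from the talk.

```python
import numpy as np


def laplacian(S):
    """Graph Laplacian D - S, where D holds the row sums of a symmetric S."""
    return np.diag(S.sum(axis=1)) - S


def loss_and_grads(F, G, R, Q, W, L_M, L_N, lam_F, lam_G, lam_M, lam_N):
    """Loss (trace form) and its partial derivatives with respect to F and G."""
    X = F @ G.T                               # x_ij = f_i g_j'
    P = 1.0 / (1.0 + np.exp(-X))              # output probabilities
    data = np.sum(W * (np.logaddexp(0.0, X) - (R + Q) * X))
    ridge = lam_F * np.sum(F ** 2) + lam_G * np.sum(G ** 2)
    graph = (lam_M * np.trace(F.T @ L_M @ F)
             + lam_N * np.trace(G.T @ L_N @ G))
    E = W * (P - (R + Q))
    dF = E @ G + 2 * lam_F * F + 2 * lam_M * (L_M @ F)
    dG = E.T @ F + 2 * lam_G * G + 2 * lam_N * (L_N @ G)
    return data + ridge + graph, dF, dG


def fit(R, Q, W, M, N, rank=2, lams=(0.1, 0.1, 0.1, 0.1),
        step=0.01, n_iter=200, seed=0):
    """Plain fixed-step gradient descent on F and G; returns the loss history too."""
    rng = np.random.default_rng(seed)
    F = 0.1 * rng.standard_normal((R.shape[0], rank))
    G = 0.1 * rng.standard_normal((R.shape[1], rank))
    L_M, L_N = laplacian(M), laplacian(N)
    history = []
    for _ in range(n_iter):
        loss, dF, dG = loss_and_grads(F, G, R, Q, W, L_M, L_N, *lams)
        history.append(loss)
        F, G = F - step * dF, G - step * dG
    return F, G, history


# Synthetic stand-in data: 6 rows (e.g. chemicals) x 5 columns (e.g. ADRs).
rng = np.random.default_rng(1)
R = (rng.random((6, 5)) < 0.3).astype(float)
Q = np.zeros_like(R)                   # no imputed values in this toy case
W = np.where(R > 0, 1.0, 0.5)          # hypothetical confidence weights
A_sym, B_sym = rng.random((6, 6)), rng.random((5, 5))
M, N = (A_sym + A_sym.T) / 2, (B_sym + B_sym.T) / 2

F, G, history = fit(R, Q, W, M, N)
P = 1.0 / (1.0 + np.exp(-(F @ G.T)))
print(history[0], history[-1])
```

Comparing an analytic gradient entry against a finite difference of the loss is a quick way to confirm that the loss and its derivatives agree.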
SIDER benchmark
Q: Is a new chemical likely to cause
hepatotoxicity?
Q: Is a new chemical likely to cause a serious
rare side effect?
ADR prediction for candidate CP drugs
LACOSAMIDE ADRs
SMILES: CC(=O)NC(COC)C(=O)NCC1=CC=CC=C1
ADR name       CUI        Prob
Nausea         C0027497   0.99
Vomiting       C0042963   0.97
Asthenia       C0015672   0.916
Dizziness      C0012833   0.912
Headache       C0018681   0.908
Dry mouth      C0043352   0.881
Diarrhea       C0011991   0.874
Dermatitis     C0015230   0.818
Constipation   C0009806   0.789
Somnolence     C2830004   0.738
Tremor         C0040822   0.733
Lacosamide ADR profile
http://gpubox.cs.uni.edu
Candidate CP drugs – network prediction
[Network diagram: Compound and Disease nodes; edge types: treats, resembles]
Drug            Prob   Z-score
Hyoscyamine     0.35   6.61
Irinotecan      0.26   4.94
Varenicline     0.20   3.74
Octreotide      0.17   3.26
Propantheline   0.17   3.19
Citalopram      0.17   3.18
Acamprosate     0.14   2.61
Disulfiram      0.13   2.44
Epirubicin      0.12   2.31
Tamoxifen       0.11   2.08
Doxorubicin     0.11   2.05
Naltrexone      0.11   2.03
Paclitaxel      0.10   1.90
Erlotinib       0.10   1.85
Topotecan       0.10   1.83
Sorafenib       0.09   1.64
Proguanil       0.08   1.42
Metformin       0.07   1.30
Telbivudine     0.06   1.13
Orlistat        0.06   1.06
Apply compressed sensing on drug-
disease network*:
• 1552 compounds
• 137 diseases
• 755 known treatments
* Himmelstein, D.S. & Baranzini, S.E. PLoS Comput Biol 11, e1004259 (2015).
Lacosamide – network prediction
[Network diagram: Compound, Gene and Disease nodes; edge types: treats, palliates, resembles]
Lacosamide-binds-gene-associates pancreatitis:
CFTR (0.0806)
ALB (0.0522)
PTGS2 (0.0224)
MPO (0.0192)
CYP1A1 (0.0151)
ACE (0.0088)
ABCB1 (0.0086)
FDX1 (0.0065)
CXCL8 (0.0063)
TNF (0.0039)
AHR (0.0035)
ADRB2 (0.0028)
CA8 (0.0028)
SLC12A1 (0.0027)
BCHE (0.0017)
ADRB1 (0.0015)
Collaborators:
Prof. Lei Xie, CUNY Graduate Center
References:
1. Poleksic, A., & Xie, L. (2018). Predicting serious rare adverse reactions of novel
chemicals. Bioinformatics, 34(16), 2835-2842.
2. Lim, H., Gray, P., Xie, L., & Poleksic, A. (2016). Improved genome-scale multi-target
virtual screening via a novel collaborative filtering approach to cold-start problem. Scientific
reports, 6, 38860.
3. Poleksic, A., & Xie, L. (2019). Database of Adverse Events Associated with Drugs and
Drug Combinations, in review.
Poll Question:
In what other medical area should we run
the next pre-competitive research
exercise?
A. Oncology
B. Heart Disease
C. Diabetes
D. Obesity
E. Some other unmet need (send
Audience Q&A
Please use the Question function in GoToWebinar
Upcoming Webinars
1. Date TBD – July 2019: User Experience (UX) Design
for AI
2. Date TBD: Virtual Roundtable: Innovative Pathways
through the FDA & EMEA (with the Westchester
Biotech Project)
3. Planning: Ethics and AI
Please suggest other topics
info@pistoiaalliance.org @pistoiaalliance www.pistoiaalliance.org
Thank You

 

Pistoia Alliance-Elsevier Datathon

  • 1. 21 June, 2019 Big Data Mining and AI for Drug Repurposing Pistoia Alliance Centre of Excellence for AI in Life Sciences and Elsevier Datathon Report Panelists: Aleksandar Poleksic, Professor, University of Northern Iowa Bruce Aronow, Co-director of the Computational Medicine Center at Cincinnati Children’s Hospital Medical Center Finlay Maclean, Elsevier, London UK Jabe Wilson of Elsevier Moderator: Vladimir Makarov
  • 2. This webinar is being recorded
  • 3. ©PistoiaAlliance Introduction to Today’s Speakers Aleksandar Poleksic Professor University of Northern Iowa Finlay Maclean Elsevier, London UK Bruce Aronow Co-director Computational Medicine Center at Cincinnati Children’s Hospital Medical Center Jabe Wilson Elsevier
  • 4. Predictive Analytics for Drug Repurposing
  • 5. Predictive Analytics for Drug Repurposing
      - Collaboration across pharma, academia and non-profits
      - Data from both Elsevier and third-party sources
      - Machine learning and other analytics methods used to predict drugs to be repurposed for disease treatment
      - Results validated by leading experts in the disease (chronic pancreatitis)
      - Our partner Mission-Cure is planning to take drugs to patient trials by January 2020
      - “The datathon exceeded our expectations, producing 5 repurposing candidates to address multiple chronic pancreatitis targets” (Megan Golden, CEO, Mission Cure)
  • 6. Predictive Analytics for Drug Repurposing
      1. March - July 2019: Finish identifying the most promising candidates; identify which ones need additional preclinical work
      2. July - December 2019: Fund and conduct preclinical work; plan pilots/trials for safest, most promising candidates
      3. July 24-26, 2019: PancreasFest meeting in Pittsburgh: coordinate preclinical work and plan clinical trials with PIs
      4. January - June 2020: Conduct small open-label pilots with safest, most promising candidates and informed patient volunteers
      5. July 2020 - June 2022: Conduct repurposing clinical trials using efficient trial designs (e.g. aggregated n-of-1 trials); develop master trial protocol
      6. July 2022 - June 2024: Implement master trial to test multiple promising therapeutic candidates alone and in combination
      7. July 2024 - June 2027: Continue master trial until therapies identified
  • 8. Modeling a Disease: Identifying Attributes, Causes, Effects, Modifiers, and Treatments (diagram: "Elements of any Disease"). Diagram labels, in slide order:
      - Disease-specific concepts: disease entity; phenotypes of disease; disease causes/factors; pathway-network target associations
      - Disease names
      - Genetics; infectious; immuno/allergic; environmental; drugs
      - Gene functions/annotations, gene interactions; regulators of genes, pathways, cells, tissues; phenotypes
      - ClinVar; ClinGen; MP
      - Drug associations: adverse events, other indications, e.g. AERSMine (https://research.cchmc.org/aers/)
      - Information sources: OMIM; HPO (Human Phenotype Ontology); ICD; UMLS; Wikipedia
      - Pathologic attributes and associations; effects and causal relationships
      - Human Cell Atlas; cell type transcriptome; tissue cell map (http://toppcell.cchmc.org)
  • 10. ToppCell database: using single-cell gene expression data to understand gene networks responsible for organ health and disease (workflow diagram). Diagram labels:
      - Single-cell dataset(s); learned cell annotation; user-defined cell annotation
      - Processing: normalization; clustering; differential analysis; …
      - Gene lists: user-defined; biological pathway-based; cell type-specific
      - Machine learning-based analysis
      - Interactive heatmaps; re-analysis; searching; clustering; grouping; enrichment
      - Name on slide: Eric Bardes
  • 11. ToppCell: Leveraging the Human Cell Atlas
      - Data mining by organ/cell type: search / cluster / enrich / net
      - Derive models for: differentiation; organogenesis; pathways/networks; cell-cell interactions; physiology; pathology
      - Pancreas tissue → individual single cells
  • 12. Marker genes in pancreatic cell types (figure; panel labels: exocrine, endocrine; acinar, ductal, St, alpha, delta, PP, beta)
  • 13. Systems Biology via the LungMAP Portal: Portal Views for Data Mining/Systems Biology-Driven Analyses
      - (1) Find/select cell clusters/gene modules and anatomical contexts
      - (2) Carry out enrichment analyses and machine learning prioritization of genes, pathways, interactions
      - (3) Assemble/save/share/export integrated systems biological network models
      - Single Cell Atlas(es): per protocol, source, cell types, subtypes, and developmental stages (example: mouse Fluidigm-LungMAP, all distal, all stages, by cell type)
      - Heatmap (figure): single cells (1,004 shown) against gene signatures per region, per stage, per cell type/subtype (16,400 genes, redundancy OK); annotated by anatomic regions, cell types, subtypes, developmental stages
      - Tissue/Sample-Associated Cell Population Gene Modules: cell type-centered signatures allow for the analysis of cell class and subclass similarities and differences. The user selects cell types/gene modules for biological network analyses.
      - Use case: compare/combine alveolar epithelial cell subtypes
          - AT1: cell junctions; cell projections; cytoskeleton; angiogenesis; vascular morphogen
          - AT2: surfactant biology; lipid biosynth; vesicles; lamellar body; secretion
      - Note the profound functional association differences between AT1 and AT2 subtype signatures. However, it is precisely through the combination of their specialized biological functions that alveolar structure and physiological function achieve highly efficient air-blood gas exchange. This illustrates the utility of providing users with subtype- and stage-specific gene modules for multimodule and multimodal/technology-based biological network analyses.
  • 15.
  • 16.
  • 17. https://research.cchmc.org/aers/ Drugs with high risk / elevated safety signal for pancreatitis
  • 19. Drugs Associated with (Unexpectedly) LOW risk of Pancreatitis
  • 20. Using Heterogeneous Ensemble Classifiers for Drug Target Interaction Prediction (Finlay MacLean)
  • 21. The Problem
      - Search space is huge
          - Chemogenomic
          - Pharmacologic
      - Known information is sparse and heavily biased
          - Only positive measurements
      - Possible data sources are huge
          - Multidomain, multilevel information
      Reference: Yella J, Yaddanapudi S, Wang Y, Jegga A. Changing trends in computational drug repositioning. Pharmaceuticals. 2018 Jun;11(2):57.
  • 22. The Data
      - 765 disease-associated targets
      - 119,401 positive interactions
      - 203 targets with known bioactivities
      - 44,161 unique substances
      - 2,766 possible repurposable drugs
      - 15 main genetic drivers
      Figures: cumulative bioactivities for disease-associated targets (axis: number of targets, cumulative); binding affinity for disease-implicated substances
  • 23. Kernels and Similarity Metrics
      - Substances
          - Morgan fingerprint (radius 3) to encode substructures
          - Tanimoto distance to determine substructure similarity
      - Targets
          - Local Smith-Waterman alignment
      Reference: Harish Kandan, Understanding the kernel trick. https://towardsdatascience.com/understanding-the-kernel-trick-e0bc6112ef78
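The two similarity measures named on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: fingerprints are represented as sets of "on" bit indices (in practice they would come from a cheminformatics toolkit), the function names are hypothetical, and the alignment uses a simple match/mismatch score with a linear gap penalty rather than the substitution matrix and gap model a real protein comparison would use. The Tanimoto distance mentioned on the slide is one minus the similarity computed here.

```python
def tanimoto_similarity(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    a, b = set(bits_a), set(bits_b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def smith_waterman_score(s, t, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between s and t (linear gap penalty)."""
    prev = [0] * (len(t) + 1)
    best = 0
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical toy inputs: two fingerprints and two short sequences.
drug_similarity = tanimoto_similarity({1, 2, 3}, {2, 3, 4})
target_score = smith_waterman_score("TTACGTT", "GGACGGG")
```

Filling a matrix with such pairwise values for all substances (or all targets) gives the similarity kernels that the following slides combine.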
  • 24. Kernel Explosion
      - Apply Kronecker multiplication to the drug and target kernels
      - Train a support vector machine on the Kronecker kernel
      - Training kernel: 203 targets with known bioactivities and 44,161 bioactive substances
          - Target kernel: 203 × 203 = 41,209 entries
          - Substance kernel: 44,161 × 44,161 = 1,950,193,921 entries
          - Kronecker kernel: 41,209 × 1,950,193,921 ≈ 80 trillion entries, ~500 TB!
      - I wish I had a cluster that big...
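The size estimate on this slide can be reproduced with a short sketch. The small matrices below are made-up toy kernels, and `kron_kernel_entries` is a hypothetical helper name; only the counts 203 and 44,161 come from the slide. The memory figure depends on the element type, which the slide does not state.

```python
import numpy as np

# Toy kernels: the Kronecker product of a 2x2 and a 3x3 kernel is 6x6.
K_target = np.array([[1.0, 0.5],
                     [0.5, 1.0]])
K_drug = np.array([[1.0, 0.2, 0.1],
                   [0.2, 1.0, 0.3],
                   [0.1, 0.3, 1.0]])
K_pair = np.kron(K_target, K_drug)  # one row/column per (target, drug) pair


def kron_kernel_entries(n_targets, n_drugs):
    """Number of entries in the full pairwise (Kronecker) kernel."""
    return (n_targets * n_drugs) ** 2


entries = kron_kernel_entries(203, 44161)  # counts taken from the slide
bytes_float64 = entries * 8                # assuming 8-byte floats
bytes_float32 = entries * 4                # assuming 4-byte floats
print(f"{entries:,} entries")
print(f"{bytes_float64 / 1e12:.0f} TB as float64, {bytes_float32 / 1e12:.0f} TB as float32")
```

Either way the result is hundreds of terabytes, the same order of magnitude as the slide's estimate, which is why the full kernel is never materialised in the approaches that follow.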
  • 25. Ensemble Learning
      - Train multiple models!
          1. Each takes a subset of the data
          2. Each self-evaluates
          3. Evaluate meta-learner
          4. Feed genetic driver of CP
          5. Predict on repurposable drugs
          6. Weighted average of results
      - Reach optimization limit around 0.94 AUROC (for kernels of 30 substances and 30 targets)
      - Largest kernels still only around 1000
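The self-evaluation and weighted-averaging steps of this slide might look roughly like the sketch below. The slide does not say how the weights were derived, so using each model's validation AUROC as its weight is an assumption made here for illustration, and both function names are hypothetical.

```python
import numpy as np


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count as half
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


def weighted_average(predictions, weights):
    """Combine per-model predictions (n_models x n_items) into one score per item."""
    predictions = np.asarray(predictions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ predictions / weights.sum()


# Hypothetical example: two models score the same three candidate drugs.
val_labels = [0, 0, 1, 1]
model_weights = [
    auroc(val_labels, [0.1, 0.4, 0.35, 0.8]),  # model 1 self-evaluation
    auroc(val_labels, [0.2, 0.1, 0.7, 0.9]),   # model 2 self-evaluation
]
combined = weighted_average([[0.2, 0.8, 0.5], [0.6, 0.4, 0.5]], model_weights)
```

Models that rank their held-out positives above negatives more reliably contribute more to the final score.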
  • 26. Kronecker-RLS
      - Take advantage of inherent symmetry
      - Eigendecompose similarity kernels
      - Take advantage of the kernel ‘trick’
      - Employ regularised least squares
      - Feed into ensemble!
      - Homogeneous bagging ensemble performed best
      Final ensemble: 30 models, each:
      - Trained and optimized on 500 substances and the 200 most bioactive targets
      - Evaluated (model-level)
      - Evaluated (ensemble-level)
      - Predict!
      References:
      - Pahikkala T, Airola A. RLScore: regularized least-squares learners. The Journal of Machine Learning Research. 2016 Jan 1;17(1):7803-7.
      - Nascimento AC, Prudêncio RB, Costa IG. A multiple kernel learning algorithm for drug-target interaction prediction. BMC Bioinformatics. 2016 Dec;17(1):46.
      - Kashima H, Oyama S, Yamanishi Y, Tsuda K. On pairwise kernels: an efficient alternative and generalization analysis. Adv Data Min Knowl Disc. 2009; 5476:1030–7.
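The eigendecomposition idea on this slide can be illustrated with a minimal NumPy sketch of Kronecker regularised least squares. It is not the speakers' code or the RLScore API: the function names, the toy kernels and the regularisation value are assumptions. The point is that the dual coefficients of the pairwise model can be obtained from the eigendecompositions of the two small kernels, without ever forming their Kronecker product.

```python
import numpy as np


def kron_rls_fit(K_d, K_t, Y, lam=1.0):
    """Solve (kron(K_t, K_d) + lam*I) vec(A) = vec(Y) using only the small kernels.

    K_d: drug kernel (n_d x n_d), K_t: target kernel (n_t x n_t),
    Y: interaction matrix (n_d x n_t). Both kernels must be symmetric.
    """
    evals_d, U = np.linalg.eigh(K_d)
    evals_t, V = np.linalg.eigh(K_t)
    # Eigenvalues of the Kronecker kernel are all products of the small eigenvalues.
    C = 1.0 / (np.outer(evals_d, evals_t) + lam)
    return U @ (C * (U.T @ Y @ V)) @ V.T


def kron_rls_predict(K_d, K_t, A):
    """Predicted scores for every (drug, target) pair."""
    return K_d @ A @ K_t.T


# Toy data (assumption): linear kernels on random features, random labels.
rng = np.random.default_rng(0)
X_d = rng.normal(size=(5, 3))
X_t = rng.normal(size=(4, 3))
K_d = X_d @ X_d.T
K_t = X_t @ X_t.T
Y = (rng.random((5, 4)) < 0.3).astype(float)

A = kron_rls_fit(K_d, K_t, Y, lam=0.5)
scores = kron_rls_predict(K_d, K_t, A)
```

For this toy size the result can be checked against a direct solve on the explicit Kronecker matrix; for the full data set only the factored form is feasible.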
  • 27. Improvements
      - Sparse data
      - CGKronRLS (semi-supervised learning)
      - Other pairwise relationships can be used
      - KronRLS-MKL (multiple kernel learning)
      - Use of Gaussian Interaction Profiles
      - Sequential model execution and storage
      - Boosting instead of bagging (sample-level optimization)
      - Making numpy/BLAS work on distributed GPUs
      - Employ a meta-learner rather than a voting classifier
      References:
      - Pahikkala T. Fast gradient computation for learning with tensor product kernels and sparse training labels. Structural, Syntactic, and Statistical Pattern Recognition (S+SSPR). Volume 8621 of Lecture Notes in Computer Science, pages 123–132. 2014.
      - Nascimento AC, Prudêncio RB, Costa IG. A multiple kernel learning algorithm for drug-target interaction prediction. BMC Bioinformatics. 2016 Dec;17(1):46.
      - Pahikkala T, Airola A. RLScore: regularized least-squares learners. The Journal of Machine Learning Research. 2016 Jan 1;17(1):7803-7.
      - Kashima H, Oyama S, Yamanishi Y, Tsuda K. On pairwise kernels: an efficient alternative and generalization analysis. Adv Data Min Knowl Disc. 2009; 5476:1030–7.
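One of the improvements listed is the use of Gaussian Interaction Profiles. The slide gives no formula, so the sketch below assumes the commonly used form, in which each row of the interaction matrix is a profile and the kernel is a Gaussian of the distance between profiles, with the bandwidth normalised by the mean squared profile norm. The function name and the `gamma_prime` parameter are illustrative.

```python
import numpy as np


def gip_kernel(Y, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of Y.

    Y: binary interaction matrix, one row per entity (e.g. per drug).
    """
    Y = np.asarray(Y, dtype=float)
    sq_norms = (Y ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("interaction matrix has no known interactions")
    gamma = gamma_prime / mean_sq
    # Squared Euclidean distances between all pairs of profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Y @ Y.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


# Hypothetical profiles: drugs 0 and 1 hit the same targets, drug 2 differs.
profiles = np.array([[1, 0, 1],
                     [1, 0, 1],
                     [0, 1, 0]])
K_gip = gip_kernel(profiles)
```

Such a kernel needs no chemical structure or sequence information, so it can be added alongside the structure-based kernels in a multiple kernel setting.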
  • 28. Using “compressed sensing” to support drug repurposing for chronic pancreatitis Prof. Aleksandar Poleksić Department of Computer Science University of Northern Iowa
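  • 29. Compressed sensing for ADR prediction
      Idea: factor $R_{m \times n}$ into the product of two lower-dimensional matrices, $R = FG'$.
  • 30. Logistic matrix factorization
      Idea: factor $R_{m \times n}$ into the product of two lower-dimensional matrices, $R = FG'$.
      Loss function:
      $$\sum_{i,j} w_{i,j}\left\{\ln\left(1 + e^{f_i g_j'}\right) - (r_{i,j} + q_{i,j})\, f_i g_j'\right\} + \lambda_F \|F\|_2^2 + \lambda_G \|G\|_2^2 + \lambda_M \sum_{i,j} m_{i,j} \|F(i,:) - F(j,:)\|_2^2 + \lambda_N \sum_{i,j} n_{i,j} \|G(i,:) - G(j,:)\|_2^2$$
      M, N: similarity matrices; Q: impute matrix; W: weight matrix; lambdas: tunable parameters; P: output probabilities
  • 31. Optimization
      Loss function (graph terms written as traces):
      $$\sum_{i,j} w_{i,j}\left\{\ln\left(1 + e^{f_i g_j'}\right) - (r_{i,j} + q_{i,j})\, f_i g_j'\right\} + \lambda_F \|F\|_2^2 + \lambda_G \|G\|_2^2 + \lambda_M \operatorname{tr}\left(F'(D_M - M)F\right) + \lambda_N \operatorname{tr}\left(G'(D_N - N)G\right)$$
      Partial derivatives, where $P$ has entries $p_{i,j} = 1/(1 + e^{-f_i g_j'})$ (the slide writes the ridge coefficient as a single $\lambda_r$; it is written here as $\lambda_F$, $\lambda_G$ to match the loss):
      $$\frac{\partial}{\partial F} = \left[W \odot \left(P - (R + Q)\right)\right] G + 2\lambda_F F + 2\lambda_M (D_M - M) F$$
      $$\frac{\partial}{\partial G} = \left[W^T \odot \left(P^T - (R^T + Q^T)\right)\right] F + 2\lambda_G G + 2\lambda_N (D_N - N) G$$
      Minimization algorithm: gradient descent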
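The loss and partial derivatives on this slide translate fairly directly into NumPy. The following is a minimal sketch, not the authors' implementation. Assumptions made here: the similarity matrices M and N are symmetric, D_M and D_N are the diagonal matrices of their row sums, the ridge term uses separate coefficients `lam_F` and `lam_G`, and the learning rate, rank and initialisation are arbitrary illustrative choices.

```python
import numpy as np


def laplacian(S):
    """Graph Laplacian D - S, where D holds the row sums of S on its diagonal."""
    return np.diag(S.sum(axis=1)) - S


def loss(F, G, R, Q, W, LM, LN, lam_F, lam_G, lam_M, lam_N):
    X = F @ G.T
    # np.logaddexp(0, x) computes ln(1 + e^x) in a numerically stable way.
    data = np.sum(W * (np.logaddexp(0.0, X) - (R + Q) * X))
    ridge = lam_F * np.sum(F ** 2) + lam_G * np.sum(G ** 2)
    graph = lam_M * np.trace(F.T @ LM @ F) + lam_N * np.trace(G.T @ LN @ G)
    return data + ridge + graph


def gradients(F, G, R, Q, W, LM, LN, lam_F, lam_G, lam_M, lam_N):
    P = 1.0 / (1.0 + np.exp(-(F @ G.T)))  # output probabilities
    E = W * (P - (R + Q))
    dF = E @ G + 2 * lam_F * F + 2 * lam_M * (LM @ F)
    dG = E.T @ F + 2 * lam_G * G + 2 * lam_N * (LN @ G)
    return dF, dG


def fit(R, Q, W, M, N, rank=2, steps=300, lr=0.01,
        lams=(0.1, 0.1, 0.1, 0.1), seed=0):
    """Plain gradient descent; returns the factors and the loss per step."""
    rng = np.random.default_rng(seed)
    F = 0.1 * rng.normal(size=(R.shape[0], rank))
    G = 0.1 * rng.normal(size=(R.shape[1], rank))
    LM, LN = laplacian(M), laplacian(N)
    history = []
    for _ in range(steps):
        history.append(loss(F, G, R, Q, W, LM, LN, *lams))
        dF, dG = gradients(F, G, R, Q, W, LM, LN, *lams)
        F = F - lr * dF
        G = G - lr * dG
    return F, G, history
```

After fitting, `1 / (1 + np.exp(-(F @ G.T)))` gives the predicted probability for every row-column pair, including those with no recorded association.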
  • 33. Q: Is a new chemical likely to cause hepatotoxicity?
  • 34. Q: Is a new chemical likely to cause a serious rare side effect?
  • 35. ADR prediction for candidate CP drugs: lacosamide ADR profile (http://gpubox.cs.uni.edu)
      Lacosamide SMILES: CC(=O)NC(COC)C(=O)NCC1=CC=CC=C1
      ADR name (CUI)              Prob
      Nausea (C0027497)           0.99
      Vomiting (C0042963)         0.97
      Asthenia (C0015672)         0.916
      Dizziness (C0012833)        0.912
      Headache (C0018681)         0.908
      Dry mouth (C0043352)        0.881
      Diarrhea (C0011991)         0.874
      Dermatitis (C0015230)       0.818
      Constipation (C0009806)     0.789
      Somnolence (C2830004)       0.738
      Tremor (C0040822)           0.733
  • 36. Candidate CP drugs: network prediction
      Apply compressed sensing on the drug-disease network* (network diagram: compound, disease; edges: treats, resembles):
      - 1552 compounds
      - 137 diseases
      - 755 known treatments
      Drug             Prob   Z-score
      Hyoscyamine      0.35   6.61
      Irinotecan       0.26   4.94
      Varenicline      0.20   3.74
      Octreotide       0.17   3.26
      Propantheline    0.17   3.19
      Citalopram       0.17   3.18
      Acamprosate      0.14   2.61
      Disulfiram       0.13   2.44
      Epirubicin       0.12   2.31
      Tamoxifen        0.11   2.08
      Doxorubicin      0.11   2.05
      Naltrexone       0.11   2.03
      Paclitaxel       0.10   1.90
      Erlotinib        0.10   1.85
      Topotecan        0.10   1.83
      Sorafenib        0.09   1.64
      Proguanil        0.08   1.42
      Metformin        0.07   1.30
      Telbivudine      0.06   1.13
      Orlistat         0.06   1.06
      * Himmelstein, D.S. & Baranzini, S.E. PLoS Comput Biol 11, e1004259 (2015).
  • 37. Lacosamide: network prediction
      Network diagram: gene, compound, disease; edges: treats, palliates, resembles
      Lacosamide-binds-gene-associates pancreatitis:
      CFTR (0.0806); ALB (0.0522); PTGS2 (0.0224); MPO (0.0192); CYP1A1 (0.0151); ACE (0.0088); ABCB1 (0.0086); FDX1 (0.0065); CXCL8 (0.0063); TNF (0.0039); AHR (0.0035); ADRB2 (0.0028); CA8 (0.0028); SLC12A1 (0.0027); BCHE (0.0017); ADRB1 (0.0015)
  • 38. Collaborators: Prof. Lei Xie, CUNY Graduate Center
      References:
      1. Poleksic, A., & Xie, L. (2018). Predicting serious rare adverse reactions of novel chemicals. Bioinformatics, 34(16), 2835-2842.
      2. Lim, H., Gray, P., Xie, L., & Poleksic, A. (2016). Improved genome-scale multi-target virtual screening via a novel collaborative filtering approach to cold-start problem. Scientific Reports, 6, 38860.
      3. Poleksic, A., & Xie, L. (2019). Database of Adverse Events Associated with Drugs and Drug Combinations, in review.
  • 39. Poll Question: In what other medical area should we run the next pre-competitive research exercise? A. Oncology B. Heart Disease C. Diabetes D. Obesity E. Some other unmet need (send
  • 40. Audience Q&A: Please use the Question function in GoToWebinar
  • 41. Upcoming Webinars
      1. Date TBD – July 2019: User Experience (UX) Design for AI
      2. Date TBD: Virtual Roundtable: Innovative Pathways through the FDA & EMEA (with the Westchester Biotech Project)
      3. Planning: Ethics and AI
      Please suggest other topics

Editor's notes

  1. - No absolute line between disease modelling and target identification. Next method illustrates this.
     - This tool was developed by Dr Bruce Aronow and his research group.
     - The Cell Atlas is an incredible project; this tool builds upon it to gain a greater understanding of disease mechanism.
     - TODO: Labels bigger – Y AXIS cell type gene modules