1
David Amar, Tom Hait, and Ron Shamir
Blavatnik School of Computer Science
Tel Aviv University
2
Comparative genomics
 Standard expression experiments: cases vs. controls ->
differential genes -> interpretation
 Problems
 Small number of samples
 Non-specific signal
 Interpretation of a gene set / gene ranking
 Goal: find specific changes for a tested disease
 E.g., an up-regulated pathway
 Crucial for clinical studies
3
Previous integrative classification studies
 Huang et al. PNAS 2010 (9,160 samples); Schmid et al. PNAS 2012 (3,030); Lee et al. Bioinformatics 2013 (~14,000)
 Multilabel classification
 Global expression patterns
 Only 1-3 platforms
 Many datasets were removed from GEO
 No “healthy” class (Huang); no diseases (Lee)
 Pathprint (Altschuler et al. 2013)
 Use pathways
 Tissue classification (as in Lee et al.)
4
Integrating pathways and molecular
profiles
 Enrichment tests
 Improves interpretability
 GSEA, GSA
 Rank-based
 Higher statistical power
 Classification
 Extract pathway features
 Example: given a pathway, remove non-differential genes
 Not clear whether prediction performance improves compared to using genes (Staiger et al. 2013)
5
6
[Figure: method overview]
 Pathways: KEGG, Reactome, Biocarta, NCI
 Expression profiles: GSE, GDS, TCGA
 Sample labels: disease terms from dataset/sample descriptions
 Single-sample, single-pathway analysis: for each pathway, compute the mean and SD
 Output: pathway-feature matrix XP (samples × pathway features) and label matrix Y
[Figure: single-sample analysis]
 Platform data -> ranked genes/transcripts of sample j (from low to high expression)
 Weighted ranks: $W_i = i\,e^{-i/k}$
 Standardized profile
7
Single sample analysis
 Input: an expression profile of a sample
 A vector of real values for each patient
 Step 1: rank the genes
 Step 2: calculate a score for each gene as a function of the rank of gene g in sample s and the total number of ranked genes (Yang et al. 2012, 2013)
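The two steps can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a score of the form r * exp(-r / n), where r is the gene's rank and n the total number of ranked genes, and it assumes ascending ranks (rank 1 = lowest expression). The function name `weighted_rank_scores` and the example genes are hypothetical; the exact formula and ranking direction in Yang et al. may differ.

```python
import math


def weighted_rank_scores(expression):
    """Map each gene to a rank-based score for a single sample.

    `expression` is a dict {gene: expression value} for one sample.
    """
    # Step 1: rank the genes. Assumption: rank 1 = lowest expression,
    # so highly expressed genes receive the largest ranks.
    ordered = sorted(expression, key=expression.get)
    n = len(ordered)  # total number of ranked genes

    # Step 2: score each gene from its rank r and the total n.
    # Assumed form: r * exp(-r / n).
    return {gene: r * math.exp(-r / n)
            for r, gene in enumerate(ordered, start=1)}


# Hypothetical three-gene sample.
sample = {"TP53": 7.2, "BRCA1": 3.1, "MYC": 9.8}
scores = weighted_rank_scores(sample)
```

Because the score depends only on ranks, it is comparable across samples measured on different platforms, which is presumably why a rank-based transform is used here.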
8
Pathway features
 1723 pathways in total
 Covering 7842 genes
 Mean size: 36.35 (median 15)
 Score all genes that are in the pathway databases
 Pathway statistics:
 Mean score
 Standard deviation
 Skewness
 KS test
Pathway DBs: KEGG, Reactome, Biocarta, NCI
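A rough sketch of how the four pathway statistics could be computed for one sample and one pathway, using only the Python standard library. The slide does not say what the KS test compares; this sketch assumes pathway genes versus all other scored genes, and returns the KS statistic rather than a p-value. The names `ks_statistic` and `pathway_features` are hypothetical.

```python
import bisect
import statistics


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between ECDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def pathway_features(scores, pathway_genes):
    """Summarize one pathway in one sample.

    `scores` is {gene: score}; `pathway_genes` is a set of gene names.
    Assumes at least one scored gene inside and outside the pathway.
    """
    inside = [v for g, v in scores.items() if g in pathway_genes]
    outside = [v for g, v in scores.items() if g not in pathway_genes]
    mean = statistics.fmean(inside)
    sd = statistics.pstdev(inside)
    # Population skewness: third central moment divided by sd cubed.
    if sd > 0:
        skew = sum((v - mean) ** 3 for v in inside) / len(inside) / sd ** 3
    else:
        skew = 0.0
    return {"mean": mean, "sd": sd, "skewness": skew,
            # Assumption: KS compares pathway genes with all other genes.
            "ks": ks_statistic(inside, outside)}


feats = pathway_features(
    {"a": 1.0, "b": 2.0, "c": 3.0, "d": 10.0, "e": 11.0},
    {"a", "b", "c"},
)
```

Applying `pathway_features` to every pathway and concatenating the results would give one feature row per sample.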
9
Patient labels
 Unite ~180 datasets, >14,000 samples
 Public databases contain ‘free text’
 Problem: automatic mapping fails. Example:
 GDS4358: “lymph-node biopsies from classic Hodgkins lymphoma HIV- patients before ABVD chemotherapy”
 MetaMap top score: “HIV infections”
 Solution: manual analysis
 Read descriptions and papers
10
Current microarray data
 Data from GEO
 13,314 samples
 17 platforms
 Sample annotation
 Ignore terms with fewer than
 100 samples
 5 datasets
 48 disease terms
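The term filter can be sketched as below, assuming the annotations are available as (sample, dataset, term) triples; that input layout and the function name `select_terms` are assumptions made for illustration. The thresholds default to the values on the slide.

```python
from collections import defaultdict


def select_terms(annotations, min_samples=100, min_datasets=5):
    """Keep disease terms supported by enough samples and datasets.

    `annotations` is an iterable of (sample_id, dataset_id, term) triples.
    """
    samples = defaultdict(set)
    datasets = defaultdict(set)
    for sample_id, dataset_id, term in annotations:
        samples[term].add(sample_id)
        datasets[term].add(dataset_id)
    return sorted(
        term for term in samples
        if len(samples[term]) >= min_samples
        and len(datasets[term]) >= min_datasets
    )


# Toy input with lowered thresholds so the effect is visible.
toy = [
    ("s1", "d1", "cancer"), ("s2", "d2", "cancer"), ("s3", "d2", "cancer"),
    ("s4", "d1", "asthma"), ("s5", "d1", "asthma"), ("s6", "d1", "asthma"),
]
kept = select_terms(toy, min_samples=3, min_datasets=2)
```

Requiring several datasets per term matters for the validation scheme later in the talk, where whole datasets are held out.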
[Figure: pathway-feature matrix XP (samples × pathway features) and label matrix Y (samples × disease terms, entries in {0,1})]
11
12
Multi-label classification algorithms
 Learn a single classifier for each disease
 Ignore class dependencies
 Adaptation: Bayesian Correction
 Learn single classifiers
 Correct errors using the DO DAG
 Transformation: use the label power
sets and learn a multiclass model
 Using RF: multi-label trees
 Was better than most approaches in an
experimental study (Madjarov et al. 2012)
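To illustrate the transformation approach, the sketch below turns each distinct label combination into one class of a multiclass problem (the label power set). It is a generic illustration of the idea, not the multi-label tree method used in the talk; `to_powerset_classes` is a hypothetical name.

```python
def to_powerset_classes(Y):
    """Encode each distinct row of a 0/1 label matrix as one class id.

    Returns (class_ids, decode), where decode maps a class id back to
    the label combination it stands for.
    """
    class_of = {}
    class_ids = []
    for row in Y:
        key = tuple(row)
        # New combinations get the next free id, in order of appearance.
        class_ids.append(class_of.setdefault(key, len(class_of)))
    decode = {cid: key for key, cid in class_of.items()}
    return class_ids, decode


# Rows are samples, columns are disease terms.
Y = [[1, 0], [0, 1], [1, 0], [1, 1]]
class_ids, decode = to_powerset_classes(Y)
```

A multiclass learner trained on `class_ids` predicts a whole label combination at once, so label dependencies are kept, at the cost of one class per combination seen in training.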
13
How to validate a classifier?
 Use leave-dataset out cross-validation
 Global AUC scores: each prediction Pij vs the correct label Yij
 Disease based AUC scores: consider each column separately
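A small sketch of the validation scheme, under the assumption that P and Y are lists of rows (one row per sample, one column per disease) and that each sample carries a dataset identifier. The function names are hypothetical, and the pairwise AUC is written for clarity rather than speed.

```python
def leave_dataset_out(dataset_ids):
    """Yield (held_out, train_idx, test_idx), one split per dataset."""
    for held_out in sorted(set(dataset_ids)):
        test = [i for i, d in enumerate(dataset_ids) if d == held_out]
        train = [i for i, d in enumerate(dataset_ids) if d != held_out]
        yield held_out, train, test


def auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def global_auc(P, Y):
    """Pool every prediction Pij against its label Yij."""
    return auc([p for row in P for p in row], [y for row in Y for y in row])


def disease_aucs(P, Y):
    """One AUC per column (disease term)."""
    return [auc([row[j] for row in P], [row[j] for row in Y])
            for j in range(len(Y[0]))]


splits = list(leave_dataset_out(["d1", "d1", "d2", "d3"]))
P = [[0.9, 0.2], [0.1, 0.8], [0.7, 0.4]]
Y = [[1, 0], [0, 1], [1, 0]]
```

Holding out whole datasets keeps samples from the same study out of both training and test sets, so batch effects cannot inflate the scores.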
14
[Figure: on the test set, the label matrix Y (samples × disease terms, entries in {0,1}) next to the output of a multi-label learner, the matrix P of probabilities in [0,1]]
A problem (!)
 What is in the background?
 For a disease D define:
 Positives: disease samples
 Negatives: direct controls
 Background controls
15
Example: 500 positives, 500 negatives, 10,000 BGCs
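A quick calculation with the example counts shows why the choice of background matters: the fraction of positives, which is also the precision expected from a random ranking, drops sharply once background controls are counted.

```python
positives, negatives, background = 500, 500, 10_000

# Fraction of positives when only direct controls are counted.
prevalence_direct = positives / (positives + negatives)

# Fraction of positives when background controls are counted as well.
prevalence_all = positives / (positives + negatives + background)

print(f"{prevalence_direct:.3f}")  # 0.500
print(f"{prevalence_all:.3f}")     # 0.045
```

A score computed against all samples therefore answers a different question from one computed against direct controls only.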
Multistep validation
16
 It is recommended to use several scores (Lee et al. 2013)
 Measure global AUPR
 For each disease we calculate three scores
Measure | Used (additional) information
AUPR: check separation between positives and all others | Sick vs. not sick
ROC: test for separation between positives and negatives | Direct use of negatives
Meta-analysis p-value: calculate the overall separation significance within the original datasets | Mapping of samples to datasets
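The first two per-disease measures can be sketched as follows, assuming each sample of a disease is tagged "pos", "neg" (direct control) or "bg" (background control). The role labels and function names are hypothetical, average precision is used as a common approximation of the area under the precision-recall curve, and the meta-analysis p-value is left out because the slides do not specify how it is combined.

```python
def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for k, (_, is_pos) in enumerate(ranked, start=1):
        if is_pos:
            hits += 1
            total += hits / k
    return total / hits if hits else None


def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def disease_scores(scores, roles):
    """AUPR uses all samples; ROC uses positives and direct negatives."""
    aupr = average_precision(scores, [r == "pos" for r in roles])
    direct = [(s, r == "pos") for s, r in zip(scores, roles) if r != "bg"]
    roc = roc_auc([s for s, _ in direct], [y for _, y in direct])
    return {"aupr": aupr, "roc": roc}


result = disease_scores([0.9, 0.8, 0.7, 0.6], ["pos", "bg", "pos", "neg"])
```

In this toy case the ROC score is perfect while the AUPR is not, because a background sample ranks above one of the positives.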
Performance results
17
[Figure: per-disease positives vs. negatives ROC and AUPR scores; filled boxes mark diseases with meta-analysis q-value < 0.001]
Performance results
18
8.5% improvement in
recall, 12% in precision,
compared to Huang et al.
Validation on RNA-Seq
Data from TCGA: 1,699 samples
19
Pathway-Disease network
 Steps (for each of the selected diseases):
1. Disease-pathway edges
   1. RF importance: select the top features
   2. Test for disease relevance
2. Add edges between diseases
   1. Use the DO structure
3. Add edges between pathways
   1. Based on significant overlap in genes
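Step 3 could be implemented along these lines. The slides do not name the overlap test, so this sketch assumes a one-sided hypergeometric test; `overlap_pvalue`, `pathway_edges` and the threshold `alpha` are hypothetical, and no multiple-testing correction is applied. The 7,842 genes covered by the pathway databases would be a natural universe size.

```python
from itertools import combinations
from math import comb


def overlap_pvalue(universe, size_a, size_b, overlap):
    """P(X >= overlap) for a hypergeometric overlap of two gene sets."""
    upper = min(size_a, size_b)
    # Exact integer arithmetic; a single division at the end.
    tail = sum(comb(size_a, i) * comb(universe - size_a, size_b - i)
               for i in range(overlap, upper + 1))
    return tail / comb(universe, size_b)


def pathway_edges(pathways, universe, alpha=0.001):
    """Connect pathway pairs whose gene overlap is significant.

    `pathways` maps a pathway name to its set of genes.
    """
    edges = []
    for a, b in combinations(sorted(pathways), 2):
        shared = len(pathways[a] & pathways[b])
        p = overlap_pvalue(universe, len(pathways[a]), len(pathways[b]), shared)
        if p < alpha:
            edges.append((a, b, p))
    return edges


# Toy gene sets over a universe of 100 genes.
toy = {
    "A": set(range(10)),
    "B": set(range(5, 15)),
    "C": set(range(50, 60)),
}
edges = pathway_edges(toy, universe=100, alpha=0.01)
```

With many pathway pairs, a correction such as a false discovery rate threshold would normally be applied before drawing edges.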
20
Cancer network
[Figure: network; legend: Down, Up]
Cardiovascular disease
23
[Figure: network; legend: Down, Up]
Gastric cancers
Summary
 Large scale integration
 Multi-label learning
 Careful validation
 Pathway-based features as biomarkers
 Summary of the results in a network
 Currently
 Add genes: overcome missing values
 Shows improvement in validation
25
Acknowledgements
 Ron Shamir
 Tom Hait
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
chemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdfchemical bonding Essentials of Physical Chemistry2.pdf
chemical bonding Essentials of Physical Chemistry2.pdf
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 

NetBioSIG2014-Talk by David Amar

  • 1. 1 David Amar, Tom Hait, and Ron Shamir Blavatnik School of Computer Science Tel Aviv University
  • 2. 2
  • 3. Comparative genomics  Standard expression experiments: cases vs. controls -> differential genes -> interpretation  Problems  Small number of samples  Non-specific signal  Interpretation of a gene set/ gene ranking  Goal: find specific changes for a tested disease  E.g., an up-regulated pathway  Crucial for clinical studies 3
  • 4. Previous integrative classification studies  Huang et al. 2010 PNAS (9,160 samples); Schmid et al. PNAS 2012 (3,030); Lee et al. Bioinformatics 2013 (~14,000)  Multilabel classification  Global expression patterns  Only 1-3 platforms  Many datasets were removed from GEO  No “healthy” class (Huang); no diseases (Lee)  Pathprint (Altschuler et al. 2013)  Use pathways  Tissue classification (as in Lee et al.) 4
  • 5. Integrating pathways and molecular profiles  Enrichment tests  Improves interpretability  GSEA/GSA  Rank based  Higher statistical power  Classification  Extract pathway features  Example: given a pathway, remove non-differential genes  Not clear if prediction performance improves compared to using genes (Staiger et al. 2013) 5
  • 6. 6
  • 7. [Workflow diagram] Inputs: pathways (KEGG, Reactome, Biocarta, NCI); expression profiles (GSE, GDS, TCGA) with platform data; sample labels (disease, dataset/sample description). Single sample analysis: standardized profile of sample j (low to high expression), ranked genes/transcripts, weighted ranks $W_i = i\,e^{-i/k}$. Single sample - single pathway analysis: for each pathway, mean and SD. Outputs: $X_P$ (samples × pathway features) and $Y$ (sample labels). 7
  • 8. Single sample analysis  Input: an expression profile of a sample  A vector of real values for each patient  Step 1: rank the genes  Step 2: calculate a score for each gene Rank of gene g in sample s Total number of ranked genes (Yang et al. 2012,2013) 8
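The two steps of the single sample analysis can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-gene score has the form r·exp(−r/N), where r is the rank of gene g in sample s and N is the total number of ranked genes. The ranking direction (rank 1 for the lowest expression), the lack of tie handling, and the name `weighted_rank_scores` are all assumptions.

```python
import math

def weighted_rank_scores(expression):
    """Score every gene of one sample from its rank alone.

    expression: dict mapping gene name -> expression value for one sample.
    Assumption: rank 1 is the lowest-expressed gene, and the score is
    rank * exp(-rank / N), with N the number of ranked genes.
    Ties are broken arbitrarily by the sort order (an assumption).
    """
    n = len(expression)
    # Step 1: rank the genes by expression (ascending).
    ordered = sorted(expression, key=expression.get)
    # Step 2: turn each rank into a weighted score.
    return {gene: rank * math.exp(-rank / n)
            for rank, gene in enumerate(ordered, start=1)}

# Toy sample with three genes (values are made up for illustration).
sample = {"TP53": 1.0, "MYC": 5.0, "EGFR": 3.0}
scores = weighted_rank_scores(sample)
```

Because the score depends only on within-sample ranks, it does not depend on the measurement scale of the platform, which is what makes profiles from different platforms comparable under this kind of scheme.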
  • 9. Pathway features  1723 pathways in total  Covering 7842 genes  Mean size: 36.35 (median 15)  Score all genes that are in the pathway databases  Pathway statistics:  Mean score  Standard deviation  Skewness  KS test Pathway DBs KEGG Reactome Biocarta NCI 9
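A rough sketch of how the four pathway statistics could be computed from per-gene scores of a single sample. The slide does not say what the KS test compares; here it is assumed to compare the scores of pathway genes against the scores of all other scored genes, and only the KS statistic (not a p-value) is computed. All function names are hypothetical.

```python
import bisect
import statistics

def skewness(values):
    """Population skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return 0.0
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    return max(abs(bisect.bisect_right(a, x) / len(a)
                   - bisect.bisect_right(b, x) / len(b))
               for x in points)

def pathway_features(gene_scores, pathway_genes):
    """Features of one pathway in one sample.

    Assumes at least two scored genes inside the pathway and at least
    one scored gene outside it.
    """
    inside = [s for g, s in gene_scores.items() if g in pathway_genes]
    outside = [s for g, s in gene_scores.items() if g not in pathway_genes]
    return {
        "mean": statistics.fmean(inside),
        "sd": statistics.stdev(inside),
        "skewness": skewness(inside),
        "ks": ks_statistic(inside, outside),  # assumption: pathway vs. rest
    }

# Toy example: three pathway genes with low scores, two outside with high scores.
gene_scores = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 10.0, "g5": 11.0}
features = pathway_features(gene_scores, {"g1", "g2", "g3"})
```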
  • 10. Patient labels  Unite ~180 datasets, >14,000 samples  Public databases contain ‘free text’  Problem: automatic mapping fails, example:  GDS4358: “lymph-node biopsies from classic Hodgkins lymphoma HIV- patients before ABVD chemotherapy”  MetaMap top score: “HIV infections”  Solution: manual analysis  Read descriptions and papers 10
  • 11. Current microarray data  Data from GEO  13,314 samples  17 platforms  Sample annotation  Ignore terms with fewer than 100 samples or 5 datasets  48 disease terms [Figure: $X_P$, samples × pathway features; $Y$, samples × disease terms, entries in {0,1}] 11
  • 12. 12
  • 13. Multi-label classification algorithms  Learn a single classifier for each disease  Ignore class dependencies  Adaptation: Bayesian Correction  Learn single classifiers  Correct errors using the DO DAG  Transformation: use the label power sets and learn a multiclass model  Using RF: multi-label trees  Was better than most approaches in an experimental study (Madjarov et al. 2012) 13
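The label power set transformation mentioned above can be illustrated with a short sketch: each distinct combination of disease labels becomes one class of an ordinary multiclass problem. The function name and the toy label matrix are hypothetical, and the downstream multiclass learner is omitted.

```python
def label_powerset(Y):
    """Map each row of a binary label matrix to a single class id.

    Y: list of rows, each a list of 0/1 labels (one column per disease term).
    Returns (encoded, classes), where classes maps label tuple -> class id.
    """
    classes = {}
    encoded = []
    for row in Y:
        key = tuple(row)
        # Assign the next free id the first time a label combination is seen.
        encoded.append(classes.setdefault(key, len(classes)))
    return encoded, classes

# Toy label matrix: 4 samples, 2 disease terms.
Y = [[1, 0], [0, 1], [1, 0], [1, 1]]
encoded, classes = label_powerset(Y)
```

The number of classes can grow quickly with the number of labels, which is one practical reason to compare this transformation against the single-classifier-per-disease approach.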
  • 14. How to validate a classifier?  Use leave-dataset-out cross-validation  Global AUC scores: each prediction Pij vs. the correct label Yij  Disease-based AUC scores: consider each column separately [Figure: the output of a multi-label learner on the test set; Y, samples × disease terms, entries in {0,1}; P, samples × disease terms, probabilities in [0,1]] 14
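A minimal sketch of this validation scheme, assuming each sample carries a dataset identifier. Whole datasets are held out in turn, and AUC is computed either over all (sample, disease) entries at once or per disease column. The AUC here is the plain pairwise (Mann-Whitney) form; all names and the toy matrices are hypothetical.

```python
def leave_dataset_out(dataset_ids):
    """Yield (held_out_dataset, train_indices, test_indices), one fold per dataset."""
    for held_out in sorted(set(dataset_ids)):
        test = [i for i, d in enumerate(dataset_ids) if d == held_out]
        train = [i for i, d in enumerate(dataset_ids) if d != held_out]
        yield held_out, train, test

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def global_auc(Y, P):
    """Every prediction P[i][j] against its label Y[i][j], pooled."""
    return roc_auc([y for row in Y for y in row], [p for row in P for p in row])

def per_disease_auc(Y, P):
    """One AUC per column (disease term)."""
    return [roc_auc([row[j] for row in Y], [row[j] for row in P])
            for j in range(len(Y[0]))]

folds = list(leave_dataset_out(["GSE1", "GSE1", "GSE2", "GSE3"]))

# Toy test-set output: 4 samples, 2 disease terms.
Y = [[1, 0], [0, 1], [1, 0], [0, 0]]
P = [[0.9, 0.2], [0.6, 0.5], [0.8, 0.3], [0.7, 0.1]]
g_auc = global_auc(Y, P)
d_auc = per_disease_auc(Y, P)
```

In this toy example each disease column is separated perfectly, yet the pooled score is lower because the two columns are scored on different scales, which is why the two views are reported separately.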
  • 15. A problem (!)  What is in the background?  For a disease D define:  Positives: disease samples  Negatives: direct controls  Background controls 15 Example: 500 positives 500 negatives 10000 BGCs Y P
  • 16. Multistep validation  It is recommended to use several scores (Lee et al. 2013)  Measure global AUPR  For each disease we calculate three scores, each using additional information: (1) AUPR: checks separation between positives and all others (uses: sick vs. not sick); (2) ROC: tests for separation between positives and negatives (uses: the direct negatives); (3) Meta-analysis p-value: the overall significance of the separation within the original datasets (uses: the mapping of samples to datasets) 16
  • 17. Performance results [Figure: AUPR and positives vs. negatives ROC; filled boxes mark meta-analysis q-value < 0.001] 17
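The first two per-disease scores can be sketched as follows, assuming every sample is tagged as a positive, a direct negative, or a background control for the disease in question. AUPR is approximated by average precision, and the meta-analysis p-value is not covered. The tags, names, and toy numbers are hypothetical.

```python
def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else None

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def disease_scores(sample_types, scores):
    """sample_types: 'pos', 'neg' (direct control) or 'bg' (background control)."""
    # AUPR: positives against everything else, background controls included.
    aupr = average_precision([int(t == "pos") for t in sample_types], scores)
    # ROC: positives against direct negatives only; background controls dropped.
    kept = [(int(t == "pos"), s) for t, s in zip(sample_types, scores) if t != "bg"]
    roc = roc_auc([y for y, _ in kept], [s for _, s in kept])
    return {"aupr": aupr, "roc": roc}

types = ["pos", "pos", "neg", "neg", "bg", "bg"]
preds = [0.9, 0.6, 0.5, 0.2, 0.8, 0.7]
result = disease_scores(types, preds)
```

Here the positives are perfectly separated from their direct negatives, but two high-scoring background controls pull the AUPR down, so the two scores answer different questions.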
  • 18. Performance results 18 8.5% improvement in recall, 12% in precision, compared to Huang et al.
  • 19. Validation on RNA-Seq Data from TCGA: 1,699 samples 19
  • 20. Pathway-Disease network  Steps (for each of the selected diseases): 1. Disease-pathway edges: (a) RF importance: select the top features; (b) test for disease relevance. 2. Add edges between diseases: use the DO structure. 3. Add edges between pathways: based on significant overlap in genes. 20
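The third step, edges between pathways with significantly overlapping gene sets, could be sketched with a hypergeometric tail test. The slide does not name the test or the threshold, so the hypergeometric test, the `alpha` cutoff, the lack of multiple-testing correction, and all names below are assumptions.

```python
from itertools import combinations
from math import comb

def overlap_pvalue(set_a, set_b, universe_size):
    """P(overlap >= observed) when set_b is drawn at random from the universe."""
    a, b, k = len(set_a), len(set_b), len(set_a & set_b)
    total = comb(universe_size, b)
    tail = sum(comb(a, i) * comb(universe_size - a, b - i)
               for i in range(k, min(a, b) + 1))
    return tail / total

def pathway_edges(pathways, universe_size, alpha=0.05):
    """Return (name1, name2, p) for pathway pairs with p-value below alpha.

    pathways: dict mapping pathway name -> set of gene identifiers.
    No multiple-testing correction is applied in this sketch.
    """
    edges = []
    for (n1, g1), (n2, g2) in combinations(sorted(pathways.items()), 2):
        p = overlap_pvalue(g1, g2, universe_size)
        if p < alpha:
            edges.append((n1, n2, p))
    return edges

# Toy check: two 3-gene sets sharing 2 genes in a universe of 10 genes.
p_small = overlap_pvalue({1, 2, 3}, {2, 3, 4}, 10)

toy = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 4}, "C": {50, 51}}
edges = pathway_edges(toy, universe_size=100)
```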
  • 24. Summary  Large scale integration  Multi-label learning  Careful validation  Pathway based features as biomarkers  Summary of the results in a network  Currently  Add genes: overcome missing values  Shows improvement in validation 25