Speaker: Enrico Glaab, Luxembourg Centre for Systems Biomedicine
EnrichNet: network-based gene set enrichment analysis
Authors: Enrico Glaab, Anaïs Baudot, Natalio Krasnogor, Reinhard Schneider, Alfonso Valencia
1
Motivation
Problem:
How to identify and score functional associations between a gene/protein set of
interest (target set) and a collection of known, annotated gene/protein sets
(reference sets), representing cellular pathways, processes or complexes?
Figure: experimentally derived gene/protein set (target set) and functional annotation/pathway databases (reference sets)
2
Previous approaches
Previous gene/protein set enrichment analysis techniques:
Three types of enrichment analysis approaches (see Huang et al., Nucleic Acids Res, 2009):
• Over-representation analysis (ORA): generally applicable, but scores are often not discriminative and rankings are difficult to interpret biologically
• Gene Set Enrichment Analysis (GSEA): quantitative measurements required; molecular network neighbourhood not taken into account
• Integrative and modular enrichment analysis (MEA): mostly uses clustering of annotations or data from ontology graphs rather than molecular networks
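For orientation, classical ORA is commonly scored with a one-sided hypergeometric (Fisher-type) test on the overlap between target and reference set. The sketch below is a minimal illustration of that baseline, not code from EnrichNet; the function name and the toy set sizes are assumptions chosen for the example.

```python
from math import comb


def ora_p_value(universe_size, reference_size, target_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    X is the number of reference-set members found when drawing
    `target_size` genes at random from `universe_size` genes.
    """
    max_overlap = min(reference_size, target_size)
    total = comb(universe_size, target_size)
    tail = sum(
        comb(reference_size, k)
        * comb(universe_size - reference_size, target_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy numbers: 10 genes in total, a reference set of 4,
# a target set of 3, and 2 shared genes.
p_shared = ora_p_value(10, 4, 3, 2)

# With no shared genes the p-value is 1, however close the two sets
# may be in a molecular network.
p_disjoint = ora_p_value(10, 4, 3, 0)
```

The second call illustrates the limitation named above: an overlap-only score cannot separate two non-overlapping sets that interact densely from two that are unrelated.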
GOAL: Maximally exploit functional information from a molecular interaction
network for association scoring and visualization
3
EnrichNet: Design principles (1)
Network association measure for mapped datasets:
Account for distances in a molecular network as well as the multiplicity and density of interactions between
the datasets of interest (use random walk distances instead of shortest-path distances)
Example sub-networks (figure): Case 1: dense interconnections; Case 2: sparse interconnections
Legend: reference node, target set node, other nodes
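The dense and sparse cases can be mimicked with two toy graphs in which a target node T and a reference node R are two steps apart. The sketch below is illustrative only (node names and graphs are made up): it uses breadth-first search to show that the shortest-path distance is the same in both cases, while the number of distinct shortest paths, one simple proxy for multiplicity, differs.

```python
from collections import deque


def shortest_path_stats(adjacency, source, target):
    """Return (length, number) of shortest paths from source to target."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                # First visit: shortest distance found, inherit path count.
                dist[neighbour] = dist[node] + 1
                count[neighbour] = count[node]
                queue.append(neighbour)
            elif dist[neighbour] == dist[node] + 1:
                # Another shortest route into an already discovered node.
                count[neighbour] += count[node]
    return dist.get(target), count.get(target, 0)


# Case 1 (dense): T reaches R through three intermediate nodes.
dense = {
    "T": ["a", "b", "c"],
    "a": ["T", "R"], "b": ["T", "R"], "c": ["T", "R"],
    "R": ["a", "b", "c"],
}
# Case 2 (sparse): only one intermediate node links T and R.
sparse = {
    "T": ["a", "b", "c"],
    "a": ["T", "R"], "b": ["T"], "c": ["T"],
    "R": ["a"],
}

dense_stats = shortest_path_stats(dense, "T", "R")
sparse_stats = shortest_path_stats(sparse, "T", "R")
```

Both graphs give a distance of 2, so a shortest-path measure cannot tell the cases apart; a random walk visits R more often in the dense graph.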
4
EnrichNet: Design principles (2)
Handling of overlapping nodes and long distance outliers:
• Overlapping nodes and node pairs with small distances are expected to be over-represented in
functionally associated datasets: assign higher weight to short-distance node pairs
• Account for outlier nodes: assign lower weight to long-distance node pairs
Example sub-network (figure): overlap nodes (high weight) and outlier nodes (low weight)
Legend: pathway node, target set node, other nodes
5
EnrichNet: Procedure
Input:
• 10 or more human gene or protein identifiers of interest (= target set)
• Selection of a reference database (gene sets from GO, KEGG, BioCarta, Reactome,
WikiPathways, PID, etc.)
Processing (details on next slides):
• Target and reference datasets are mapped onto a human genome-scale molecular network
(default: STRING confidence-weighted PPI network, optional: user-defined network)
• Random walk with restart (RWR) algorithm applied to compute node-specific association scores
between mapped target set and reference sets
• Integration of scores for each reference set and comparison against background model
Output:
• Ranking table of reference pathways with association scores (optional: 60 tissue-specific scores)
• For each reference dataset: Interactive sub-network visualization of the association with target set
6
EnrichNet: Random walk with restart (RWR)
RWR relevance scoring (Tong et al., 2006):
Simulate random walks via iterative matrix
multiplications:
$p^{t+1} = (1 - r) A p^{t} + r p^{0}$
• A := column-normalized network adjacency matrix
• r := restart probability (here: r = 0.9)
• $p_i^{t}$ := probability that the walker is at node i at time t
• $p^{0}$ := initial probability vector
Result: a vector of node relevance scores for
each reference pathway (converted to
distance scores and compared against a
background model, see next slide)
Example network (figure): target set, pathway 1, pathway 2, target/pathway overlap
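The RWR iteration can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the EnrichNet implementation: the adjacency matrix is column-normalized so that the scores stay a probability vector, the walker restarts uniformly at the target set nodes, and the helper names, tolerance and toy networks are hypothetical.

```python
def build_adjacency(n_nodes, edges):
    """Symmetric weighted adjacency matrix from (u, v, weight) triples."""
    matrix = [[0.0] * n_nodes for _ in range(n_nodes)]
    for u, v, weight in edges:
        matrix[u][v] = weight
        matrix[v][u] = weight
    return matrix


def column_normalize(matrix):
    """Scale every column to sum to 1 (all-zero columns are left as zeros)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        [matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(matrix, start_nodes, restart=0.9,
                             tol=1e-10, max_iter=1000):
    """Iterate p(t+1) = (1 - r) A p(t) + r p(0) until the change is tiny."""
    n = len(matrix)
    a = column_normalize(matrix)
    p0 = [0.0] * n
    for node in start_nodes:  # assumed: uniform restart over target nodes
        p0[node] = 1.0 / len(start_nodes)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(a[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        change = sum(abs(x - y) for x, y in zip(nxt, p))
        p = nxt
        if change < tol:
            break
    return p


# Toy networks: node 0 = target node, nodes 1-3 = intermediates,
# node 4 = reference (pathway) node.
dense = build_adjacency(5, [(0, 1, 1.0), (0, 2, 1.0), (0, 3, 1.0),
                            (1, 4, 1.0), (2, 4, 1.0), (3, 4, 1.0)])
sparse = build_adjacency(5, [(0, 1, 1.0), (0, 2, 1.0), (0, 3, 1.0),
                             (1, 4, 1.0)])

dense_scores = random_walk_with_restart(dense, start_nodes=[0])
sparse_scores = random_walk_with_restart(sparse, start_nodes=[0])
```

In both toy networks the reference node is two steps from the target node, yet its relevance score is higher in the densely connected case. Edge weights other than 1.0 could stand in for interaction confidence scores.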
7
EnrichNet: Background model
Pathway-based background model:
• Gene/protein sets for the background model should have similar connectivity properties to the
pathway-representing reference nodes (not the case for random matched-size node sets):
use the score distribution across the entire reference database as background
• Apply the Xd-distance (Olmea et al., 1999) to compare foreground against background distances:
distance-dependent weighting (accounts for long-distance and high-degree outliers)
(n = number of equally spaced distance bins, default: n = 10;
tissue-specific scores: pre-filter nodes by tissue label)
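The slide does not give the Xd formula, so the following is only a sketch of one common reading of an Xd-style score: distances are binned into n equally spaced bins, and the difference between the foreground and background fraction in each bin is divided by the bin's upper distance limit, so that short-distance bins weigh more. The bin limits, the normalization by n, the function names and the toy distances are all assumptions.

```python
def bin_fractions(distances, n_bins, max_distance):
    """Fraction of distances in each of n_bins equal-width bins.

    Assumes a non-empty list of non-negative distances; values at or
    above max_distance are placed in the last bin.
    """
    width = max_distance / n_bins
    counts = [0] * n_bins
    for d in distances:
        index = min(int(d / width), n_bins - 1)
        counts[index] += 1
    return [c / len(distances) for c in counts]


def xd_score(foreground, background, n_bins=10, max_distance=1.0):
    """Xd-style score: sum over bins of (P_fg - P_bg) / (d_i * n_bins),
    where d_i is the upper limit of bin i (assumed form)."""
    fg = bin_fractions(foreground, n_bins, max_distance)
    bg = bin_fractions(background, n_bins, max_distance)
    width = max_distance / n_bins
    return sum(
        (f - b) / ((i + 1) * width * n_bins)
        for i, (f, b) in enumerate(zip(fg, bg))
    )


# Hypothetical background: one distance in the middle of each bin.
background = [0.05 + 0.1 * k for k in range(10)]
close_set = [0.05] * 10  # all node pairs at short distance
far_set = [0.95] * 10    # all node pairs at long distance

score_close = xd_score(close_set, background)
score_far = xd_score(far_set, background)
score_same = xd_score(background, background)
```

Under these assumptions a set concentrated at short distances scores positive, a set concentrated at long distances scores negative, and a set that matches the background scores zero.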
8
EnrichNet: Comparative analysis
Comparative analysis on benchmark microarray data:
• Compare EnrichNet against classical over-representation analysis using benchmark datasets from
the Broad Institute of MIT and Harvard (5 gene expression datasets and 2 reference databases)
Result: EnrichNet provides consistently higher agreement with the benchmark gene set rankings
9
EnrichNet: Results
Biological application on disease-related gene sets
EnrichNet is particularly suited to the following settings:
1) The target gene/protein set of interest has no associated high-throughput experimental data
Examples:
• Mutated genes in genetic diseases (OMIM, COSMIC, CGC)
• Gene sets obtained from the literature (risk factors, animal model genes)
2) Target and reference set share few members but are densely connected in the network
Examples:
• Often occurs for differentially expressed genes (DEGs) in complex phenotypes (examples for Parkinson's disease on the next slides)
• Often occurs when integrating results from different studies or omics (e.g. comparing transcriptomics and proteomics data)
10
DEGs for Parkinson's disease (PD) vs. KEGG PD pathway
Sub-network figure, legend: DEGs in PD vs. control samples; KEGG Parkinson's disease pathway; overlap
Annotations:
• OPA1 mediates mitochondrial fusion
• NR4A2 mutations have been associated with familial PD
11
DEGs for PD vs. exocytosis regulation pathway
Sub-network figure, legend: DEGs in PD vs. control samples; regulation of exocytosis process (Gene Ontology); overlap
12
Summary
• EnrichNet provides a new means to score and interpret gene/protein set
associations by exploiting functional information captured in the graph structure
of molecular networks
• New functional associations are identified and sub-network visualizations
enable a biological interpretation on the level of single molecular interactions
13
Availability
Software, tutorials and examples freely available at:
www.enrichnet.org
14
References
1. E. Glaab, A. Baudot, N. Krasnogor, R. Schneider, A. Valencia. EnrichNet: network-based gene set enrichment analysis,
Bioinformatics, 28(18):i451-i457, 2012
2. E. Glaab, R. Schneider, PathVar: analysis of gene and protein expression variance in cellular pathways using microarray
data, Bioinformatics, 28(3):446-447, 2012
3. E. Glaab, J. Bacardit, J. M. Garibaldi, N. Krasnogor, Using rule-based machine learning for candidate disease gene
prioritization and sample classification of cancer gene expression data, PLoS ONE, 7(7):e39932, 2012
4. E. Glaab, A. Baudot, N. Krasnogor, A. Valencia. TopoGSA: network topological gene set analysis,
Bioinformatics, 26(9):1271-1272, 2010
5. E. Glaab, A. Baudot, N. Krasnogor, A. Valencia. Extending pathways and processes using molecular interaction networks
to analyse cancer genome data, BMC Bioinformatics, 11(1):597, 2010
6. H. O. Habashy, D. G. Powe, E. Glaab, N. Krasnogor, J. M. Garibaldi, E. A. Rakha, G. Ball, A. R. Green, C. Caldas, I. O.
Ellis, RERG (Ras-related and oestrogen-regulated growth-inhibitor) expression in breast cancer: A marker of ER-positive
luminal-like subtype, Breast Cancer Research and Treatment, 128(2):315-326, 2011
7. E. Glaab, J. M. Garibaldi, N. Krasnogor. ArrayMining: a modular web-application for microarray analysis combining
ensemble and consensus methods with cross-study normalization, BMC Bioinformatics, 10:358, 2009
8. E. Glaab, J. M. Garibaldi, N. Krasnogor. Learning pathway-based decision rules to classify microarray cancer samples,
German Conference on Bioinformatics 2010, Lecture Notes in Informatics (LNI), 173:123-134, 2010
9. E. Glaab, J. M. Garibaldi, N. Krasnogor. VRMLGen: An R-package for 3D Data Visualization on the Web, Journal of
Statistical Software, 36(8):1-18, 2010

 

EnrichNet: Graph-based statistic and web-application for gene/protein set enrichment analysis

  • 1. Speaker: Enrico Glaab, Luxembourg Centre for Systems Biomedicine EnrichNet: network-based gene set enrichment analysis Authors: Enrico Glaab, Anaïs Baudot, Natalio Krasnogor, Reinhard Schneider, Alfonso Valencia
  • 2. 1 Motivation Problem: How to identify and score functional associations between a gene/protein set of interest (target set) and a collection of known, annotated gene/protein sets (reference sets) representing cellular pathways, processes or complexes? [Figure: an experimentally derived gene/protein set (target set) compared against functional annotation/pathway databases (reference sets)]
  • 3. 2 Previous approaches Previous gene/protein set enrichment analysis techniques fall into three types (see Huang et al., Nucleic Acids Res, 2009): Over-representation analysis (ORA): generally applicable, but scores are often not discriminative and rankings are difficult to interpret biologically; Gene Set Enrichment Analysis (GSEA): requires quantitative measurements and does not take the molecular network neighbourhood into account; Integrative and modular enrichment analysis (MEA): mostly uses clustering of annotations or data from ontology graphs rather than molecular networks. GOAL: Maximally exploit functional information from a molecular interaction network for association scoring and visualization
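To make the limitation of ORA concrete, here is a minimal sketch of a classical over-representation test, assuming a one-sided hypergeometric test on the overlap between target and reference set. The function name `ora_p_value`, the gene identifiers and the universe size are hypothetical and chosen only for illustration; this is not code from EnrichNet.

```python
from math import comb


def ora_p_value(target, reference, universe_size):
    """One-sided hypergeometric p-value P(overlap >= observed).

    Assumption: both sets are drawn from the same universe of
    `universe_size` genes; only the size of the overlap matters.
    """
    target, reference = set(target), set(reference)
    overlap = len(target & reference)
    n_target, n_ref = len(target), len(reference)
    total = comb(universe_size, n_target)
    upper = min(n_target, n_ref)
    hits = sum(
        comb(n_ref, i) * comb(universe_size - n_ref, n_target - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


# Hypothetical toy data: a universe of 20 genes named g1..g20.
target = {"g1", "g2", "g3", "g4"}
ref_a = {"g5", "g6", "g7", "g8", "g9"}        # no shared members
ref_b = {"g10", "g11", "g12", "g13", "g14"}   # no shared members
ref_c = {"g1", "g2", "g15"}                   # two shared members

p_a = ora_p_value(target, ref_a, 20)
p_b = ora_p_value(target, ref_b, 20)
p_c = ora_p_value(target, ref_c, 20)
```

In this toy case `ref_a` and `ref_b` receive the same p-value because neither shares a member with the target set, regardless of how close their genes might be to the target genes in an interaction network. That is the kind of non-discriminative score the slide refers to.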
  • 4. 3 EnrichNet: Design principles (1) Network association measure for mapped datasets: account for distances in a molecular network and for the multiplicity and density of interactions between the datasets of interest (use random walk distances instead of shortest-path distances). Example sub-networks (legend: reference node, target set node, other nodes): Case 1: dense interconnections; Case 2: sparse interconnections
  • 5. 4 EnrichNet: Design principles (2) Handling of overlapping nodes and long-distance outliers: overlapping nodes and node pairs with small distances are expected to be over-represented in functionally associated datasets, so assign higher weight to short-distance node pairs; to account for outlier nodes, assign lower weight to long-distance node pairs. Example sub-network (legend: pathway node, target set node, other nodes): overlap (high weight), outliers (low weight)
  • 6. 5 EnrichNet: Procedure Input: 10 or more human gene or protein identifiers of interest (= target set); selection of a reference database (gene sets from GO, KEGG, BioCarta, Reactome, WikiPathways, PID, etc.). Processing (details on next slides): target and reference datasets are mapped onto a human genome-scale molecular network (default: STRING confidence-weighted PPI network, optional: user-defined network); a random walk with restart (RWR) algorithm is applied to compute node-specific association scores between the mapped target set and the reference sets; scores are integrated for each reference set and compared against a background model. Output: ranking table of reference pathways with association scores (optional: 60 tissue-specific scores); for each reference dataset, an interactive sub-network visualization of its association with the target set
  • 7. 6 EnrichNet: Random walk with restart (RWR) RWR relevance scoring (Tong et al., 2006): simulate random walks via iterative matrix multiplications: $p^{t+1} = (1-r)\,A\,p^{t} + r\,p^{0}$, where $A$ is the network adjacency matrix, $r$ is the restart probability (here: $r = 0.9$) and $p_i^{t}$ is the probability that the walker is at node $i$ at time $t$. Result: a vector of node relevance scores for each reference pathway (converted to distance scores and compared against a background model, see next slide). Example network (legend): target set, target/pathway overlap, pathway 1, pathway 2
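The RWR iteration can be sketched in a few lines of plain Python. This is an illustrative sketch, not the EnrichNet implementation: it assumes that the adjacency matrix is column-normalized before iterating, that the restart vector puts equal probability on every target set node, and that iteration stops once the L1 change falls below a tolerance. The function names and the four-node toy network are hypothetical.

```python
def column_normalize(adj):
    """Scale each column to sum to 1 (assumed normalization)."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(adj, seeds, r=0.9, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * A * p + r * p0 until convergence.

    adj   : square adjacency matrix (list of lists, edge weights >= 0)
    seeds : indices of the target set nodes (restart nodes)
    r     : restart probability (0.9 on the slide)
    """
    n = len(adj)
    A = column_normalize(adj)
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)  # assumed: uniform restart over seeds
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - r) * sum(A[i][j] * p[j] for j in range(n)) + r * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy network: a path 0 - 1 - 2 - 3, with node 0 as target set.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path_graph, seeds=[0])
```

On this toy path the scores decrease with distance from the seed node and sum to 1, so they can be read as a probability distribution over nodes. With weighted edges (for example confidence scores), the same code applies because only the column normalization depends on the weights.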
  • 8. 7 EnrichNet: Background model Pathway-based background model: gene/protein sets for the background model should have connectivity properties similar to those of the pathway-representing reference nodes (not the case for random matched-size node sets); therefore, use the score distribution across the entire reference database as background. Apply the Xd-distance (Olmea et al., 1999) to compare foreground against background distances (n = number of equally spaced distance bins, default: n = 10; tissue-specific scores: pre-filter nodes by tissue label), with distance-dependent weighting to account for long-distance and high-degree outliers
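The Xd formula itself is not reproduced in this transcript, so the following is only a sketch under an explicit assumption: that the score has the form Xd = Σ_i (P_ic − P_ia) / (d_i · n), where P_ic is the fraction of foreground distances in bin i, P_ia the fraction of background distances in bin i, d_i the upper limit of bin i and n the number of equally spaced bins. The function names and the toy distances are hypothetical.

```python
def bin_fractions(distances, n_bins, max_distance):
    """Fraction of distances in each of n_bins equally spaced bins."""
    width = max_distance / n_bins
    counts = [0] * n_bins
    for d in distances:
        # Clamp so that d == max_distance falls into the last bin.
        idx = min(int(d / width), n_bins - 1)
        counts[idx] += 1
    return [c / len(distances) for c in counts]


def xd_score(foreground, background, n_bins=10, max_distance=1.0):
    """Assumed Xd form: sum over bins of (P_ic - P_ia) / (d_i * n).

    Dividing by the bin's upper limit d_i gives short-distance bins a
    higher weight and long-distance bins a lower weight.
    """
    width = max_distance / n_bins
    fg = bin_fractions(foreground, n_bins, max_distance)
    bg = bin_fractions(background, n_bins, max_distance)
    return sum(
        (fg[i] - bg[i]) / ((i + 1) * width * n_bins)
        for i in range(n_bins)
    )


# Hypothetical distances in [0, 1]; the background covers every bin once.
background = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
near = [0.05, 0.05, 0.15]   # foreground concentrated at short distances
far = [0.85, 0.95, 0.95]    # foreground concentrated at long distances

xd_near = xd_score(near, background)
xd_far = xd_score(far, background)
xd_same = xd_score(background, background)
```

Under this assumed form, a foreground enriched in short distances scores above zero, one enriched in long distances scores below zero, and a foreground identical to the background scores exactly zero.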
  • 9. 8 EnrichNet: Comparative analysis Comparative analysis on benchmark microarray data: EnrichNet is compared against classical over-representation analysis using benchmark datasets from the Broad Institute of MIT and Harvard (5 gene expression datasets and 2 reference databases). Result: EnrichNet provides consistently higher agreement with the benchmark gene set rankings
  • 10. 9 EnrichNet: Results Biological application on disease-related gene sets. EnrichNet is suited in particular for the following settings: 1) The target gene/protein set of interest has no associated high-throughput experimental data. Examples: mutated genes in genetic diseases (OMIM, COSMIC, CGC); gene sets obtained from the literature (risk factors, animal model genes). 2) Target and reference set share few members but are densely connected in the network. Examples: occurs often for differentially expressed genes (DEGs) in complex phenotypes (examples for Parkinson's disease on next slides); occurs often when integrating results from different studies or omics (e.g. comparing transcriptomics and proteomics data)
  • 11. 10 DEGs for Parkinson's disease (PD) vs. KEGG PD pathway Legend: DEGs in PD vs. control samples; KEGG Parkinson's disease pathway; overlap. Annotations: OPA1 mediates mitochondrial fusion; NR4A2 mutations have been associated with familial PD
  • 12. 11 DEGs for PD vs. exocytosis regulation pathway Legend: DEGs in PD vs. control samples; regulation of exocytosis process (Gene Ontology); overlap
  • 13. 12 Summary • EnrichNet provides a new means to score and interpret gene/protein set associations by exploiting functional information captured in the graph structure of molecular networks • New functional associations are identified and sub-network visualizations enable a biological interpretation on the level of single molecular interactions
  • 14. 13 Availability Software, tutorials and examples freely available at: www.enrichnet.org We acknowledge support by:
  • 15. 14 References
    1. E. Glaab, A. Baudot, N. Krasnogor, R. Schneider, A. Valencia. EnrichNet: network-based gene set enrichment analysis, Bioinformatics, 28(18):i451-i457, 2012
    2. E. Glaab, R. Schneider. PathVar: analysis of gene and protein expression variance in cellular pathways using microarray data, Bioinformatics, 28(3):446-447, 2012
    3. E. Glaab, J. Bacardit, J. M. Garibaldi, N. Krasnogor. Using rule-based machine learning for candidate disease gene prioritization and sample classification of cancer gene expression data, PLoS ONE, 7(7):e39932, 2012
    4. E. Glaab, A. Baudot, N. Krasnogor, A. Valencia. TopoGSA: network topological gene set analysis, Bioinformatics, 26(9):1271-1272, 2010
    5. E. Glaab, A. Baudot, N. Krasnogor, A. Valencia. Extending pathways and processes using molecular interaction networks to analyse cancer genome data, BMC Bioinformatics, 11(1):597, 2010
    6. H. O. Habashy, D. G. Powe, E. Glaab, N. Krasnogor, J. M. Garibaldi, E. A. Rakha, G. Ball, A. R. Green, C. Caldas, I. O. Ellis. RERG (Ras-related and oestrogen-regulated growth-inhibitor) expression in breast cancer: A marker of ER-positive luminal-like subtype, Breast Cancer Research and Treatment, 128(2):315-326, 2011
    7. E. Glaab, J. M. Garibaldi, N. Krasnogor. ArrayMining: a modular web-application for microarray analysis combining ensemble and consensus methods with cross-study normalization, BMC Bioinformatics, 10:358, 2009
    8. E. Glaab, J. M. Garibaldi, N. Krasnogor. Learning pathway-based decision rules to classify microarray cancer samples, German Conference on Bioinformatics 2010, Lecture Notes in Informatics (LNI), 173, 123-134
    9. E. Glaab, J. M. Garibaldi, N. Krasnogor. VRMLGen: An R-package for 3D Data Visualization on the Web, Journal of Statistical Software, 36(8):1-18, 2010