Go pathway-interaction-integration

Chris Mungall

1. Integration of GO, Pathway data and Interaction data
   Chris Mungall, Peter D’Eustachio

2. The GO was originally intended to integrate databases. How are we doing?
   "Interoperability of genomic databases is limited by this lack of progress, and it is this major obstacle that the Gene Ontology (GO) Consortium was formed to address."
   (Gene Ontology: tool for the unification of biology. Nat Genet 2000)
   Diagram labels: SGD, FB, GOA

3. The GO was originally intended to integrate databases. How are we doing?
   Not as well as we could!
   Diagram labels: GO; SGD, FB, GOA; Pathway Commons, IMEX, Reactome, Cyc, …; BioGRID, IntAct, …

4. Integration enhances analyses and reduces workload
   - Division of labor
     - leave specialized curation to specialized systems biology databases
     - but data needs to be re-combined to prevent siloing
     - GO is an invaluable single-stop shop for term enrichment etc.
   - Can we quantify how integrating with systems biology databases helps users?
     - Yes! We can do the experiment: GO term enrichment analysis on all MolSigDB
       - with Reactome annotations (also include Reactome inputs/outputs, not currently in GOA)
       - without Reactome annotations
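
The with/without comparison on this slide can be illustrated with a small sketch. It assumes a one-sided hypergeometric test as the enrichment statistic (the slides do not say which test was used), and the gene names, background size, and function names below are hypothetical toy values, not data from the talk.

```python
from math import comb


def hypergeom_pvalue(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) when `sample` genes are drawn from `population`,
    of which `annotated` carry the term (one-sided hypergeometric test)."""
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, sample)


def merge_annotations(*sources: dict) -> dict:
    """Union several term -> gene-set mappings (e.g. GOA plus pathway-derived)."""
    merged: dict = {}
    for source in sources:
        for term, genes in source.items():
            merged.setdefault(term, set()).update(genes)
    return merged


def enrichment(gene_set: set, annotations: dict, background: set) -> dict:
    """Return term -> p-value for every term hit by at least one gene in gene_set."""
    query = set(gene_set) & background
    results = {}
    for term, genes in annotations.items():
        genes = genes & background
        hits = len(query & genes)
        if hits:
            results[term] = hypergeom_pvalue(len(background), len(genes), len(query), hits)
    return results


# Hypothetical toy data: 20 background genes, one term, two annotation sources.
background = {f"g{i}" for i in range(20)}
goa_only = {"GO:0043184": {"g0", "g1"}}
from_pathways = {"GO:0043184": {"g2", "g3"}}   # assumed pathway-derived annotations
query = {"g0", "g1", "g2", "g3", "g10"}

p_without = enrichment(query, goa_only, background)["GO:0043184"]
p_with = enrichment(query, merge_annotations(goa_only, from_pathways), background)["GO:0043184"]
print(p_without, p_with)
```

In this toy case the merged annotation set yields a smaller p-value because more of the query genes are recognized as carrying the term; real results depend on the gene sets and on multiple-testing correction, which the sketch omits.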

5. Integration enhances analyses
   - GOA+R: many p-values significantly improved
   - Recapitulated biologically valid results that would have been suppressed had only a single resource been used
   - Examples: genes down-regulated in Alzheimer's

6. How are we currently integrating systems biology datasets?
   - Interaction data
     - currently IntAct, soon IMEX
     - "protein binding" and "self-protein binding" only (+with)
   - Pathway data
     - currently Reactome only
     - loses much of what is in Reactome, e.g. inputs and outputs
     - manually curated GO<->Reactome links
       - incomplete
       - not always to the most specific term
       - labor-intensive
       - become stale over time
     - other pathway databases?
   - This can be improved!

7. Automating integration using cross-product definitions: pathway databases
   [Term]
   id: GO:0015871
   name: choline transport
   intersection_of: GO:0006810 ! transport
   intersection_of: results_in_transport_of CHEBI:15354 ! choline
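
A minimal sketch of how a cross-product definition like the one on this slide could drive automatic mapping. It assumes each pathway event has already been reduced to a (genus, relation, filler) triple; the `CrossProduct` type, `parse_stanza`, `classify`, and the event dictionary layout are hypothetical names for illustration and are not the integration tool described in the talk. Matching is by exact equality only, whereas a real reasoner would also use the is_a hierarchy.

```python
from typing import NamedTuple, Optional


class CrossProduct(NamedTuple):
    term_id: str
    name: str
    genus: Optional[str]      # e.g. GO:0006810 (transport)
    relation: Optional[str]   # e.g. results_in_transport_of
    filler: Optional[str]     # e.g. CHEBI:15354 (choline)


def parse_stanza(stanza: str) -> CrossProduct:
    """Parse one OBO [Term] stanza with a genus and a single differentia."""
    fields = {}
    genus = relation = filler = None
    for raw in stanza.strip().splitlines():
        line = raw.split("!")[0].strip()          # drop trailing "! comment"
        if not line or line == "[Term]":
            continue
        tag, _, value = line.partition(":")
        value = value.strip()
        if tag == "intersection_of":
            parts = value.split()
            if len(parts) == 1:                   # genus line: just a class ID
                genus = parts[0]
            else:                                 # differentia: relation + class ID
                relation, filler = parts[0], parts[1]
        else:
            fields[tag] = value
    return CrossProduct(fields["id"], fields["name"], genus, relation, filler)


def classify(event: dict, definitions: list) -> Optional[str]:
    """Return the ID of the first term whose definition matches the event exactly."""
    for d in definitions:
        if (event["genus"], event["relation"], event["filler"]) == (d.genus, d.relation, d.filler):
            return d.term_id
    return None


CHOLINE_STANZA = """
[Term]
id: GO:0015871
name: choline transport
intersection_of: GO:0006810 ! transport
intersection_of: results_in_transport_of CHEBI:15354 ! choline
"""

definition = parse_stanza(CHOLINE_STANZA)
# Hypothetical normalized pathway event: a transport reaction moving choline.
event = {"genus": "GO:0006810", "relation": "results_in_transport_of", "filler": "CHEBI:15354"}
print(classify(event, [definition]))
```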

8. Automating integration using cross-products: pathway databases
   We can also automatically map:
   - catalysis terms [165*]
   - transport [373]
   - binding [133]
   - phosphorylation and other modifications
   - metabolism [278]
   - signaling
   - …
   All this relies on different cross-product files.
   Any pathway database that exports BioPAX-OWL can be used, e.g. humancyc, mousecyc, pathwaycommons, …
   * Numbers for Reactome-human

9. Automating integration using cross-products: interaction databases
   Diagram: FIGF binds VEGFR (edge labels: binds, has_function, is_a)
   [Term]
   id: GO:0043184
   name: vascular endothelial growth factor receptor 2 binding
   intersection_of: GO:0005488 ! binding
   intersection_of: results_in_binding_of PRO:000002112 ! VEGFR 2
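
The inference pictured on this slide (FIGF binds VEGFR 2, therefore FIGF has the function "VEGFR 2 binding") can be sketched as follows. The function name and the three input structures are assumptions made for illustration; in particular, the mapping from an interactor to its PRO class is taken as given here, and the sketch does not walk the is_a hierarchy.

```python
def infer_binding_annotations(interactions, protein_class, binding_terms):
    """Infer (protein, GO term) pairs from binary interactions.

    interactions:  iterable of (protein_a, protein_b) pairs
    protein_class: dict mapping a protein to its PRO class ID
    binding_terms: dict mapping a PRO class ID to the GO binding term whose
                   cross-product definition names that class
    """
    inferred = set()
    for a, b in interactions:
        # Binding is symmetric, so consider both directions.
        for subject, partner in ((a, b), (b, a)):
            go_term = binding_terms.get(protein_class.get(partner))
            if go_term is not None:
                inferred.add((subject, go_term))
    return inferred


# Identifiers from the slide; the dictionary layout itself is hypothetical.
interactions = [("FIGF", "VEGFR2")]
protein_class = {"VEGFR2": "PRO:000002112"}
binding_terms = {"PRO:000002112": "GO:0043184"}   # VEGFR 2 binding

print(infer_binding_annotations(interactions, protein_class, binding_terms))
```

Only the partner side needs a cross-product definition: FIGF has no entry in `protein_class`, so no annotation is inferred for VEGFR 2 in this example.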

10. Automated Integration: Results
   - Reactome
     - evaluation in progress
     - many manually assigned equivalencies recapitulated
     - inferred equivalencies differed in some cases
       - sometimes better than manually assigned
       - sometimes required info not in the BioPAX export
       - ongoing discussions
   - BioGRID
     - not evaluated (all trivial)
     - inferred annotations improve some enrichment results
     - e.g. Brentani angiogenesis gene sets: increased enrichment for VEGFR binding
     - obvious, but useful as proof of concept

11. Conclusions and future work
   - We can be more efficient:
     - coordinate with systems bio databases to divide labor
     - prevent siloing through semi-automated integration
     - GO acts as a high-level ‘window’ on systems biology databases
   - Still to be done:
     - make the integration tool production-ready
     - reconcile existing misalignments, particularly signaling, which is highly inconsistent between GO and Reactome
     - explore open questions, e.g. auto-generate terms?
     - finish cross-products; they are vital, particularly PRO and CHEBI

Editor's Notes

The Gene Ontology was created in response to the need for interoperability among genomic databases in the wake of the sequencing of the first metazoan genomes. In the paper "Gene Ontology: tool for the unification of biology", published nearly ten years ago, Ashburner et al. state: "Progress in the way that biologists describe and conceptualize the shared biological elements has not kept pace with sequencing . . . Interoperability of genomic databases is limited by this lack of progress, and it is this major obstacle that the Gene Ontology (GO) Consortium was formed to address" [25]. The GO has since become the de facto terminological standard for functional annotation, and its success is evident in the popularity of GO-based class enrichment analyses. However, the intervening ten years have witnessed an explosion of interest in systems biology, with a concomitant increase in the number of databases providing information on interactions and pathways, including Reactome, Nature Signaling, PANTHER [26], BIND, BioGRID and HumanCyc (the EcoCyc metabolic pathway database preceded GO [27]). These databases each have their own data models and schemas, creating an interoperability problem. This has partly been mitigated by the adoption of BioPAX as a standard exchange format, which allows the aggregation of multiple pathway databases in single "one-stop shopping" warehouses, such as the Pathway Knowledge Base [28], Pathway Commons, and WikiPathways. However, the data is still only partially integrated, and a researcher who wishes to obtain a comprehensive view of a pathway must still examine multiple records, in addition to GO annotations.
loss in pathway db transition
their data models capture more.
also via col16
PRO does not yet have Ras etc
