Increased Expressivity of Gene Ontology Annotations
Biocuration 2013
Huntley RP, Harris MA, Alam-Faruque Y, Carbon SJ, Dietze H, Dimmer E, Foulger R, Hill DP, Khodiyar V, Lock A, Lomax J, Lovering RC, Mungall CJ, Mutowo-Muellenet P, Sawford T, Van Auken K, Wood V
The Gene Ontology
      • A vocabulary of 37,500* distinct, connected
        descriptions that can be applied to gene
        products
      • That’s a lot…
              – How big is the space of possible descriptions?

*April 2013
Current descriptions miss details
• Author:
   – LMTK1 (Aatk) can negatively control axonal outgrowth in
     cortical neurons by regulating Rab11A activity in a Cdk5-
     dependent manner
          – http://www.ncbi.nlm.nih.gov/pubmed/22573681
• GO:
   – Aatk: GO:0030517 negative regulation of axon extension

• GO terms will always be a subset of the total set of possible
  descriptions
   – We shouldn’t attempt to make a term for everything
• Example: a term from ICD-10, a hierarchical medical billing code system used to ‘annotate’ patient records
• T63 Toxic effect of contact with venomous
  animals and plants
  – T63.611 Toxic effect of contact with Portugese
    Man-o-war, accidental (unintentional)
  – T63.612 Toxic effect of contact with Portugese
    Man-o-war, intentional self-harm
  – T63.613 Toxic effect of contact with Portugese
    Man-o-war, assault
     • T63.613A Toxic effect of contact with Portugese Man-
       o-war, assault, initial encounter
     • T63.613D Toxic effect of contact with Portugese Man-
       o-war, assault, subsequent encounter
     • T63.613S Toxic effect of contact with Portugese Man-
       o-war, assault, sequela
Post-composition
    • Curators need to be able to compose their
      complex descriptions from simpler
      descriptions (terms) at the time of annotation

    •  GO annotation extensions
             • Introduced with Gene Association Format (GAF) v2
                 – Also supported in GPAD
             • Has underlying OWL description-logic model


http://www.geneontology.org/GO.format.gaf-2_0.shtml
“Classic” annotation model
    • Gene Association Format (GAF) v1
        – Simple pairwise model
        – Each gene product is associated with an (ordered) set
          of descriptions
             • Where each description == a GO term




http://www.geneontology.org/GO.format.gaf-1_0.shtml
GO annotation extensions
    • Gene Association Format (GAF) v1
        – Simple pairwise model
        – Each gene product is associated with an (ordered) set of
          descriptions
             • Where each description == a GO term
    • Gene Association Format (GAF) v2 (and GPAD)
        – Each gene product is (still) associated with an (ordered) set of
          descriptions
        – Each description is a GO term plus zero or more relationships
          to other entities
             • Entities from GO, other ontologies, databases
             • Description is an OWL anonymous class expression (aka description)
http://www.geneontology.org/GO.format.gaf-2_0.shtml
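A minimal sketch of reading the annotation extension field of a GAF 2.0 row, assuming the usual conventions for that field: comma-separated `relation(target)` units apply together, and a pipe separates alternative sets. The function name `parse_extension` and the regular expression are illustrative, not part of any GO library.

```python
import re

# One relation(target) unit, e.g. "has_input(SPAC1783.07c)".
_UNIT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\(([^()]+)\)\s*$")


def parse_extension(field):
    """Parse an annotation-extension field (hypothetical helper).

    Returns a list of alternatives (split on '|'); each alternative is a
    list of (relation, target) pairs that apply together (split on ',').
    """
    if not field.strip():
        return []
    alternatives = []
    for group in field.split("|"):
        pairs = []
        for unit in group.split(","):
            match = _UNIT.match(unit)
            if match is None:
                raise ValueError(f"malformed extension unit: {unit!r}")
            pairs.append((match.group(1), match.group(2).strip()))
        alternatives.append(pairs)
    return alternatives


# Extension taken from the sty1 example in this deck.
parsed = parse_extension("happens_during(GO:0034599),has_input(SPAC1783.07c)")
print(parsed)
```

A classic GAF v1 style annotation simply has an empty field here, which the sketch returns as an empty list.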
“Classic” GO annotations are unconnected

[Figure: sty1 is annotated separately to protein localization to nucleus [GO:0034504] and to cellular response to oxidative stress [GO:0034599]; pap1 is annotated to positive regulation of transcription from pol II promoter in response to oxidative stress [GO:0036091].]

DB        Object                 Term         Ev    Ref            ..
PomBase   sty1 (SPAC24B11.06c)   GO:0034504   IMP   PMID:9585505   ..
PomBase   sty1 (SPAC24B11.06c)   GO:0034599   IMP   PMID:9585505   ..
PomBase   pap1 (SPAC1783.07c)    GO:0036091   IMP   PMID:9585505   ..
Now with annotation extensions

[Figure: sty1 is annotated to an <anonymous description>: protein localization to nucleus [GO:0034504] that happens during cellular response to oxidative stress [GO:0034599] and has input pap1. pap1 is annotated to an <anonymous description>: positive regulation of transcription from pol II promoter in response to oxidative stress [GO:0036091] with a "has regulation target" relationship.]

DB        Object                 Term                                         Ev    Ref            Extension
PomBase   sty1 (SPAC24B11.06c)   GO:0034504 protein localization to nucleus   IMP   PMID:9585505   happens_during(GO:0034599), has_input(SPAC1783.07c)
PomBase   pap1 (SPAC1783.07c)    GO:0036091                                   IMP   PMID:9585505   has_regulation_target(…)
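The sty1 annotation can be read as a single anonymous class expression. The sketch below renders it in a Manchester-like notation. Reading each extension as an existential ("some") restriction is an assumption made here for illustration; it is not spelled out on the slide. The helper name `to_class_expression` is hypothetical.

```python
def to_class_expression(go_term, extensions):
    """Render a GO term plus (relation, target) pairs as one expression.

    Assumption: each extension is read as "relation some target".
    """
    parts = [go_term]
    parts += [f"({relation} some {target})" for relation, target in extensions]
    return " and ".join(parts)


# The sty1 annotation from this deck.
expr = to_class_expression(
    "GO:0034504",
    [("happens_during", "GO:0034599"), ("has_input", "SPAC1783.07c")],
)
print(expr)
# GO:0034504 and (happens_during some GO:0034599) and (has_input some SPAC1783.07c)
```

With no extensions the expression is just the GO term, which is the classic pairwise model.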
PomBase web interface – sty1




http://www.pombase.org/spombe/result/SPAC24B11.06c
pap1




http://www.pombase.org/spombe/result/SPAC1783.07c
Where do I get them?
• Download
  – http://geneontology.org/GO.downloads.annotations.shtml
      • MGI (22,000)
      • GOA Human (4,200)
      • PomBase (1,588)
• Search and Browsing
  – Cross-species
      • AmiGO 2 – http://amigo2.berkeleybop.org - poster#57
      • QuickGO (later this year) - http://www.ebi.ac.uk/QuickGO/
  – MOD interfaces
      • PomBase – http://www.pombase.org
Query tool support: AmiGO 2
• Annotation extensions make use of other ontologies
  – CHEBI
  – CL – cell types
  – Uberon – metazoan anatomy
  – MA – mouse anatomy
  – EMAP – mouse anatomy
  – ….
• [Screenshots: AmiGO 2 queries using CL and Uberon terms]
  – http://amigo2.berkeleybop.org
Curation tool support
• Supported in
  – Protein2GO (GOA, WormBase) [poster#97]
  – CANTO (PomBase) [poster#110]
  – MGI curation tool
Analysis tool support
• Currently: Enrichment tools do not yet support
  annotation extensions
  – Annotation extensions can be folded into an
    analysis ontology - http://galaxy.berkeleybop.org
• Future: Analysis tools can use extended
  annotations to their benefit
  – E.g. account for other modes of regulation in their
    model
  – Tool developers: contact us!
Challenge: pre vs post composition
    • Curator question: do I…
         – Request a pre-composed term via TermGenie[*]?
         – Post-compose using annotation extensions?
    • From a computational perspective:
         – It doesn’t matter, we’re using OWL
         – 40% of GO terms have OWL equivalence axioms
         – Example: protein localization to nucleus [GO:0034504] ≡ protein localization [GO:0008104] ⊓ (end_location nucleus [GO:0005634])

[*] See Heiko’s TermGenie talk tomorrow & poster #33
http://code.google.com/p/owltools/wiki/AnnotationExtensionFolding
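A toy sketch of why the choice does not matter computationally: given an equivalence axiom, a post-composed annotation can be folded into the matching pre-composed term, and the term can be unfolded again. This is an exact-match lookup over a hand-written table, not OWL reasoning and not how owltools works internally. The table and the function names `fold` and `unfold` are hypothetical; the one axiom is the example from this deck.

```python
# Toy axiom table: (genus, set of (relation, filler)) -> named GO term.
EQUIVALENCE_AXIOMS = {
    ("GO:0008104", frozenset({("end_location", "GO:0005634")})): "GO:0034504",
}


def fold(term, extensions):
    """Return the pre-composed term equivalent to term + extensions, if any."""
    return EQUIVALENCE_AXIOMS.get((term, frozenset(extensions)))


def unfold(term):
    """Return (genus, sorted extensions) for a pre-composed term, if known."""
    for (genus, differentia), named in EQUIVALENCE_AXIOMS.items():
        if named == term:
            return genus, sorted(differentia)
    return None


# protein localization + end_location nucleus -> protein localization to nucleus
print(fold("GO:0008104", [("end_location", "GO:0005634")]))
print(unfold("GO:0034504"))
```

A real implementation would hand the class expressions to an OWL reasoner so that subclass relationships, not only exact matches, are taken into account.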
Curation Challenges
• Manual Curation
  – Fewer terms, but more degrees of freedom
  – Curator consistency
     • OWL constraints can help
• Automated annotation
  – Phylogenetic propagation
  – Text processing and NLP
Similar approaches and future directions
• Post-composition has been used extensively
  for phenotype annotation
  – ZFIN [poster#95]
  – Phenoscape [next talk]
• Future:
  – A more expressive model that bridges GO with
    pathway representations
Conclusions
• Description space is huge
  – Context is important
  – Not appropriate to make a term for everything
  – OWL allows us to mix and match pre and post
    composition
• Number of extension annotations is growing
• Annotation extensions represent untapped
  opportunity for tool developers
Acknowledgments
• GO Consortium, model organism and UniProtKB curators
• GO Directors
• PomBase developers:
   – Mark McDowell, Kim Rutherford

• Funding
   –   GO Consortium NIH 5P41HG002273-09
   –   UniProtKB GOA NHGRI U41HG006104-03
   –   British Heart Foundation grant SP/07/007/23671
   –   Kidney Research UK RP26/2008
   –   PomBase - Wellcome Trust WT090548MA
   –   MGD NHGRI HG000330

 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 

Increased Expressivity of Gene Ontology Annotations - Biocuration 2013

  • 1. Increased Expressivity of Gene Ontology Annotations Huntley RP, Harris MA, Alam-Faruque Y, Carbon SJ, Dietze H, Dimmer E, Foulger R, Hill DP, Khodiyar V, Lock A, Lomax J, Lovering RC, Mungall CJ, Mutowo-Muellenet P, Sawford T, Van Auken K, Wood V
  • 2. The Gene Ontology • A vocabulary of 37,500* distinct, connected descriptions that can be applied to gene products [figure: gene 1, gene 2] • That’s a lot… – How big is the space of possible descriptions? (*April 2013)
  • 3.
  • 4. Current descriptions miss details • Author: – LMTK1 (Aatk) can negatively control axonal outgrowth in cortical neurons by regulating Rab11A activity in a Cdk5-dependent manner – http://www.ncbi.nlm.nih.gov/pubmed/22573681 • GO: – Aatk: GO:0030517 negative regulation of axon extension • GO terms will always be a subset of the total set of possible descriptions – We shouldn’t attempt to make a term for everything
  • 5. • T63 Toxic effect of contact with venomous animals and plants (Term from ICD-10, a hierarchical medical billing code system used to ‘annotate’ patient records)
  • 6. • T63 Toxic effect of contact with venomous animals and plants – T63.611 Toxic effect of contact with Portugese Man-o-war, accidental (unintentional)
  • 7. • T63 Toxic effect of contact with venomous animals and plants – T63.611 Toxic effect of contact with Portugese Man-o-war, accidental (unintentional) – T63.612 Toxic effect of contact with Portugese Man-o-war, intentional self-harm
  • 8. • T63 Toxic effect of contact with venomous animals and plants – T63.611 Toxic effect of contact with Portugese Man-o-war, accidental (unintentional) – T63.612 Toxic effect of contact with Portugese Man-o-war, intentional self-harm – T63.613 Toxic effect of contact with Portugese Man-o-war, assault
  • 9. • T63 Toxic effect of contact with venomous animals and plants – T63.611 Toxic effect of contact with Portugese Man-o-war, accidental (unintentional) – T63.612 Toxic effect of contact with Portugese Man-o-war, intentional self-harm – T63.613 Toxic effect of contact with Portugese Man-o-war, assault • T63.613A Toxic effect of contact with Portugese Man-o-war, assault, initial encounter • T63.613D Toxic effect of contact with Portugese Man-o-war, assault, subsequent encounter • T63.613S Toxic effect of contact with Portugese Man-o-war, assault, sequela
  • 10. Post-composition • Curators need to be able to compose their complex descriptions from simpler descriptions (terms) at the time of annotation •  GO annotation extensions • Introduced with Gene Association Format (GAF) v2 – Also supported in GPAD • Has underlying OWL description-logic model http://www.geneontology.org/GO.format.gaf-2_0.shtml
  • 11. “Classic” annotation model • Gene Association Format (GAF) v1 – Simple pairwise model – Each gene product is associated with an (ordered) set of descriptions • Where each description == a GO term http://www.geneontology.org/GO.format.gaf-1_0.shtml
  • 12. GO annotation extensions • Gene Association Format (GAF) v1 – Simple pairwise model – Each gene product is associated with an (ordered) set of descriptions • Where each description == a GO term • Gene Association Format (GAF) v2 (and GPAD) – Each gene product is (still) associated with an (ordered) set of descriptions – Each description is a GO term plus zero or more relationships to other entities • Entities from GO, other ontologies, databases • Description is an OWL anonymous class expression (aka description) http://www.geneontology.org/GO.format.gaf-2_0.shtml
  • 13. “Classic” GO annotations are unconnected
      – sty1: protein localization to nucleus [GO:0034504]
      – sty1: cellular response to oxidative stress [GO:0034599]
      – pap1: positive regulation of transcription from pol II promoter in response to oxidative stress [GO:0036091]
      DB       Object                 Term        Ev   Ref            ..
      PomBase  sty1 (SPAC24B11.06c)   GO:0034504  IMP  PMID:9585505   ..
      PomBase  sty1 (SPAC24B11.06c)   GO:0034599  IMP  PMID:9585505   ..
      PomBase  pap1 (SPAC1783.07c)    GO:0036091  IMP  PMID:9585505   ..
  • 14. Now with annotation extensions
      – sty1 <anonymous description>: protein localization to nucleus [GO:0034504], happens during cellular response to oxidative stress [GO:0034599], has input pap1
      – pap1 <anonymous description>: positive regulation of transcription from pol II promoter in response to oxidative stress [GO:0036091], has regulation target …
      DB       Object                 Term                                           Ev   Ref            Extension
      PomBase  sty1 (SPAC24B11.06c)   GO:0034504 (protein localization to nucleus)   IMP  PMID:9585505   happens_during(GO:0034599), has_input(SPAC1783.07c)
      PomBase  pap1 (SPAC1783.07c)    GO:0036091                                     IMP  PMID:9585505   has_regulation_target(…)
  • 15. PomBase web interface – sty1 http://www.pombase.org/spombe/result/SPAC24B11.06c
  • 17. Where do I get them? • Download – http://geneontology.org/GO.downloads.annotations.shtml • MGI (22,000) • GOA Human (4,200) • PomBase (1,588) • Search and Browsing – Cross-species • AmiGO 2 – http://amigo2.berkeleybop.org - poster#57 • QuickGO (later this year) - http://www.ebi.ac.uk/QuickGO/ – MOD interfaces • PomBase – http://pombase.org
  • 18. Query tool support: AmiGO 2 Annotation extensions make use of other ontologies • CHEBI • CL – cell types • Uberon – metazoan anatomy • MA – mouse anatomy • EMAP – mouse anatomy • …. CL – http://amigo2.berkeleybop.org
  • 21. Curation tool support • Supported in – Protein2GO (GOA, WormBase) [poster#97] – CANTO (PomBase) [poster#110] – MGI curation tool
  • 22. Analysis tool support • Currently: Enrichment tools do not yet support annotation extensions – Annotation extensions can be folded into an analysis ontology - http://galaxy.berkeleybop.org • Future: Analysis tools can use extended annotations to their benefit – E.g. account for other modes of regulation in their model – Tool developers: contact us!
  • 23. Challenge: pre vs post composition • Curator question: do I… – Request a pre-composed term via TermGenie[*]? – Post-compose using annotation extensions? See Heiko’s TermGenie talk tomorrow & poster #33
  • 24. Challenge: pre vs post composition • Curator question: do I… – Request a pre-composed term via TermGenie? – Post-compose using annotation extensions? • From a computational perspective: – It doesn’t matter, we’re using OWL – 40% of GO terms have OWL equivalence axioms • protein localization to nucleus [GO:0034504] ≡ protein localization [GO:0008104] ⊓ end_location Nucleus [GO:0005634] http://code.google.com/p/owltools/wiki/AnnotationExtensionFolding
  • 25. Curation Challenges • Manual Curation – Fewer terms, but more degrees of freedom – Curator consistency • OWL constraints can help • Automated annotation – Phylogenetic propagation – Text processing and NLP
  • 26. Similar approaches and future directions • Post-composition has been used extensively for phenotype annotation – ZFIN [poster#95] – Phenoscape [next talk] • Future: – A more expressive model that bridges GO with pathway representations
  • 27. Conclusions • Description space is huge – Context is important – Not appropriate to make a term for everything – OWL allows us to mix and match pre and post composition • Number of extension annotations is growing • Annotation extensions represent untapped opportunity for tool developers
  • 28. Acknowledgments • GO Consortium, model organism and UniProtKB curators • GO Directors • PomBase developers: – Mark McDowell, Kim Rutherford • Funding – GO Consortium NIH 5P41HG002273-09 – UniProtKB GOA NHGRI U41HG006104-03 – British Heart Foundation grant SP/07/007/23671 – Kidney Research UK RP26/2008 – PomBase - Wellcome Trust WT090548MA – MGD NHGRI HG000330
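
As a rough illustration of the extension column shown on slides 12 and 14, the sketch below parses an extension string such as happens_during(GO:0034599), has_input(SPAC1783.07c) into relation/target pairs. It assumes that a comma joins relations belonging to one description and that a pipe separates alternative descriptions; the names Relation, parse_extension and describe are hypothetical and not part of any GO tool. The rendered string is only an informal, Manchester-like reading of the anonymous class expression.

```python
import re
from typing import List, NamedTuple


class Relation(NamedTuple):
    """One relation(target) pair from an annotation extension (hypothetical type)."""
    name: str
    target: str


# relation_name(TARGET), with optional surrounding whitespace
_REL = re.compile(r"^\s*([A-Za-z_]+)\((.+)\)\s*$")


def parse_extension(field: str) -> List[List[Relation]]:
    """Split an extension string into alternative groups of relations.

    Assumption: '|' separates alternatives, ',' joins relations in one group.
    """
    groups: List[List[Relation]] = []
    for alternative in field.split("|"):
        if not alternative.strip():
            continue  # empty column or trailing separator
        relations = []
        for part in alternative.split(","):
            match = _REL.match(part)
            if match is None:
                raise ValueError(f"unrecognised extension: {part!r}")
            relations.append(Relation(match.group(1), match.group(2).strip()))
        groups.append(relations)
    return groups


def describe(term: str, relations: List[Relation]) -> str:
    """Render a term plus its relations as an informal class expression."""
    parts = [term] + [f"({r.name} some {r.target})" for r in relations]
    return " and ".join(parts)


if __name__ == "__main__":
    ext = "happens_during(GO:0034599), has_input(SPAC1783.07c)"
    for group in parse_extension(ext):
        print(describe("GO:0034504", group))
```

Applied to the sty1 row on slide 14, this yields one group with two relations, which matches the single anonymous description drawn on the slide.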

Editor's notes

  1. 10 mins. GAF2.0
  2. 1
  3. Sweet spot in a large galaxy
  4. Not ad-hoc – OWL description
  5. Key point: logically equivalent to an annotation to a term in the <anon desc> box, with the same links out.
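
The equivalence noted here and on slide 24 can be sketched as a lookup: if a term plus some of its extensions matches the right-hand side of an equivalence axiom, the annotation can be "folded" onto the pre-composed term. The code below is a toy under stated assumptions. It does exact matching against a hand-written table containing only the slide 24 example, whereas the folding referenced on the slide relies on OWL reasoning; EQUIVALENCES and fold are hypothetical names.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Rel = Tuple[str, str]              # (relation name, target identifier)
Key = Tuple[str, FrozenSet[Rel]]   # (genus term, differentiating relations)

# Hand-written stand-in for OWL equivalence axioms; only the slide 24 example:
# GO:0034504 == GO:0008104 and end_location GO:0005634
EQUIVALENCES: Dict[Key, str] = {
    ("GO:0008104", frozenset({("end_location", "GO:0005634")})): "GO:0034504",
}


def fold(term: str, extensions: Iterable[Rel]) -> Tuple[str, FrozenSet[Rel]]:
    """Replace term + matching extensions with a pre-composed term, if any.

    Returns the (possibly more specific) term and the extensions left over.
    """
    remaining = frozenset(extensions)
    for (genus, differentia), named in EQUIVALENCES.items():
        # The axiom applies when the genus matches and every
        # differentiating relation is present on the annotation.
        if genus == term and differentia <= remaining:
            return named, remaining - differentia
    return term, remaining


if __name__ == "__main__":
    annotation = [("end_location", "GO:0005634"), ("has_input", "SPAC1783.07c")]
    print(fold("GO:0008104", annotation))
```

Under this reading, a post-composed annotation and an annotation to the pre-composed term carry the same information, so an enrichment tool that only understands plain terms could still consume the folded result.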