SlideShare une entreprise Scribd logo
1  sur  40
Introduction to the Gene Ontology
and GO annotation resources
Rachael Huntley
UniProt-GOA
EBI
PAGXX
January 2012
2
Why do we need GO?
Reasons for the Gene Ontology
www.geneontology.org
• Inconsistency in English language
Inconsistency in English languauge
• Same name for different concepts
or
??
Cell
 Comparison is difficult – in particular across
species or across databases
Just one reason why the Gene Ontology (GO) is
is needed…
• Different names for the same concept
Eggplant
Aubergine
Brinjal
Melongene
Same for biological concepts
Reasons for the Gene Ontology
www.geneontology.org
• Inconsistency in English language
• Increasing amounts of biological data available
• Increasing amounts of biological data to come
Search on ‘DNA repair’...
get almost 65,000 results
Increasing amounts of biological data available
Expansion of sequence
information
Reasons for the Gene Ontology
www.geneontology.org
• Inconsistency in English language
• Large datasets need to be interpreted quickly
• Increasing amounts of biological data available
• Increasing amounts of biological data to come
• A way to capture
biological knowledge for
individual gene products
in a written and
computable form
The Gene Ontology
• A set of concepts
and their relationships
to each other arranged
as a hierarchy www.ebi.ac.uk/QuickGO
Less specific concepts
More specific concepts
The Concepts in GO
1. Molecular Function
2. Biological Process
3. Cellular Component
An elemental activity or task or job
• protein kinase activity
• insulin receptor activity
A commonly recognised series of events
• cell division
Where a gene product is located
• mitochondrion
• mitochondrial matrix
• mitochondrial inner membrane
Anatomy of a GO term
Unique identifierUnique identifier
Term nameTerm name
DefinitionDefinitionSynonymsSynonyms
Cross-referencesCross-references
Ontology structure
• Directed acyclic graph
Terms can have more than one parent
• Terms are linked by
relationships
is_a
part_of
regulates (and +/- regulates)
www.ebi.ac.uk/QuickGOoccurs_in
has_part
These relationships allow for complex analysis of large datasets
Reactome
http://www.geneontology.org
• Compile the ontologies
- currently over 35,000 terms
- constantly increasing and improving
• Annotate gene products using ontology terms
- around 30 groups provide annotations
• Provide a public resource of data and tools
- regular releases of annotations
- tools for browsing/querying annotations and editing the ontology
Aims of the GO project
GO Annotation
UniProt-Gene Ontology Annotation
(UniProt-GOA) database at the EBI
• Largest open-source contributor of annotations to GO
• Provides annotation for more than 397,000 species
• Our priority is to annotate the human proteome
A GO annotation is …
…a statement that a gene product;
1. has a particular molecular function
or is involved in a particular biological process
or is located within a certain cellular component
2. as determined by a particular method
3. as described in a particular reference
P00505
Accession Name GO ID GO term name Reference Evidence code
IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2
Electronic Annotation
Manual Annotation
UniProt-GOA incorporates
annotations made using two methods
• Quick way of producing large numbers of annotations
• Annotations use less-specific GO terms
• Time-consuming process producing lower numbers
of annotations
• Annotations tend to use very specific GO terms
• Only source of annotation for many non-model organism species
1. Mapping of external concepts to GO terms
e.g. InterPro2GO, UniProt Keyword2GO, Enzyme Commission2GO
Electronic annotation methods
GO:0004715 ; non-membrane spanning protein tyrosine kinase activity
Annotations are high-quality and have an explanation of the method (GO_REF)
Macaque
Mouse DogCow
Guinea PigChimpanzee Rat
Chicken
Ensembl compara
2. Automatic transfer of manual annotations to orthologs
...and more
e.g. Human
Arabidopsis
Rice
Brachypodium
Maize
Poplar
Grape
…and moreEnsembl compara
Electronic annotation methods
Manual annotation by GOA
High–quality, specific annotations made using:
• Full text peer-reviewed papers
• A range of evidence codes to categorise
the types of evidence found in a paper
e.g. IDA, IMP, IPI
http://www.ebi.ac.uk/GOA
* Includes manual annotations integrated from external model organism
and specialist groups
920,557Manual annotations*
110,247,289Electronic annotations
Dec 2011 Statistics
Number of annotations in UniProt-GOA
database
How to access and use
GO annotation data
Where can you find annotations?
UniProtKB
Ensembl
Entrez gene
GO Consortium website
Gene Association Files
17 column files containing all information for each annotation
http://www.ebi.ac.uk/GOA/downloads.html
UniProt-GOA website
Numerous species-specific files
GO browsers
http://www.ebi.ac.uk/QuickGO
The EBI's QuickGO browser
Search GO terms
or proteins
Find sets of
GO annotations
• Access gene product functional information
• Analyse high-throughput genomic or proteomic datasets
• Validation of experimental techniques
• Get a broad overview of a proteome
• Obtain functional information for novel gene products
How scientists use the GO
Some examples…
Term enrichment
• Most popular type of GO analysis
• Determines which GO terms are more often associated
with a specified list of genes/proteins compared with a
control list or rest of genome
• Many tools available to do this analysis
• User must decide which is best for their analysis
Numerous Third Party Tools
www.geneontology.org/GO.tools.shtml
pears on lw n3d ...
: Set_LW_n3d_5p_...
Colored by: Copy of Copy of C5_RMA (Defa...
Gene List: all genes (14010)
attacked
time
control
Puparial adhesion
Molting cycle
Hemocyanin
Defense response
Immune response
Response to stimulus
Toll regulated genes
JAK-STAT regulated genes
Immune response
Toll regulated genes
Amino acid catabolism
Lipid metobolism
Peptidase activity
Protein catabolism
Immune response
cted Gene Tree: pearson lw n3d ...
ch color classification: Set_LW_n3d_5p_...
Colored by: Copy of Copy of C5_RMA (Defa...
Gene List: all genes (14010)
Bregje Wertheim at the Centre for Evolutionary Genomics,
Department of Biology, UCL and Eugene Schuster Group, EBI.
MicroArray data analysis
Analysis of high-throughput genomic datasets
Validation of experimental techniques
(Cao et al., Journal of Proteome Research 2006)
Rat liver plasma membrane isolation
Annotating novel sequences
• Can use BLAST queries to find similar sequences with
GO annotation which can be transferred to the new sequence
• Two tools currently available;
AmiGO BLAST – searches the GO Consortium database
BLAST2GO – searches the NCBI database
Using the GO to provide a functional
overview for a large dataset
• Many GO analysis tools use GO slims to give a broad
overview of the dataset
• GO slims are cut-down versions of the GO and
contain a subset of the terms in the whole GO
• GO slims usually contain less-specialised GO terms
Slimming the GO using the ‘true path rule’
Many gene products are associated with a
large number of descriptive, leaf GO nodes:
Slimming the GO using the ‘true path rule’
…however annotations can be mapped up
to a smaller set of parent GO terms:
GO slims
or you can make your own using;
Custom slims are available for download;
http://www.geneontology.org/GO.slims.shtml
• AmiGO's GO slimmer
• QuickGO
http://www.ebi.ac.uk/QuickGO
http://amigo.geneontology.org/cgi-bin/amigo/slimmer
www.ebi.ac.uk/QuickGO
Map-up annotations
with GO slims
The EBI's QuickGO browser
Search GO terms
or proteins
Find sets of
GO annotations
Questions on how to use QuickGO?
Contact goa@ebi.ac.uk
The UniProt-GOA group
Curators:
Software developer:
Team leaders:
Emily Dimmer
Rachael Huntley
Rolf Apweiler
Email: goa@ebi.ac.uk
http://www.ebi.ac.uk/GOA
Claire O’Donovan
Tony Sawford
Yasmin Alam-Faruque
Prudence Mutowo
Acknowledgements
GO Consortium
Members of;
UniProtKB
InterPro
IntAct
HAMAP
Ensembl
Ensembl Genomes
Funding
National Human Genome Research Institute (NHGRI)
Kidney Research UK
British Heart Foundation
EMBL

Contenu connexe

Tendances

Tendances (20)

Gene Expression Omnibus (GEO)
Gene Expression Omnibus (GEO)Gene Expression Omnibus (GEO)
Gene Expression Omnibus (GEO)
 
Protein database
Protein databaseProtein database
Protein database
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
ENTREZ.ppt
ENTREZ.pptENTREZ.ppt
ENTREZ.ppt
 
Protein database
Protein databaseProtein database
Protein database
 
The European Nucleotide Archive
The European Nucleotide ArchiveThe European Nucleotide Archive
The European Nucleotide Archive
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Phylogenetic Tree, types and Applicantion
Phylogenetic Tree, types and Applicantion Phylogenetic Tree, types and Applicantion
Phylogenetic Tree, types and Applicantion
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
Swiss pdb viewer
Swiss pdb viewerSwiss pdb viewer
Swiss pdb viewer
 
Presentation on Biological database By Elufer Akram @ University Of Science ...
Presentation on Biological database  By Elufer Akram @ University Of Science ...Presentation on Biological database  By Elufer Akram @ University Of Science ...
Presentation on Biological database By Elufer Akram @ University Of Science ...
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
Protein databases
Protein databasesProtein databases
Protein databases
 
An introduction to RNA-seq data analysis
An introduction to RNA-seq data analysisAn introduction to RNA-seq data analysis
An introduction to RNA-seq data analysis
 
SWISS-PROT
SWISS-PROTSWISS-PROT
SWISS-PROT
 
Multiple alignment
Multiple alignmentMultiple alignment
Multiple alignment
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
 
Kegg databse
Kegg databseKegg databse
Kegg databse
 

En vedette

UniProt & Ontologies
UniProt & OntologiesUniProt & Ontologies
UniProt & OntologiesEric Jain
 
Gene Ontology Project
Gene Ontology ProjectGene Ontology Project
Gene Ontology Projectvaibhavdeoda
 
La muerte y la tortura no es arte ni cultura
La muerte y la tortura no es arte ni culturaLa muerte y la tortura no es arte ni cultura
La muerte y la tortura no es arte ni culturaAchaku
 
UniProt and the Semantic Web
UniProt and the Semantic WebUniProt and the Semantic Web
UniProt and the Semantic WebChimezie Ogbuji
 
Gerontology presentation
Gerontology presentationGerontology presentation
Gerontology presentationStaton Noel
 
Protectora d'animals_Xènia, Malina i Gemma
Protectora d'animals_Xènia, Malina i GemmaProtectora d'animals_Xènia, Malina i Gemma
Protectora d'animals_Xènia, Malina i Gemmamgonellgomez
 
EMC Symmetrix Data at Rest Encryption - Detailed Review
EMC Symmetrix Data at Rest Encryption - Detailed Review EMC Symmetrix Data at Rest Encryption - Detailed Review
EMC Symmetrix Data at Rest Encryption - Detailed Review EMC
 
Day 7 reconstuction
Day 7 reconstuctionDay 7 reconstuction
Day 7 reconstuctionTravis Klein
 
Germansk mytologi og_verdensanskuelse_nor
Germansk mytologi og_verdensanskuelse_norGermansk mytologi og_verdensanskuelse_nor
Germansk mytologi og_verdensanskuelse_norSebastian Hübner
 
BPC: Do you have the right design?
BPC: Do you have the right design?BPC: Do you have the right design?
BPC: Do you have the right design?Brian Tyson
 

En vedette (20)

UniProt & Ontologies
UniProt & OntologiesUniProt & Ontologies
UniProt & Ontologies
 
Gene Ontology Project
Gene Ontology ProjectGene Ontology Project
Gene Ontology Project
 
La muerte y la tortura no es arte ni cultura
La muerte y la tortura no es arte ni culturaLa muerte y la tortura no es arte ni cultura
La muerte y la tortura no es arte ni cultura
 
Gerontology
GerontologyGerontology
Gerontology
 
UniProt and the Semantic Web
UniProt and the Semantic WebUniProt and the Semantic Web
UniProt and the Semantic Web
 
Gerontology presentation
Gerontology presentationGerontology presentation
Gerontology presentation
 
Gerontology
GerontologyGerontology
Gerontology
 
Ontology
OntologyOntology
Ontology
 
Old age home
Old age homeOld age home
Old age home
 
Protectora d'animals_Xènia, Malina i Gemma
Protectora d'animals_Xènia, Malina i GemmaProtectora d'animals_Xènia, Malina i Gemma
Protectora d'animals_Xènia, Malina i Gemma
 
Valentine & Kebartas
Valentine & KebartasValentine & Kebartas
Valentine & Kebartas
 
EMC Symmetrix Data at Rest Encryption - Detailed Review
EMC Symmetrix Data at Rest Encryption - Detailed Review EMC Symmetrix Data at Rest Encryption - Detailed Review
EMC Symmetrix Data at Rest Encryption - Detailed Review
 
Informe general
Informe generalInforme general
Informe general
 
Day 7 reconstuction
Day 7 reconstuctionDay 7 reconstuction
Day 7 reconstuction
 
Private cloud day session 5 a solution for private cloud security
Private cloud day session 5 a solution for private cloud securityPrivate cloud day session 5 a solution for private cloud security
Private cloud day session 5 a solution for private cloud security
 
Germansk mytologi og_verdensanskuelse_nor
Germansk mytologi og_verdensanskuelse_norGermansk mytologi og_verdensanskuelse_nor
Germansk mytologi og_verdensanskuelse_nor
 
BPC: Do you have the right design?
BPC: Do you have the right design?BPC: Do you have the right design?
BPC: Do you have the right design?
 
2015 day 4
2015 day 42015 day 4
2015 day 4
 
Monopoly types
Monopoly typesMonopoly types
Monopoly types
 
Formulario clientes
Formulario clientesFormulario clientes
Formulario clientes
 

Similaire à UniProt-GOA

Computing on the shoulders of giants
Computing on the shoulders of giantsComputing on the shoulders of giants
Computing on the shoulders of giantsBenjamin Good
 
The Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resourcesThe Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resourcesMelanie Courtot
 
TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...Phoenix Bioinformatics
 
Lock - PomBase community curation
Lock - PomBase community curationLock - PomBase community curation
Lock - PomBase community curationPascale Gaudet
 
Designing a community resource - Sandra Orchard
Designing a community resource - Sandra OrchardDesigning a community resource - Sandra Orchard
Designing a community resource - Sandra OrchardEMBL-ABR
 
ICAR2016 TAIR talk
ICAR2016 TAIR talkICAR2016 TAIR talk
ICAR2016 TAIR talkDonghui Li
 
Ontology-based Tools to Enhance the Curation Workflow
Ontology-based Tools to Enhance the Curation WorkflowOntology-based Tools to Enhance the Curation Workflow
Ontology-based Tools to Enhance the Curation WorkflowTrish Whetzel
 
NCBO Tools and Web services
NCBO Tools and Web servicesNCBO Tools and Web services
NCBO Tools and Web servicesTrish Whetzel
 
Building and Using Ontologies to do biology
Building and Using Ontologies to do biologyBuilding and Using Ontologies to do biology
Building and Using Ontologies to do biologyrobertstevens65
 
Gene Ontology WormBase Workshop International Worm Meeting 2015
Gene Ontology WormBase Workshop International Worm Meeting 2015Gene Ontology WormBase Workshop International Worm Meeting 2015
Gene Ontology WormBase Workshop International Worm Meeting 2015raymond91105
 
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Apollo and i5K: Collaborative Curation and Interactive Analysis of GenomesApollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Apollo and i5K: Collaborative Curation and Interactive Analysis of GenomesMonica Munoz-Torres
 
Collaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeCollaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeChris Mungall
 
Enabling Semantically Aware Software Applications
Enabling Semantically Aware Software Applications Enabling Semantically Aware Software Applications
Enabling Semantically Aware Software Applications Trish Whetzel
 
Advanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven ResearchAdvanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven ResearchEuropean Bioinformatics Institute
 
Ontology Services for the Biomedical Sciences
Ontology Services for the Biomedical SciencesOntology Services for the Biomedical Sciences
Ontology Services for the Biomedical SciencesConnected Data World
 
Ontology Web Services for Semantic Applications
Ontology Web Services for Semantic Applications Ontology Web Services for Semantic Applications
Ontology Web Services for Semantic Applications Trish Whetzel
 
The Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in BiologyThe Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in Biologyrobertstevens65
 

Similaire à UniProt-GOA (20)

Computing on the shoulders of giants
Computing on the shoulders of giantsComputing on the shoulders of giants
Computing on the shoulders of giants
 
The Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resourcesThe Gene Ontology & Gene Ontology Annotation resources
The Gene Ontology & Gene Ontology Annotation resources
 
TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...TAIR -Using biological ontologies to accelerate progress in plant biology res...
TAIR -Using biological ontologies to accelerate progress in plant biology res...
 
Lock - PomBase community curation
Lock - PomBase community curationLock - PomBase community curation
Lock - PomBase community curation
 
Designing a community resource - Sandra Orchard
Designing a community resource - Sandra OrchardDesigning a community resource - Sandra Orchard
Designing a community resource - Sandra Orchard
 
ICAR2016 TAIR talk
ICAR2016 TAIR talkICAR2016 TAIR talk
ICAR2016 TAIR talk
 
Ontology-based Tools to Enhance the Curation Workflow
Ontology-based Tools to Enhance the Curation WorkflowOntology-based Tools to Enhance the Curation Workflow
Ontology-based Tools to Enhance the Curation Workflow
 
NCBO Tools and Web services
NCBO Tools and Web servicesNCBO Tools and Web services
NCBO Tools and Web services
 
Building and Using Ontologies to do biology
Building and Using Ontologies to do biologyBuilding and Using Ontologies to do biology
Building and Using Ontologies to do biology
 
Gene Ontology WormBase Workshop International Worm Meeting 2015
Gene Ontology WormBase Workshop International Worm Meeting 2015Gene Ontology WormBase Workshop International Worm Meeting 2015
Gene Ontology WormBase Workshop International Worm Meeting 2015
 
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Apollo and i5K: Collaborative Curation and Interactive Analysis of GenomesApollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
Apollo and i5K: Collaborative Curation and Interactive Analysis of Genomes
 
bioinformatics enabling knowledge generation from agricultural omics data
bioinformatics enabling knowledge generation from agricultural omics databioinformatics enabling knowledge generation from agricultural omics data
bioinformatics enabling knowledge generation from agricultural omics data
 
Collaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of LifeCollaboratively Creating the Knowledge Graph of Life
Collaboratively Creating the Knowledge Graph of Life
 
Enabling Semantically Aware Software Applications
Enabling Semantically Aware Software Applications Enabling Semantically Aware Software Applications
Enabling Semantically Aware Software Applications
 
Advanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven ResearchAdvanced Bioinformatics for Genomics and BioData Driven Research
Advanced Bioinformatics for Genomics and BioData Driven Research
 
Chibucos annot go_final
Chibucos annot go_finalChibucos annot go_final
Chibucos annot go_final
 
Ontology Services for the Biomedical Sciences
Ontology Services for the Biomedical SciencesOntology Services for the Biomedical Sciences
Ontology Services for the Biomedical Sciences
 
Ontology Web Services for Semantic Applications
Ontology Web Services for Semantic Applications Ontology Web Services for Semantic Applications
Ontology Web Services for Semantic Applications
 
Ibn Sina
Ibn SinaIbn Sina
Ibn Sina
 
The Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in BiologyThe Past, Present and Future of Knowledge in Biology
The Past, Present and Future of Knowledge in Biology
 

Plus de EBI

The annotation of plant proteins in UniProtKB
The annotation of plant proteins in UniProtKBThe annotation of plant proteins in UniProtKB
The annotation of plant proteins in UniProtKBEBI
 
InterPro and InterProScan 5.0
InterPro and InterProScan 5.0InterPro and InterProScan 5.0
InterPro and InterProScan 5.0EBI
 
Genome resources at EMBL-EBI: Ensembl and Ensembl Genomes
Genome resources at EMBL-EBI: Ensembl and Ensembl GenomesGenome resources at EMBL-EBI: Ensembl and Ensembl Genomes
Genome resources at EMBL-EBI: Ensembl and Ensembl GenomesEBI
 
Automatic Annotation in UniProtKB
Automatic Annotation in UniProtKBAutomatic Annotation in UniProtKB
Automatic Annotation in UniProtKBEBI
 
The Vertebrate Genome Annotation Database
The Vertebrate Genome Annotation DatabaseThe Vertebrate Genome Annotation Database
The Vertebrate Genome Annotation DatabaseEBI
 
Train online
Train onlineTrain online
Train onlineEBI
 

Plus de EBI (6)

The annotation of plant proteins in UniProtKB
The annotation of plant proteins in UniProtKBThe annotation of plant proteins in UniProtKB
The annotation of plant proteins in UniProtKB
 
InterPro and InterProScan 5.0
InterPro and InterProScan 5.0InterPro and InterProScan 5.0
InterPro and InterProScan 5.0
 
Genome resources at EMBL-EBI: Ensembl and Ensembl Genomes
Genome resources at EMBL-EBI: Ensembl and Ensembl GenomesGenome resources at EMBL-EBI: Ensembl and Ensembl Genomes
Genome resources at EMBL-EBI: Ensembl and Ensembl Genomes
 
Automatic Annotation in UniProtKB
Automatic Annotation in UniProtKBAutomatic Annotation in UniProtKB
Automatic Annotation in UniProtKB
 
The Vertebrate Genome Annotation Database
The Vertebrate Genome Annotation DatabaseThe Vertebrate Genome Annotation Database
The Vertebrate Genome Annotation Database
 
Train online
Train onlineTrain online
Train online
 

Dernier

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Dernier (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

UniProt-GOA

  • 1. Introduction to the Gene Ontology and GO annotation resources Rachael Huntley UniProt-GOA EBI PAGXX January 2012
  • 2. 2 Why do we need GO?
  • 3. Reasons for the Gene Ontology www.geneontology.org • Inconsistency in English language
  • 4. Inconsistency in English languauge • Same name for different concepts or ?? Cell
  • 5.  Comparison is difficult – in particular across species or across databases Just one reason why the Gene Ontology (GO) is is needed… • Different names for the same concept Eggplant Aubergine Brinjal Melongene Same for biological concepts
  • 6. Reasons for the Gene Ontology www.geneontology.org • Inconsistency in English language • Increasing amounts of biological data available • Increasing amounts of biological data to come
  • 7. Search on ‘DNA repair’... get almost 65,000 results Increasing amounts of biological data available Expansion of sequence information
  • 8. Reasons for the Gene Ontology www.geneontology.org • Inconsistency in English language • Large datasets need to be interpreted quickly • Increasing amounts of biological data available • Increasing amounts of biological data to come
  • 9. • A way to capture biological knowledge for individual gene products in a written and computable form The Gene Ontology • A set of concepts and their relationships to each other arranged as a hierarchy www.ebi.ac.uk/QuickGO Less specific concepts More specific concepts
  • 10. The Concepts in GO 1. Molecular Function 2. Biological Process 3. Cellular Component An elemental activity or task or job • protein kinase activity • insulin receptor activity A commonly recognised series of events • cell division Where a gene product is located • mitochondrion • mitochondrial matrix • mitochondrial inner membrane
  • 11. Anatomy of a GO term Unique identifierUnique identifier Term nameTerm name DefinitionDefinitionSynonymsSynonyms Cross-referencesCross-references
  • 12. Ontology structure • Directed acyclic graph Terms can have more than one parent • Terms are linked by relationships is_a part_of regulates (and +/- regulates) www.ebi.ac.uk/QuickGOoccurs_in has_part These relationships allow for complex analysis of large datasets
  • 14. • Compile the ontologies - currently over 35,000 terms - constantly increasing and improving • Annotate gene products using ontology terms - around 30 groups provide annotations • Provide a public resource of data and tools - regular releases of annotations - tools for browsing/querying annotations and editing the ontology Aims of the GO project
  • 16. UniProt-Gene Ontology Annotation (UniProt-GOA) database at the EBI • Largest open-source contributor of annotations to GO • Provides annotation for more than 397,000 species • Our priority is to annotate the human proteome
  • 17. A GO annotation is … …a statement that a gene product; 1. has a particular molecular function or is involved in a particular biological process or is located within a certain cellular component 2. as determined by a particular method 3. as described in a particular reference P00505 Accession Name GO ID GO term name Reference Evidence code IDAPMID:2731362aspartate transaminase activityGO:0004069GOT2
  • 18. Electronic Annotation Manual Annotation UniProt-GOA incorporates annotations made using two methods • Quick way of producing large numbers of annotations • Annotations use less-specific GO terms • Time-consuming process producing lower numbers of annotations • Annotations tend to use very specific GO terms • Only source of annotation for many non-model organism species
  • 19. 1. Mapping of external concepts to GO terms e.g. InterPro2GO, UniProt Keyword2GO, Enzyme Commission2GO Electronic annotation methods GO:0004715 ; non-membrane spanning protein tyrosine kinase activity
  • 20. Annotations are high-quality and have an explanation of the method (GO_REF) Macaque Mouse DogCow Guinea PigChimpanzee Rat Chicken Ensembl compara 2. Automatic transfer of manual annotations to orthologs ...and more e.g. Human Arabidopsis Rice Brachypodium Maize Poplar Grape …and moreEnsembl compara Electronic annotation methods
  • 21. Manual annotation by GOA High–quality, specific annotations made using: • Full text peer-reviewed papers • A range of evidence codes to categorise the types of evidence found in a paper e.g. IDA, IMP, IPI http://www.ebi.ac.uk/GOA
  • 22. * Includes manual annotations integrated from external model organism and specialist groups 920,557Manual annotations* 110,247,289Electronic annotations Dec 2011 Statistics Number of annotations in UniProt-GOA database
  • 23. How to access and use GO annotation data
  • 24. Where can you find annotations? UniProtKB Ensembl Entrez gene
  • 25. GO Consortium website Gene Association Files 17 column files containing all information for each annotation http://www.ebi.ac.uk/GOA/downloads.html UniProt-GOA website Numerous species-specific files
  • 27. http://www.ebi.ac.uk/QuickGO The EBI's QuickGO browser Search GO terms or proteins Find sets of GO annotations
  • 28. • Access gene product functional information • Analyse high-throughput genomic or proteomic datasets • Validation of experimental techniques • Get a broad overview of a proteome • Obtain functional information for novel gene products How scientists use the GO Some examples…
  • 29. Term enrichment • Most popular type of GO analysis • Determines which GO terms are more often associated with a specified list of genes/proteins compared with a control list or rest of genome • Many tools available to do this analysis • User must decide which is best for their analysis
  • 30. Numerous Third Party Tools www.geneontology.org/GO.tools.shtml
  • 31. pears on lw n3d ... : Set_LW_n3d_5p_... Colored by: Copy of Copy of C5_RMA (Defa... Gene List: all genes (14010) attacked time control Puparial adhesion Molting cycle Hemocyanin Defense response Immune response Response to stimulus Toll regulated genes JAK-STAT regulated genes Immune response Toll regulated genes Amino acid catabolism Lipid metobolism Peptidase activity Protein catabolism Immune response cted Gene Tree: pearson lw n3d ... ch color classification: Set_LW_n3d_5p_... Colored by: Copy of Copy of C5_RMA (Defa... Gene List: all genes (14010) Bregje Wertheim at the Centre for Evolutionary Genomics, Department of Biology, UCL and Eugene Schuster Group, EBI. MicroArray data analysis Analysis of high-throughput genomic datasets
  • 32. Validation of experimental techniques (Cao et al., Journal of Proteome Research 2006) Rat liver plasma membrane isolation
  • 33. Annotating novel sequences • Can use BLAST queries to find similar sequences with GO annotation which can be transferred to the new sequence • Two tools currently available; AmiGO BLAST – searches the GO Consortium database BLAST2GO – searches the NCBI database
  • 34. Using the GO to provide a functional overview for a large dataset • Many GO analysis tools use GO slims to give a broad overview of the dataset • GO slims are cut-down versions of the GO and contain a subset of the terms in the whole GO • GO slims usually contain less-specialised GO terms
  • 35. Slimming the GO using the ‘true path rule’ Many gene products are associated with a large number of descriptive, leaf GO nodes:
  • 36. Slimming the GO using the ‘true path rule’ …however annotations can be mapped up to a smaller set of parent GO terms:
  • 37. GO slims or you can make your own using; Custom slims are available for download; http://www.geneontology.org/GO.slims.shtml • AmiGO's GO slimmer • QuickGO http://www.ebi.ac.uk/QuickGO http://amigo.geneontology.org/cgi-bin/amigo/slimmer
  • 38. www.ebi.ac.uk/QuickGO Map-up annotations with GO slims The EBI's QuickGO browser Search GO terms or proteins Find sets of GO annotations Questions on how to use QuickGO? Contact goa@ebi.ac.uk
  • 39. The UniProt-GOA group Curators: Software developer: Team leaders: Emily Dimmer Rachael Huntley Rolf Apweiler Email: goa@ebi.ac.uk http://www.ebi.ac.uk/GOA Claire O’Donovan Tony Sawford Yasmin Alam-Faruque Prudence Mutowo
  • 40. Acknowledgements GO Consortium Members of; UniProtKB InterPro IntAct HAMAP Ensembl Ensembl Genomes Funding National Human Genome Research Institute (NHGRI) Kidney Research UK British Heart Foundation EMBL

Notes de l'éditeur

  1. The English language is not precise
  2. Electronic annotations are important for non-model species as these do not have a dedicated manual annotation effort
  3. Genes differentially expressed during parasitoid attack versus control