Critical Assessment of Function Annotations: Lessons Learned and the Road Ahead
Iddo Friedberg (1,*), Wyatt T Clark (2,3), Alexandra M Schnoes (4), Patricia C Babbitt (4), Sean D Mooney (5) and Predrag Radivojac
Introduction

To understand and improve our ability to annotate proteins
computationally, we are holding a series of multi-year
challenges for developers of function annotation programs.
The rationale is that challenging and assessing these programs
will lead to a better understanding of, and improvements in,
predictive ability. The first Critical Assessment of Function
Annotation (CAFA 1) was held over 2010-2011, involved 23
research groups, and assessed the performance of 54 algorithms.
CAFA 1 was structured as a time challenge: proteins with no
experimentally validated function annotation were presented
to the methods, which predicted their function. Over the
course of 10 months, some of these proteins gained
experimental validation, and those were used as the final
benchmark to assess program performance.
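
The time-challenge design can be illustrated with a minimal sketch. It assumes two hypothetical snapshots of experimental annotations, one taken when targets are released (`annotations_t0`) and one taken when the challenge closes (`annotations_t1`); the function name, protein identifiers, and term labels are invented for illustration and are not part of the CAFA tooling.

```python
def select_benchmark(targets, annotations_t0, annotations_t1):
    """Return the targets that had no experimental annotation at release
    time (t0) but gained at least one by the end of the challenge (t1)."""
    benchmark = {}
    for protein in targets:
        before = annotations_t0.get(protein, set())
        after = annotations_t1.get(protein, set())
        # Keep only proteins that were unannotated at t0 and annotated at t1.
        if not before and after:
            benchmark[protein] = set(after)
    return benchmark


# Hypothetical toy data: P1 gains an annotation, P2 stays unannotated,
# and P3 was already annotated at t0, so it is not a valid benchmark protein.
targets = ["P1", "P2", "P3"]
annotations_t0 = {"P3": {"TERM_X"}}
annotations_t1 = {"P1": {"TERM_A"}, "P3": {"TERM_X"}}

benchmark = select_benchmark(targets, annotations_t0, annotations_t1)
print(benchmark)
```

Only the proteins returned by this filter would be scored; predictions for targets that never gained experimental validation are left out of the benchmark.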

Participating Methods

Predictions on Human and Mouse

Understanding protein function is a key component to
understanding life at a molecular level. It is also important for
understanding and treating human disease, since many
conditions arise as a consequence of the loss or gain of protein
function.

2

[Figure panels: BPO, MFO]

Database Bias

Here we review CAFA 1, and introduce CAFA 2, which is taking
place 2013-2014.

There is extensive bias in the experimentally
validated annotations in UniProt-GOA.
The bias is introduced by high-throughput (HT)
experiments.

Many HT experimental annotations
create redundancies.

Case Study: hPNPase

The CAFA Experiment: Generating Targets

Each circle represents the sum total of articles annotating one organism. Each colored
arc is composed of all the proteins in a single article. A line is drawn between any
two points on the circle if the proteins they represent have 100% sequence identity:
a black line if they are annotated with different ontologies (for example, in one
article the protein is annotated with the MFO, and in another article with the BPO),
and a red line if they are annotated in the same ontology. Example: S. pombe is
described by two articles, one with few proteins (light arc on bottom) and one with
many (dark arc encompassing most of the circle). Many of the same proteins are
annotated by both articles.
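
The linking rule in this caption can be sketched in a few lines. The record layout `(article, protein, sequence, ontology)`, the function name, and the toy sequences are assumptions made for illustration. The sketch also assumes that links are drawn only between entries from different articles, which the caption does not state explicitly, and it treats 100% sequence identity as exact string equality.

```python
from collections import defaultdict
from itertools import combinations


def identical_sequence_links(records):
    """Link annotated proteins from different articles that share an
    identical sequence. Red = same ontology, black = different ontology."""
    by_sequence = defaultdict(list)
    for article, protein, sequence, ontology in records:
        by_sequence[sequence].append((article, protein, ontology))

    links = []
    for entries in by_sequence.values():
        for (a1, p1, o1), (a2, p2, o2) in combinations(entries, 2):
            if a1 == a2:
                continue  # assumption: only cross-article pairs are linked
            color = "red" if o1 == o2 else "black"
            links.append((p1, p2, color))
    return links


# Hypothetical toy records: (article, protein, sequence, ontology)
records = [
    ("art1", "P1", "MKV", "MFO"),
    ("art2", "P1", "MKV", "BPO"),
    ("art2", "P2", "MAA", "BPO"),
    ("art3", "P2b", "MAA", "BPO"),
]
links = identical_sequence_links(records)
print(links)
```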

New in CAFA 2

Assessing Method Performance

Engaging more communities
Human Phenotype Ontology

Precision: pr = TP/(TP+FP)
Recall: rc = TP/(TP+FN)
F1 = 2 × (pr × rc)/(pr + rc)
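
As a worked example of these measures, the sketch below computes precision, recall, and F1 (the harmonic mean of precision and recall) for one protein, treating predicted and experimentally validated terms as plain sets. The function name and term labels are hypothetical, and the sketch ignores ontology structure such as propagating terms to their ancestors.

```python
def precision_recall_f1(predicted, actual):
    """Compute precision, recall, and F1 for two sets of ontology terms."""
    tp = len(predicted & actual)   # terms predicted and validated
    fp = len(predicted - actual)   # terms predicted but not validated
    fn = len(actual - predicted)   # validated terms that were missed
    pr = tp / (tp + fp) if (tp + fp) else 0.0
    rc = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * pr * rc / (pr + rc) if (pr + rc) else 0.0
    return pr, rc, f1


# Hypothetical terms: 2 true positives, 2 false positives, 1 false negative.
predicted = {"A", "B", "C", "D"}
actual = {"A", "B", "E"}
pr, rc, f1 = precision_recall_f1(predicted, actual)
print(pr, rc, f1)
```

With these sets, pr = 2/4, rc = 2/3, and F1 = 4/7.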

Experimental
Biologists

Computer
Scientists

Cellular Component Ontology
(a) Domain architecture of the human PNPT1 gene according to the Pfam classification. For each domain, the
numbers of different leaf terms (associated with any protein in the Swiss-Prot database containing this domain)
are shown.
(b) Molecular Function terms (six of which are leaves) associated with the human PNPT1 gene in
Swiss-Prot as of December 2011. Colored circles represent the predicted terms for three representative
methods as well as two baseline methods. The prediction threshold for each method was selected to
correspond to the point in the precision-recall space that provides the maximum F-measure. J (blue),
Jones-UCL; O (magenta), Team Orengo; d (navy blue), dcGO; B (green), BLAST; N (brown), Naive.
Dashed lines indicate the presence of other terms between the source and destination nodes.
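
The threshold selection mentioned in (b) can be sketched as a scan over candidate thresholds that keeps the one giving the largest F-measure. This is a simplified, single-protein illustration: the scores, term labels, threshold grid, and function names are hypothetical, and an actual assessment would combine precision and recall over all benchmark proteins before taking the maximum.

```python
def f_measure(predicted, actual):
    """F-measure (harmonic mean of precision and recall) for two term sets."""
    tp = len(predicted & actual)
    if tp == 0:
        return 0.0
    pr = tp / len(predicted)
    rc = tp / len(actual)
    return 2 * pr * rc / (pr + rc)


def best_threshold(scores, actual, thresholds):
    """Return the (threshold, F-measure) pair with the largest F-measure.
    Ties are resolved in favor of the first threshold scanned."""
    best_t, best_f = None, -1.0
    for t in thresholds:
        # A term counts as predicted when its score reaches the threshold.
        predicted = {term for term, s in scores.items() if s >= t}
        f = f_measure(predicted, actual)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


# Hypothetical confidence scores for one method on one protein.
scores = {"A": 0.9, "B": 0.7, "C": 0.4, "D": 0.1}
actual = {"A", "B", "E"}
thresholds = [i / 10 for i in range(1, 10)]
t, f = best_threshold(scores, actual, thresholds)
print(t, f)
```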

Steering Committee

Organizing Committee

Data Wrangler

Iddo Friedberg
Michal Linial
Mark Wass
Sean D Mooney
Predrag Radivojac

Tal Ronen Oron

Algorithms,
Assessment methods

Computational
Biologists

CAFA 2 Assessor

Patricia Babbitt
Steven Brenner
Christine Orengo
Burkhard Rost

Reassessing CAFA 1 methods

Targets

CAFA

Targets &
Ontologies

Biocurators

Anna Tramontano

Author Affiliations
1. Miami University, Oxford, OH
2. Indiana University, Bloomington, IN
3. Yale University, New Haven, CT
4. University of California San Francisco, CA
5. Buck Institute for Research on Aging, CA
* i.friedberg@miamioh.edu

References and more information
CAFA: Radivojac et al. (2013) Nature Methods, doi:10.1038/nmeth.2340
http://BioFunctionPrediction.org
Database Bias: Schnoes et al. (2013) PLoS Computational Biology, doi:10.1371/journal.pcbi.1003063

CAFA poster presented at CSHL Genome Informatics 2013
