My ontology is better than yours! Building and evaluating ontologies for integrative research
1. Introduction Biomedical ontology Use case: pharmacogenomics Outlook
Robert Hoehndorf
Department of Genetics
University of Cambridge
Bio-Ontology SIG
Translational research
National Cancer Institute:
Translational research transforms scientific discoveries arising from
laboratory, clinical, or population studies into clinical applications
to reduce [disease] incidence, morbidity, and mortality.
Biomedical ontologies
Gruber (1993):
An ontology is the explicit specification of a conceptualization of a
domain.
controlled vocabularies
hierarchically organized
facilitate data integration
Biomedical ontologies
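Figure: biomedical ontologies arranged by kind of individual (physical object, quality, function, process) and by level of granularity (molecule, gene, transcript, organelle, cell, tissue, organ, body, population).
Ontologies shown: ChEBI Ontology, Sequence Ontology, GO-CC, Celltype Ontology, Gene Ontology, Phenotype Ontology, Anatomy Ontology.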
Biomedical ontologies
How can we find the “best” ontology?
How can we develop the “best” ontology?
Biomedical ontologies
Ontology evaluation
Biomedical ontologies
Evaluation criteria
ontology design principles rooted in
best practices
philosophy
logic
ontology engineering
linguistics
community agreement
community requests
peer review
Ontology
Ontology evaluation
definitions
singular nouns
common relations
single is-a hierarchy
orthogonality
realism
...
Biomedical ontologies
Most ontology evaluation criteria are intrinsic criteria and evaluate
what ontologies are.
How can we evaluate what ontologies do?
Biomedical ontologies
A functional perspective
Biomedical ontologies
Evaluation criteria
criteria from software engineering, etc.
user study
unit tests
complexity
...
Biomedical ontologies
Evaluation criteria
criteria from biology
experiments
statistics (p-values)
comparison to gold/silver standard
...
Research questions
drug discovery
drug repurposing
drug response
drug pathways
disease pathways
causal mutations
Traditional approaches to drug repurposing
drug target identification
models of drug binding
experiment design and execution (e.g., binding assays)
analysis and interpretation of experiment results
Integrative approaches to drug repurposing
SIDER
text mining of drug labels
side-effect similarity
UMLS
PREDICT
disease–disease similarity
drug–drug similarity
disease phenotypes, gene functions, side effects, chemical
structure, protein interactions, text mining
HPO, MESH, GO
OFFSIDES
adverse event reports
ATC, UMLS
Pharmacogenomics
Can we get some novel information about drug indications (and
causal mutations) by analyzing experimental data from animal
models?
Approach
Relevant ontologies
Mammalian Phenotype Ontology
9,161 classes
manually developed
annotation of animal models
formal (EQ) definitions
Human Phenotype Ontology
9,796 classes
manually developed
annotation of diseases
formal (EQ) definitions
Challenges
1 comparison of human and mouse phenotypes
cross-species integration
how do we represent phenotypes?
2 computation of similarity
semantic similarity based on ontology taxonomy
which ontology do we use for computing similarity?
Cross-species phenotype integration
representation of MP and HPO phenotypes
PATO-based formal definitions
GO
homologous and analogous anatomical structures (UBERON)
aim: cross-species integration of phenotypes
What are phenotypes and how do we represent them (for
cross-species integration)?
Abnormal appendix: E=Appendix, Q=Abnormal
representation:
appendix with quality Abnormal
quality Abnormal of some appendix
organism with appendix that has quality Abnormal
...
inheritance of phenotypes across parthood
Abnormality of tip of appendix subclass of Abnormality of
appendix?
absence of appendix
Semantic similarity
Semantic similarity results depend on
the number of distinctions made by ontology developers
the kind of distinctions made by ontology developers
the data that is analyzed
the similarity measure
Semantic similarity
Should we compute phenotypic similarity based on the Human or
the Mammalian Phenotype Ontology (or both)? How can we
compare the results?
Ontology design decisions can be resolved empirically!
no a priori “right” way to represent phenotypes
focus on scientific results, not representation
evaluation:
empirical
objective
quantitative
external
Ontology design decisions can be resolved empirically!
finish the analysis
use known gene–disease associations as gold standard
use FDA-approved drug indications as gold standard
compare analysis results against gold standard
Semantic similarity over phenotype ontologies measures
phenotypic similarity
semantic similarity
pairwise comparison of disease and animal phenotypes
\[
\mathrm{sim}(P, D) = \frac{\sum_{x \in Cl(P) \cap Cl(D)} IC(x)}{\sum_{y \in Cl(P) \cup Cl(D)} IC(y)}
\]
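The similarity measure above can be sketched in a few lines of Python. This is a minimal illustration, not the PhenomeNET implementation: it assumes that Cl(X) is the set of annotated classes plus all of their superclasses, and it uses a toy hierarchy and made-up IC values. The names `PARENTS`, `IC`, `closure`, and `sim` are hypothetical.

```python
# Toy is-a hierarchy (child -> parents); all class names are hypothetical.
PARENTS = {
    "abnormal appendix": ["abnormal gut"],
    "abnormal colon": ["abnormal gut"],
    "abnormal gut": ["phenotype"],
    "phenotype": [],
}

# Toy information content values (assumption); in practice IC would be
# estimated from how often each class is used in annotations.
IC = {
    "phenotype": 0.0,
    "abnormal gut": 1.0,
    "abnormal appendix": 2.0,
    "abnormal colon": 2.0,
}


def closure(terms, parents):
    """Return the given classes plus all of their superclasses (assumed Cl)."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(parents.get(term, ()))
    return seen


def sim(p_terms, d_terms, parents=PARENTS, ic=IC):
    """IC-weighted Jaccard index between two sets of phenotype classes."""
    cl_p = closure(p_terms, parents)
    cl_d = closure(d_terms, parents)
    shared = sum(ic[x] for x in cl_p & cl_d)
    total = sum(ic[y] for y in cl_p | cl_d)
    return shared / total if total else 0.0


mouse_model = {"abnormal appendix"}   # phenotypes of an animal model (P)
disease = {"abnormal colon"}          # phenotypes of a disease (D)
score = sim(mouse_model, disease)
print(score)
```

Because the score depends on which superclasses exist and how informative they are, adding or removing distinctions in the ontology changes the result even when the annotations stay the same.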
PhenomeNET compares phenotypes across species
ranking of genes for each disease
candidate genes for disease
Statistical testing to rank drug–disease pairs
one-sided Wilcoxon signed rank test
result: ranking of drugs for each disease based on p-value
low p-value: mutations in mouse genes associated with a drug
result in phenotypes that are very similar to a disease
phenotype
high p-value: genes uniformly distributed across ranks
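A rough sketch of this ranking step, under stated assumptions: each gene associated with a drug has a normalized rank in [0, 1] in the disease's similarity ranking (0 = most similar), and the test asks whether those ranks lie below the midpoint 0.5. The slides do not specify these details, so the setup, the function name `signed_rank_p_less`, and the drug names are hypothetical. The exact p-value is computed by enumerating sign assignments, which is only practical for small gene sets.

```python
from itertools import product


def signed_rank_p_less(values, center=0.5):
    """Exact one-sided Wilcoxon signed-rank p-value for H1: values < center."""
    diffs = [v - center for v in values if v != center]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences and assign the average rank.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    # Test statistic: sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Null distribution: every sign assignment is equally likely.
    at_most = 0
    total = 0
    for signs in product((0, 1), repeat=len(ranks)):
        total += 1
        if sum(r for r, s in zip(ranks, signs) if s) <= w_plus:
            at_most += 1
    return at_most / total


# Hypothetical normalized ranks of each drug's genes for one disease.
gene_ranks = {
    "drug_a": [0.02, 0.05, 0.10, 0.20, 0.30],   # genes near the top
    "drug_b": [0.15, 0.90, 0.40, 0.70, 0.45],   # genes spread out
}
p_values = {drug: signed_rank_p_less(r) for drug, r in gene_ranks.items()}
drug_ranking = sorted(p_values, key=p_values.get)
print(drug_ranking, p_values)
```

For real data, a library routine such as `scipy.stats.wilcoxon` with a one-sided `alternative` would be the usual choice; the enumeration here only keeps the sketch dependency-free.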
Receiver Operating Characteristic
Source: Wikipedia
Gene-disease associations
Figure: ROC curve for PhenomeNET (initial version). x-axis: False Positive Rate; y-axis: True Positive Rate. AUC (original): 0.68
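The area under the ROC curve can be computed directly from similarity scores and a gold standard. The sketch below uses the pairwise formulation (the probability that a randomly chosen gold-standard gene scores higher than a randomly chosen other gene). The gene identifiers, the scores, and the two "ontology" labels are made up for illustration and do not reproduce the reported result.

```python
def roc_auc(scores, positives):
    """AUC: chance that a gold-standard item outscores a non-gold item."""
    pos = [s for key, s in scores.items() if key in positives]
    neg = [s for key, s in scores.items() if key not in positives]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical gold standard: genes known to be associated with a disease.
gold = {"g1", "g3"}

# Hypothetical similarity scores obtained with two different ontologies.
scores_ontology_a = {"g1": 0.9, "g2": 0.8, "g3": 0.4, "g4": 0.3, "g5": 0.1}
scores_ontology_b = {"g1": 0.9, "g2": 0.2, "g3": 0.7, "g4": 0.3, "g5": 0.1}

auc_a = roc_auc(scores_ontology_a, gold)
auc_b = roc_auc(scores_ontology_b, gold)
print(auc_a, auc_b)
```

Comparing the two AUC values against the same gold standard is one way to make a design decision between ontologies measurable.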
Representation of phenotypes for cross-species integration
'Abnormality of appendix' EquivalentTo: has-part some (part-of some (Appendix and has-quality some Quality))
organism-centric approach (has-part some)
transitivity over parthood (part-of some)
Quality used as indicator of abnormality
use of OWL EL
Representation of phenotypes for cross-species integration
'Large appendix' EquivalentTo: has-part some (Appendix and has-quality some 'Increased size')
organism-centric approach (has-part some)
no transitivity over parthood
use of OWL EL
Absence
'Absence of appendix' EquivalentTo: has-part some (Appendix and has-quality some Absent)
subclass of Abnormality of appendix
use of OWL EL
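The three definition patterns shown for abnormality, specific qualities, and absence follow a fixed template, so they can be generated from an entity (E) and a quality (Q). The sketch below only builds the class expressions as strings in the notation used on the slides; it does no reasoning, and the helper names `quote`, `abnormality_of`, and `phenotype` are hypothetical.

```python
def quote(name):
    """Quote a class name if it contains spaces, as on the slides."""
    return f"'{name}'" if " " in name else name


def abnormality_of(entity):
    """General abnormality: uses part-of, so it covers parts of the entity."""
    return (
        "has-part some (part-of some "
        f"({quote(entity)} and has-quality some Quality))"
    )


def phenotype(entity, quality):
    """Specific phenotype: the entity itself bears the given quality."""
    return f"has-part some ({quote(entity)} and has-quality some {quote(quality)})"


definitions = {
    "Abnormality of appendix": abnormality_of("Appendix"),
    "Large appendix": phenotype("Appendix", "Increased size"),
    "Absence of appendix": phenotype("Appendix", "Absent"),
}
for label, expression in definitions.items():
    print(f"{quote(label)} EquivalentTo: {expression}")
```

Generating definitions from a template keeps the representation choice in one place, which makes it easier to switch patterns and re-run the evaluation.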
Semantic similarity
Computation of semantic similarity using the Mammalian
Phenotype Ontology improves the analysis results.
problem specific
depending on mouse data
depending on the approach
depending on similarity measure
depending on gold standard dataset
Conclusion
Quantitative, external evaluation can improve ontologies and
ontology-based analysis methods.
Annotation
Definitions:
intrinsic:
having definitions
Aristotelian definitions
external:
having definitions that are easily understandable
having definitions that improve annotation consistency
criteria:
measure annotation consistency
user study
Dolan, M. E., et al. A procedure for assessing GO annotation consistency. Bioinformatics 21, i136–i143 (2005).
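One simple way to put a number on annotation consistency is the mean pairwise Jaccard overlap between the term sets that different curators assign to the same item. This is an illustrative measure chosen for brevity, not the procedure described by Dolan et al.; the curator names and term identifiers are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap between two term sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def annotation_consistency(annotations):
    """Mean pairwise Jaccard overlap across all pairs of curators."""
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("need annotations from at least two curators")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical annotations of the same gene product by three curators.
annotations = {
    "curator_a": {"T1", "T2"},
    "curator_b": {"T2", "T3"},
    "curator_c": {"T1", "T2"},
}
consistency = annotation_consistency(annotations)
print(consistency)
```

Measuring this value before and after a change to the definitions gives an external check on whether the change helped annotators.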
Annotation
Labels:
intrinsic:
singular nouns
reference to universals
external:
use of common, widely used terms
use of unambiguous terms
criteria:
measure annotation consistency
user study
recall in text
Yao, L., et al. Benchmarking Ontologies: Bigger or Better? PLoS Comput Biol 7, e1001055 (Jan. 2011).
Knowledge bases and querying
Queries:
intrinsic:
use of OWL
use of specific relations
use of upper level ontology
consistency
external:
retrieve correct answers
retrieve relevant answers
criteria:
user study (to evaluate query answers)
test set
comparison to gold standard
Boeker, M., et al. Unintended consequences of existential quantifications in biomedical ontologies. BMC Bioinformatics 12, 456 (2011).
Conclusions
My ontology is better than yours.
My ontology can do some things better than your ontology.
Quantitative criteria
Empirical, objective, quantitative, application-based evaluation will
allow us to systematically improve ontologies for science.