How to Build a Biomedical Ontology

               Success Stories
         The Gene Ontology (GO)
 SNOMED, ICD and other controlled vocabularies
        Ontology Design Principles
           Ontology Applications


             Barry Smith
  http://ontology.buffalo.edu/smith
Uses of ‘ontology’ in PubMed abstracts




                                         2
3
By far the most successful: GO (Gene Ontology)




                                           4
5
Hierarchical view of GO
representing relations
between represented types

                            6
Gene Ontology
$100 million invested in literature and database
curation using the Gene Ontology (GO)
based on the idea of annotation
over 11 million annotations relating gene
products (proteins) described in the UniProt,
Ensembl and other databases to terms in the
GO
multiple secondary uses – because the
ontology was not built to meet one specific
set of requirements
                                                 7
GO provides a controlled system of terms
for use in annotating (describing, tagging)
                    data
• multi-species, multi-disciplinary, open
  source
• contributing to the cumulativity of
  scientific results obtained by distinct
  research communities
• compare use of kilograms, meters,
  seconds in formulating experimental
  results
Sample Gene Array Data




                         9
semantic annotation of data


 where in the cell ?


   what kind of
molecular function ?


    what kind of
biological process?
                                  10
natural language labels

 to make the data cognitively
 accessible to human beings


                                11
compare: legends for maps




                            12
compare: legends for diagrams




                          13
ontologies are legends for data


                                  14
compare: legends for maps




                            15
ontologies are legends for images


                                16
what lesion ?

what brain function ?




                        17
ontologies are legends for databases
[Diagram: the databases MouseEcotope, GlyProt, DiabetInGene and GluChem around the shared term ‘sphingolipid transporter activity’]
annotation using common ontologies
   yields integration of databases
[Diagram: the databases MouseEcotope, GlyProt, DiabetInGene and GluChem around the shared term ‘Holliday junction helicase complex’]
annotation using common ontologies
  can support comparison of data




                             20
annotation with Gene Ontology
supports reusability of data
supports search of data by humans
supports comparison of data
supports aggregation of data
supports reasoning with data by humans
  and machines
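
A minimal sketch of how annotation with shared terms supports search and aggregation across databases. The record layout, the gene-product identifiers and the helper name `group_by_term` are hypothetical; only the database names and the two term labels come from the slides.

```python
from collections import defaultdict

# Hypothetical annotation records: (database, gene product id, GO term label).
# The gene product identifiers are invented for illustration.
annotations = [
    ("MouseEcotope", "gp-001", "sphingolipid transporter activity"),
    ("GlyProt", "gp-417", "sphingolipid transporter activity"),
    ("DiabetInGene", "gp-093", "Holliday junction helicase complex"),
    ("GluChem", "gp-220", "sphingolipid transporter activity"),
]


def group_by_term(records):
    """Aggregate gene products from all databases under each shared term."""
    grouped = defaultdict(list)
    for database, product, term in records:
        grouped[term].append((database, product))
    return dict(grouped)


by_term = group_by_term(annotations)
for term, hits in by_term.items():
    print(term, "->", hits)
```

Because every database uses the same label for the same type, a single dictionary lookup retrieves comparable records from all of them.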


                                         21
22
The goal: virtual science
• consistent (non-redundant) annotation
• cumulative (additive) annotation

 yielding, by incremental steps, a
 virtual map of the entirety of reality
 that is accessible to computational
 reasoning


                                          23
This goal is realizable if we have a
  common ontology framework
 data is retrievable
 data is comparable
 data is integratable


   only to the degree that it is annotated
   using a common controlled vocabulary
   – compare the role of seconds, meters,
   kilograms … in unifying science
                                         24
To achieve this end we have to engage
   in something like philosophy (?)




  is this the right way to organize the top level of this
  portion of the GO?
  how does the top level of this ontology relate to
  the top levels of other, neighboring ontologies?
Strategy for doing this
see the world as organized via
types/universals/categories which are
hierarchically organized

and in relation to which statements
can be formulated which are
universally true of all instances:

  cell membrane part_of cell
[Figure: Foundational Model of Anatomy Ontology. Nodes: Anatomical Space, Anatomical Structure, Organ Cavity, Organ Cavity Subdivision, Organ, Organ Part, Serous Sac Cavity, Serous Sac Cavity Subdivision, Serous Sac, Organ Component, Organ Subdivision, Tissue, Pleural Sac, Pleura (Wall of Sac), Pleural Cavity, Parietal Pleura, Visceral Pleura, Interlobar recess, Mediastinal Pleura, Mesothelium of Pleura. Relations shown: is_a, part_of.]
[Diagram: hierarchy of species and genera: substance > organism > animal > mammal > cat > siamese, with frog as a further branch; instances at the bottom]
29
the problem of continuity of care:
                   patients move around
with thanks to http://dbmotion.com

synchronic and diachronic problems of
       semantic interoperability
    (across space and across time)
                                        31
[Diagram: EHR 1, EHR 2]
     how can we link EHR 1 to EHR 2 in a
    reliable, trustworthy, useful way, which
        both systems can understand ?

                                               32
[Diagram: ICD, EHR 1, EHR 2]

           the ideal solution:
    WHO International Classification of
                Diseases
                                          33
ICD
PRO:
De facto US billing standard
Multilanguage
CON:
De facto US billing standard (corrupts data)
No definitions of terms, and so difficult to
 judge accuracy of hierarchy and of coding
Inconsistent hierarchies
Hard to reason with results
Hence few secondary uses e.g. for research

                                                34
ICD 11
The (ontology-based) plan
multiple views including
 ◦ billing
 ◦ public health statistics
 ◦ research
 ◦ SNOMED compatibility




                              35
[Diagram: SNOMED-CT, EHR 1, EHR 2]
              the ideal solution:
    a single universal clinical vocabulary


                                             36
SNOMED CT: Systematized Nomenclature of
Medicine-Clinical Terms

PRO:
  International standard (sort of)
  Huge resource
  Free for member countries
  Multi-language (including Spanish)



                                          37
SNOMED CT
CON
Huge (but redundant ... and gappy)
Contains many examples of false synonymy
Still in need of work
  ◦   No consistent interpretation of relations
  ◦   Many erroneous relation assertions
  ◦   Many idiosyncratic relations
  ◦   Mixes ontology with epistemology
  ◦   It contains numerous compound terms (e.g., test for X)
      without the constituent terms (here: X), even where the
      latter are of obvious salience
                                                                38
SNOMED CT
Coding with SNOMED-CT is unreliable and
 inconsistent
Multi-stage multi-committee process for adding
 terms that follows intuitive rules and not formal
 principles
Does there exist a strategy for evolutionary
 improvement?




                                                     39
[Diagram: SNOMED-CT, EHR 1, EHR 2]

 above all: SNOMED CT cannot solve the
problem of continuity of care because it has
           too much redundancy
                                               40
[Diagram: SNOMED-CT, EHR 1, EHR 2]
    AND because it is used only in certain
                countries

                                             41
[Diagram: Unified Medical Language System (UMLS), EHR 1, EHR 2]
    link EHR 1 to EHR 2 through a snapshot of
     the patient’s condition which both systems
                   can understand

                                              42
Unified Medical Language System (UMLS)
   UMLS is not unified, not a language, not a
   system (and not only medical); it is an
   aggregation
   If we use something like UMLS as reference
   terminology, we will not solve the translation
   problem
          EN



                                   DE
R T U New York State
           Center of Excellence in
           Bioinformatics & Life
           Sciences
UMLS approach to countering silo formation
 – By ‘linking between different clinical or biomedical
   vocabularies’
 – However: ‘… the Metathesaurus does not represent a
   comprehensive NLM-authored ontology of biomedicine or a
   single consistent view of the world. The Metathesaurus
   preserves the many views of the world present in its source
   vocabularies because these different views may be useful for
   different tasks.’

   http://www.nlm.nih.gov/pubs/factsheets/umlsmeta.html
Prospective standardization is a
           good thing
Prospective standardization is the only thing
  which will work in mission critical domains
Prospective standardization means that
  certain limits to tolerance must be imposed,
Need for top-down governance to ensure
  common architecture and resolution of
  border disputes in areas of overlap between
  domains
                                                 46
Principles of Best Practice in
   Ontology Development




                                 47
Problem of ensuring sensible
   cooperation in a massively
   interdisciplinary community
Consider multiple uses of technical terms
 such as
      − type
      − concept
      − instance
      − model
      − representation
      − data
                                            48
Three Levels

L3. Words, models (published
  representations, ontologies, databases ...)

L2. Ideas (concepts, thoughts, memories, ...)

L1. Things (cells, planets, processes of cell
  division ...)

                                                49
Entity =def

anything which exists, including things and
processes, functions and qualities, beliefs
and actions, documents and software
(entities on levels 1, 2 and 3)




                                              50
First basic distinction among entities

           type vs. instance

        (science text vs. diary)

    (human being vs. Tom Cruise)

                                         51
For ontologies


 it is generalizations that are
important = types, universals,
          kinds, species


                                  52
Catalog vs. inventory




A   515287   DC3300 Dust Collector Fan
B   521683   Gilmer Belt
C   521682   Motor Drive Belt
                                         53
An ontology is a representation
           of types

We learn about types in reality from looking
at the results of scientific experiments in the
form of scientific theories
experiments relate to what is particular
science describes what is general



                                                  54
Ontology =def.
   a representational artifact whose representational
   units (which may be drawn from a natural or from
   some formalized language) are intended to represent
       1. types in reality
       2. those relations between these types which
   obtain universally (= for all instances)
       lung is_a anatomical structure
       lobe of lung part_of lung
in accordance with our best current established science
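
A minimal sketch of what it means for a relation between types to obtain "for all instances". It assumes one common reading of type-level part_of: every instance of the first type is part of some instance of the second. The instance identifiers and the function name `holds_universally` are hypothetical.

```python
# Hypothetical instance data: instance -> type, plus instance-level parthood.
instance_of = {
    "lobe#1": "lobe of lung",
    "lobe#2": "lobe of lung",
    "lung#1": "lung",
}
instance_part_of = {("lobe#1", "lung#1"), ("lobe#2", "lung#1")}


def holds_universally(part_type, whole_type):
    """Check that every instance of part_type is part of some instance
    of whole_type (the reading assumed in this sketch)."""
    for inst, typ in instance_of.items():
        if typ != part_type:
            continue
        wholes = [
            whole
            for (part, whole) in instance_part_of
            if part == inst and instance_of.get(whole) == whole_type
        ]
        if not wholes:
            return False
    return True


result = holds_universally("lobe of lung", "lung")
print(result)
```

A single counterexample instance (a lobe recorded without any lung) would make the check fail, which is why only universally valid assertions belong in the ontology.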

                                                          55
[Diagram: hierarchy of types: object > organism > animal > mammal > cat > siamese, with frog as a further branch; instances at the bottom]
Domain =def

a portion of reality that forms the subject-
matter of a single science or technology or
mode of study or administrative practice:
   proteomics
   epidemiology
   C2
   M&S
                                               57
Representation =def

an image, idea, map, picture, name or
description ... of some entity or entities.




                                              58
Ontologies are representational
            artifacts

  comparable to science texts
and subject to the same sorts of
 constraints (including need for
             update)


                                   59
Representational units =def

terms, icons, alphanumeric identifiers ...
which refer, or are intended to refer, to
entities
and which are minimal (atoms)




                                             60
Composite representation =def

representation
  (1) built out of representational units
which
  (2) form a structure that mirrors, or is intended
  to mirror, the entities in some domain




                                                      61
The Periodic Table




                         62
Ontologies are here




                      63
or here




          64
Ontologies represent general
  structures in reality (leg)




                                65
Ontologies do not represent
concepts in people’s heads




                              66
They represent types in reality




                                  67
How do we know which general
    terms designate types?

Types are repeatables:

   cell, electron, weapon, F16 ...

Instances are one-off:

   Bill Clinton, this laptop, this handwave
                                              68
Problem
The same general term can be used to
refer both to types and to collections of
particulars. Consider:

HIV is an infectious retrovirus
HIV is spreading very rapidly through Asia


                                             69
Class =def

a maximal collection of particulars
determined by a general term
(‘cell’, ‘electron’ but also: ‘restaurant in
Palo Alto’, ‘Italian’)

the class A
= the collection of all particulars x for
which ‘x is A’ is true
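
A minimal sketch of this definition: a class is computed as the set of particulars that satisfy a general term. The particulars, their recorded properties and the helper name `class_of` are hypothetical.

```python
# Hypothetical particulars with a few recorded properties.
particulars = [
    {"name": "c1", "kind": "cell"},
    {"name": "e1", "kind": "electron"},
    {"name": "r1", "kind": "restaurant", "city": "Palo Alto"},
    {"name": "r2", "kind": "restaurant", "city": "Buffalo"},
]


def class_of(x_is_a):
    """The class A: the collection of all particulars x for which
    'x is A' is true, with 'x is A' given as a predicate."""
    return {x["name"] for x in particulars if x_is_a(x)}


cells = class_of(lambda x: x["kind"] == "cell")
palo_alto_restaurants = class_of(
    lambda x: x["kind"] == "restaurant" and x.get("city") == "Palo Alto"
)
print(cells, palo_alto_restaurants)
```

Any predicate determines a class in this sense, which is why classes are far more numerous than types.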

                                                 70
types vs. their extensions



          types


{c,d,e,...}  collections of particulars




                                          71
Extension

=def The extension of a type is the class of its
 instances




                                                   72
types vs. classes



         types


{c,d,e,...}        classes




                             73
types vs. classes



           types


       extensions          other sorts of classes



compare: ‘natural kinds’
                                                    74
types vs. classes



types


             populations, ...
         the class of all diabetic
         patients in Leipzig on 4
         June 1952

                                     75
OWL is a good representation of
             classes

• F16s
• sibling of Finnish spy
• member of Abba aged > 50 years




                                    76
types, classes, concepts

types


        classes


         ‘concepts’        ?



                               77
types < classes < ‘concepts’ ?

Cases of ‘concepts’ which, some people say,
 do not correspond to classes:
     ‘Cancelled oophorectomy’
     ‘Absent nipple’
     ‘Unlocalized ligand’
A cancelled oophorectomy is not a special
  kind of conceptual oophorectomy
Use: Information Artifact Ontology (IAO)
                                              78
Principle of Low Hanging Fruit

Include even absolutely trivial assertions
(assertions you know to be universally true)

   pneumococcal virus is_a virus

Computers need to be led by the hand


                                               79
Example: MeSH

MeSH Descriptors
 Index Medicus Descriptor
   Anthropology, Education, Sociology and
   Social Phenomena (MeSH Category)
      Social Sciences
           Political Systems
                  National Socialism

National Socialism is_a Political Systems
National Socialism is_a Anthropology ...
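
A minimal sketch of how a machine reads the MeSH path above if every parent link is treated as is_a. The parent links are taken from the slide; the function name `ancestors` is hypothetical.

```python
# Parent links read off the MeSH tree path on the slide (child -> parent).
parent = {
    "National Socialism": "Political Systems",
    "Political Systems": "Social Sciences",
    "Social Sciences": "Anthropology, Education, Sociology and Social Phenomena",
}


def ancestors(term):
    """Walk the parent links upward, treating each link as is_a."""
    chain = []
    while term in parent:
        term = parent[term]
        chain.append(term)
    return chain


for ancestor in ancestors("National Socialism"):
    print(f"National Socialism is_a {ancestor}")
```

The mechanical walk produces every assertion on the path, including the ones a human reader would reject.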

                                            80
Principle of Singular Nouns

   Terms in ontologies represent types

  Goal: Each term in an ontology should
         represent exactly one type

Thus every term should be a singular noun


                                            81
Principle: do not commit the use-
        mention confusion

mouse =def. common name for the species
 Mus musculus



swimming is healthy and has eight letters


                                            82
Principle: do not commit the use-
        mention confusion
 Avoid confusing words with things
 Avoid confusing concepts in our
 minds with entities in reality

 Recommendation: avoid the word ‘concept’
 entirely


                                            83
Trialbank


‘information’ = def. ‘a written or spoken
   designation of a concept’




                                            84
Trialbank

‘Heparin therapy’ is an instance of ‘written or
  spoken designation of a concept’

  What are the problems here?
    1. misuse of quotation marks
    2. confusion of instances and types
    3. confusion of concept and reality

                                                  85
Principle: beware of
     terminological baggage
For the sake of interoperability with other
ontologies, do not give special meanings to
terms with established general meanings

(Don’t use ‘cell’ when you mean ‘plant cell’)




                                                86
ICNP: International Classification for
 Nursing Practice (old version)
 water =def. a type of Nursing Phenomenon
 of Physical Environment with the specific
 characteristics: clear liquid compound of
 hydrogen and oxygen that is essential for
 most plant and animal life influencing life
 and development of human beings.


                                               87
Principle of definitions

Supply definitions for every term
1. a human-understandable natural-language
   definition
2. an equivalent formal definition




                                          88
Principle: definitions must be unique


Each term should have exactly one definition


it may have both natural-language and
   formal versions

(issue with ontologies which exist with
  different levels of expressivity)
                                               89
The Problem of Circularity

A Person =def. A person with an identity
               document

Hemolysis =def. The causes of hemolysis




                                           90
Principle of non-circularity


The term defined should not appear in its
             own definition
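
A minimal sketch of an automated check for this principle. It only looks for the term as a whole word inside its own definition; the function name `is_circular` and the third, non-circular example are hypothetical.

```python
import re


def is_circular(term, definition):
    """Flag a definition that contains the term it defines (case-insensitive)."""
    pattern = r"\b" + re.escape(term.lower()) + r"\b"
    return re.search(pattern, definition.lower()) is not None


examples = {
    "Person": "A person with an identity document",
    "Hemolysis": "The causes of hemolysis",
    "Lung": "An organ of respiration",  # hypothetical non-circular definition
}

for term, definition in examples.items():
    print(term, "circular" if is_circular(term, definition) else "ok")
```

A word-level test of this kind catches only direct circularity; circles that run through several definitions would need a graph-based check.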




                                            91
Example: HL7

‘stopping a medication’ = def.
  change of state in the record of a
  Substance Administration Act from
  Active to Aborted




                                       92
Principle of Increase in
        Understandability
A definition should use only terms which are
easier to understand than the term defined

Definitions should not make simple things
more difficult than they are




                                               93
Generalized Tarski principle
(a good, general constraint on a
      theory of meaning)
   For each linguistic expression ‘E’


         ‘E’ means E
     ‘snow’ means: snow
‘pneumonia’ means: pneumonia
                                        94
HL7 Reference Information Model
‘medication’ does not mean: medication
rather it means:
  the record of medication in an information
  system

‘disease’ does not mean: disease
rather it means:
  the observation of a disease


                                               95
Principle of Acknowledging Primitives

 In every ontology some terms and some
 relations are primitive = they cannot be
 defined (on pain of infinite regress)
Examples of primitive relations:
  identity
  instance_of


                                            96
Principle of Aristotelian Definitions

        Use Aristotelian definitions

           An A is a B which C’s.

A human being is an animal which is rational
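
A minimal sketch of the Aristotelian pattern as a data structure: each definition names a parent type (the B) and a differentia (the C). The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AristotelianDefinition:
    term: str         # the A being defined
    genus: str        # the B: the parent type
    differentia: str  # the C: what marks out A's among B's

    def text(self) -> str:
        return f"{self.term} =def. {self.genus} which {self.differentia}"


human = AristotelianDefinition("human being", "an animal", "is rational")
print(human.text())  # human being =def. an animal which is rational
```

Storing the genus as a separate field means each definition also records the term's single is_a parent.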



                                               97
Rules for Formulating Terms
Avoid abbreviations even when it is clear in
  context what they mean (‘breast’ for
  ‘breast tumor’)
Avoid acronyms
Avoid mass terms (‘tissue’, ‘brain mapping’,
  ‘clinical research’ ...)
Treat each term ‘A’ in an ontology as
  shorthand for a term of the form ‘the type
  A’

                                               98
Univocity
Terms should have the same meanings on
  every occasion of use.
(= They should refer to the same types)
Basic ontological relations such as is_a and
  part_of should be used in the same way
  by all ontologies



                                               99
Universality

Ontologies are made of relational
assertions
They should include only those which hold
universally




                                            100
Universality


Often, order will matter:

We can assert
  adult transformation_of child
but not
  child transforms_into adult

                                  101
Universality


 viral pneumonia caused by virus

but not
  virus causes pneumonia
  pneumococcal virus causes pneumonia



                                        102
Principle of Universality



 results analysis later_than protocol-design

but not

 protocol-design earlier_than results
 analysis
                                               103
Principle of Positivity
Complements of types are not themselves
 types.

Terms such as
 non-mammal
 non-membrane
 other metalworker in New Zealand
do not designate types in reality

                                          104
Generalized Anti-Boolean Principle

There are no conjunctive and disjunctive
 types:

 anatomic structure, system, or substance
 musculoskeletal and connective tissue
 disorder



                                            105
Objectivity
Which types exist in reality is not a function
  of our knowledge.
Terms such as
  unknown
  unclassified
  unlocalized
  arthropathies not otherwise specified
do not designate types in reality.
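
A minimal sketch of a label check covering the principles of Positivity, the Generalized Anti-Boolean Principle and Objectivity. The patterns are rough heuristics assumed for illustration and would need curation in practice; the names `RULES` and `violations` are hypothetical.

```python
import re

# Heuristic patterns, assumed for illustration only.
RULES = {
    "positivity": re.compile(r"^(non-|other )", re.IGNORECASE),
    "anti-boolean": re.compile(r"\b(and|or)\b", re.IGNORECASE),
    "objectivity": re.compile(
        r"\b(unknown|unclassified|unlocalized|not otherwise specified)\b",
        re.IGNORECASE,
    ),
}


def violations(term):
    """Return the names of the principles a term label appears to violate."""
    return [name for name, pattern in RULES.items() if pattern.search(term)]


for label in [
    "non-mammal",
    "other metalworker in New Zealand",
    "anatomic structure, system, or substance",
    "arthropathies not otherwise specified",
    "lung",
]:
    print(label, "->", violations(label))
```

Flagged labels are candidates for review rather than automatic rejections, since a string match cannot decide whether a term designates a type.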
                                                 106
Keep Epistemology Separate from
             Ontology
If you want to say that
   We do not know where A’s are located
do not invent a new class of
   A’s with unknown locations
   (A well-constructed ontology should grow
   linearly; it should not need to delete classes
   or relations because of increases in
   knowledge)
                                                    107
Keep Sentences Separate from
           Terms
If you want to say

 I surmise that this is a case of pneumonia

do not invent a new class of surmised
  pneumonias

Confusion of ‘findings’ in medical terminologies
                                              108
Single Inheritance

No kind in a classificatory hierarchy
should be asserted to have more
than one is_a parent on the
immediate higher level




                                        109
Multiple Inheritance

                    thing



blue thing                     car

             is_a           is_a

               blue car
                                     110
Multiple Inheritance


is a source of errors
encourages laziness
serves as an obstacle to integration with
  neighboring ontologies
hampers use of Aristotelian methodology for
  defining terms
hampers use of statistical search tools
                                              111
Multiple Inheritance

                     thing



blue thing                       car

             is_a1           is_a2

                blue car
                                       112
Principle of asserted single
            inheritance
Each reference ontology module should be
built as an asserted monohierarchy (a
hierarchy in which each term has at most
one parent)
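
A minimal sketch of a check for asserted single inheritance, using the ‘blue car’ example from the earlier slides. It assumes the upper links in that diagram are also is_a links; the function name `multiple_parents` is hypothetical.

```python
from collections import defaultdict

# Asserted is_a links as (child, parent), following the 'blue car' diagram.
asserted_is_a = [
    ("blue thing", "thing"),
    ("car", "thing"),
    ("blue car", "blue thing"),
    ("blue car", "car"),
]


def multiple_parents(links):
    """Return the terms that have more than one asserted is_a parent."""
    parents = defaultdict(set)
    for child, parent in links:
        parents[child].add(parent)
    return {c: sorted(p) for c, p in parents.items() if len(p) > 1}


offenders = multiple_parents(asserted_is_a)
print(offenders)  # {'blue car': ['blue thing', 'car']}
```

The check applies to the asserted hierarchy only; additional parents may still be inferred by a reasoner.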

Asserted hierarchy vs. inferred hierarchy



                                            113
Principle of normalization

Polyhierarchies should be decomposable
into homogeneous disjoint monohierarchies




                                            114
Principle of instantiability

A term should be included in an ontology
only if there is evidence that instances to
which that term refers exist or have existed
or can exist in reality.

   Fist
   Crowd

                                               115
Avoid mass nouns

Count nouns = an organism, a planet, a
 handshake
Mass nouns = tissue, information, discourse

Mass nouns almost always go hand in hand
 with ontological confusion


                                              116
is_a Overloading


The success of ontology alignment
demands that ontological relations (is_a,
part_of, ...) have the same meanings in the
different ontologies to be aligned.




                                              117
Multiple Inheritance

                     thing



blue thing                       car

             is_a1           is_a2

                blue car
                                       118
How to solve this problem

Create two ontologies:
  of cars
  of colors
Link the two together via cross-products
(= factoring, normalization, modularization)
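
A minimal sketch of the cross-product idea: two asserted monohierarchies are kept separate, and a compound term such as ‘blue car’ is represented as a pair drawn from both. All hierarchy contents beyond ‘car’ and ‘blue’, and the function names, are hypothetical.

```python
# Two asserted monohierarchies (child -> single parent).
# Contents beyond 'car' and 'blue' are invented for illustration.
car_is_a = {"car": "vehicle", "sports car": "car"}
color_is_a = {"blue": "color", "navy blue": "blue"}


def ancestors_or_self(term, hierarchy):
    """The term followed by all of its is_a ancestors in one hierarchy."""
    chain = [term]
    while term in hierarchy:
        term = hierarchy[term]
        chain.append(term)
    return chain


def subsumes(general, specific):
    """A cross-product term (car type, color) subsumes another if each
    component subsumes the corresponding component."""
    general_car, general_color = general
    specific_car, specific_color = specific
    return (
        general_car in ancestors_or_self(specific_car, car_is_a)
        and general_color in ancestors_or_self(specific_color, color_is_a)
    )


blue_car = ("car", "blue")
print(subsumes(blue_car, ("sports car", "navy blue")))
```

The multiple parents of a compound term are inferred from its components, so neither source hierarchy has to assert more than one parent per term.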



                                               119
Compositionality
The meanings of compound terms should
  be determined
  1. by the meanings of component terms
together with
  2. the rules governing syntax




                                          120
User feedback principle

An ontology should evolve on the basis of
feedback derived from those who are using
the ontology, for example for purposes of
annotation.




                                            121

Contenu connexe

En vedette

Restaurant and food ontologies
Restaurant and food ontologiesRestaurant and food ontologies
Restaurant and food ontologiesAnna Fensel
 
Rdf In A Nutshell V1
Rdf In A Nutshell V1Rdf In A Nutshell V1
Rdf In A Nutshell V1Fabien Gandon
 
On Beyond OWL: challenges for ontologies on the Web
On Beyond OWL: challenges for ontologies on the WebOn Beyond OWL: challenges for ontologies on the Web
On Beyond OWL: challenges for ontologies on the WebJames Hendler
 
The Role Of Ontology In Modern Expert Systems Dallas 2008
The Role Of Ontology In Modern Expert Systems   Dallas   2008The Role Of Ontology In Modern Expert Systems   Dallas   2008
The Role Of Ontology In Modern Expert Systems Dallas 2008Jason Morris
 

En vedette (7)

OWL and OBO
OWL and OBOOWL and OBO
OWL and OBO
 
Restaurant and food ontologies
Restaurant and food ontologiesRestaurant and food ontologies
Restaurant and food ontologies
 
Rdf In A Nutshell V1
Rdf In A Nutshell V1Rdf In A Nutshell V1
Rdf In A Nutshell V1
 
Ontology Learning
Ontology LearningOntology Learning
Ontology Learning
 
On Beyond OWL: challenges for ontologies on the Web
On Beyond OWL: challenges for ontologies on the WebOn Beyond OWL: challenges for ontologies on the Web
On Beyond OWL: challenges for ontologies on the Web
 
The Role Of Ontology In Modern Expert Systems Dallas 2008
The Role Of Ontology In Modern Expert Systems   Dallas   2008The Role Of Ontology In Modern Expert Systems   Dallas   2008
The Role Of Ontology In Modern Expert Systems Dallas 2008
 
Semantic web introduction
Semantic web introductionSemantic web introduction
Semantic web introduction
 

Similaire à Biomedical ontology tutorial_atlanta_june2011_part1

Tutorial what is_an_ontology_ncbo_march_2012
Tutorial what is_an_ontology_ncbo_march_2012Tutorial what is_an_ontology_ncbo_march_2012




Biomedical ontology tutorial_atlanta_june2011_part1

  • 1. How to Build a Biomedical Ontology Success Stories The Gene Ontology (GO) SNOMED, ICD and other controlled vocabularies Ontology Design Principles Ontology Applications Barry Smith http://ontology.buffalo.edu/smith
  • 2. Uses of ‘ontology’ in PubMed abstracts 2
  • 3. 3
  • 4. By far the most successful: GO (Gene Ontology) 4
  • 5. 5
  • 6. Hierarchical view of GO representing relations between represented types 6
  • 7. Gene Ontology $100 mill. invested in literature and database curation using the Gene Ontology (GO) based on the idea of annotation over 11 million annotations relating gene products (proteins) described in the UniProt, Ensembl and other databases to terms in the GO multiple secondary uses – because the ontology was not built to meet one specific set of requirements 7
  • 8. GO provides a controlled system of terms for use in annotating (describing, tagging) data • multi-species, multi-disciplinary, open source • contributing to the cumulativity of scientific results obtained by distinct research communities • compare use of kilograms, meters, seconds in formulating experimental results 8
  • 10. semantic annotation of data where in the cell ? what kind of molecular function ? what kind of biological process? 10
  • 11. natural language labels to make the data cognitively accessible to human beings 11
  • 13. compare: legends for diagrams 13
  • 14. ontologies are legends for data 14
  • 16. ontologies are legends for images 16
  • 17. what lesion ? what brain function ? 17
  • 18. ontologies are legends for databases MouseEcotope GlyProt sphingolipid transporter activity DiabetInGene GluChem 18
  • 19. annotation using common ontologies yields integration of databases MouseEcotope GlyProt Holliday junction helicase complex DiabetInGene GluChem 19
  • 20. annotation using common ontologies can support comparison of data 20
  • 21. annotation with Gene Ontology supports reusability of data supports search of data by humans supports comparison of data supports aggregation of data supports reasoning with data by humans and machines 21
  • 22. 22
  • 23. The goal: virtual science • consistent (non-redundant) annotation • cumulative (additive) annotation yielding, by incremental steps, a virtual map of the entirety of reality that is accessible to computational reasoning 23
  • 24. This goal is realizable if we have a common ontology framework: data is retrievable, comparable and integratable only to the degree that it is annotated using a common controlled vocabulary – compare the role of seconds, meters, kilograms … in unifying science 24
  • 25. To achieve this end we have to engage in something like philosophy (?) is this the right way to organize the top level of this portion of the GO? how does the top level of this ontology relate to the top levels of other, neighboring ontologies? 25
  • 26. Strategy for doing this see the world as organized via types/universals/categories which are hierarchically organized and in relation to which statements can be formulated which are universally true of all instances: cell membrane part_of cell 26
  • 27. Foundational Model of Anatomy Ontology [figure: is_a and part_of hierarchy from Anatomical Structure and Anatomical Space down to Pleural Sac, Pleura (Wall of Sac), Pleural Cavity, Parietal Pleura, Visceral Pleura, Interlobar Recess, Mediastinal Pleura and Mesothelium of Pleura] 27
  • 28. species, substance genera organism animal mammal cat frog siamese instances 28
  • 29. 29
  • 30. the problem of continuity of care: patients move around with thanks to http://dbmotion.com 30
  • 31. synchronic and diachronic problems of semantic interoperability (across space and across time) 31
  • 32. [diagram: EHR 1, EHR 2] how can we link EHR 1 to EHR 2 in a reliable, trustworthy, useful way, which both systems can understand? 32
  • 33. [diagram: EHR 1, ICD, EHR 2] the ideal solution: WHO International Classification of Diseases 33
  • 34. ICD. PRO: de facto US billing standard; multi-language. CON: de facto US billing standard (corrupts data); no definitions of terms, and so difficult to judge accuracy of hierarchy and of coding; inconsistent hierarchies; hard to reason with results; hence few secondary uses, e.g. for research 34
  • 35. ICD 11 The (ontology-based) plan multiple views including ◦ billing ◦ public health statistics ◦ research ◦ SNOMED compatibility 35
  • 36. [diagram: EHR 1, SNOMED-CT, EHR 2] the ideal solution: a single universal clinical vocabulary 36
  • 37. SNOMED CT: Systematized Nomenclature of Medicine-Clinical Terms PRO: International standard (sort of) Huge resource Free for member countries Multi-language (including Spanish) 37
  • 38. SNOMED CT CON Huge (but redundant ... and gappy) Contains many examples of false synonymy Still in need of work ◦ No consistent interpretation of relations ◦ Many erroneous relation assertions ◦ Many idiosyncratic relations ◦ Mixes ontology with epistemology ◦ It contains numerous compound terms (e.g., test for X) without the constituent terms (here: X), even where the latter are of obvious salience 38
  • 39. SNOMED CT Coding with SNOMED-CT is unreliable and inconsistent Multi-stage multi-committee process for adding terms that follows intuitive rules and not formal principles Does there exist a strategy for evolutionary improvement? 39
  • 40. [diagram: EHR 1, SNOMED-CT, EHR 2] and above all: SNOMED CT cannot solve the problem of continuity of care because it has too much redundancy 40
  • 41. [diagram: EHR 1, SNOMED-CT, EHR 2] AND because it is used only in certain countries 41
  • 42. [diagram: EHR 1, Unified Medical Language System (UMLS), EHR 2] link EHR 1 to EHR 2 through a snapshot of the patient’s condition which both systems can understand 42
  • 43. Unified Medical Language System (UMLS): UMLS is not unified, not a language, not a system (and not only medical); it is an aggregation. If we use something like UMLS as reference terminology, we will not solve the translation problem
  • 44. New York State Center of Excellence in Bioinformatics & Life Sciences. UMLS approach to countering silo formation: by ‘linking between different clinical or biomedical vocabularies’. However: ‘… the Metathesaurus does not represent a comprehensive NLM-authored ontology of biomedicine or a single consistent view of the world. The Metathesaurus preserves the many views of the world present in its source vocabularies because these different views may be useful for different tasks.’ http://www.nlm.nih.gov/pubs/factsheets/umlsmeta.html
  • 46. Prospective standardization is a good thing. Prospective standardization is the only thing which will work in mission critical domains. Prospective standardization means that certain limits to tolerance must be imposed. Need for top-down governance to ensure common architecture and resolution of border disputes in areas of overlap between domains 46
  • 47. Principles of Best Practice in Ontology Development 47
  • 48. Problem of ensuring sensible cooperation in a massively interdisciplinary community Consider multiple uses of technical terms such as − type − concept − instance − model − representation − data 48
  • 49. Three Levels L3. Words, models (published representations, ontologies, databases ...) L2. Ideas (concepts, thoughts, memories, ...) L1. Things (cells, planets, processes of cell division ...) 49
  • 50. Entity =def anything which exists, including things and processes, functions and qualities, beliefs and actions, documents and software (entities on levels 1, 2 and 3) 50
  • 51. First basic distinction among entities type vs. instance (science text vs. diary) (human being vs. Tom Cruise) 51
  • 52. For ontologies it is generalizations that are important = types, universals, kinds, species 52
  • 53. Catalog vs. inventory A 515287 DC3300 Dust Collector Fan B 521683 Gilmer Belt C 521682 Motor Drive Belt 53
  • 54. An ontology is a representation of types We learn about types in reality from looking at the results of scientific experiments in the form of scientific theories experiments relate to what is particular science describes what is general 54
  • 55. Ontology =def. a representational artifact whose representational units (which may be drawn from a natural or from some formalized language) are intended to represent 1. types in reality 2. those relations between these types which obtain universally (= for all instances) lung is_a anatomical structure lobe of lung part_of lung in accordance with our best current established science 55
  • 56. types object organism animal mammal cat frog siamese instances 56
  • 57. Domain =def a portion of reality that forms the subject-matter of a single science or technology or mode of study or administrative practice: proteomics epidemiology C2 M&S 57
  • 58. Representation =def an image, idea, map, picture, name or description ... of some entity or entities. 58
  • 59. Ontologies are representational artifacts comparable to science texts and subject to the same sorts of constraints (including need for update) 59
  • 60. Representational units =def terms, icons, alphanumeric identifiers ... which refer, or are intended to refer, to entities and which are minimal (atoms) 60
  • 61. Composite representation =def representation (1) built out of representational units which (2) form a structure that mirrors, or is intended to mirror, the entities in some domain 61
  • 62. The Periodic Table 62
  • 64. or here 64
  • 65. Ontologies represent general structures in reality (leg) 65
  • 66. Ontologies do not represent concepts in people’s heads 66
  • 67. They represent types in reality 67
  • 68. How do we know which general terms designate types? Types are repeatables: cell, electron, weapon, F16 ... Instances are one-off: Bill Clinton, this laptop, this handwave 68
  • 69. Problem The same general term can be used to refer both to types and to collections of particulars. Consider: HIV is an infectious retrovirus HIV is spreading very rapidly through Asia 69
  • 70. Class =def a maximal collection of particulars determined by a general term (‘cell’, ‘electron’ but also: ‘restaurant in Palo Alto’, ‘Italian’) the class A = the collection of all particulars x for which ‘x is A’ is true 70
  • 71. types vs. their extensions: types; {...} collections of particulars 71
  • 72. Extension =def The extension of a type is the class of its instances 72
  • 73. types vs. classes types {c,d,e,...} classes 73
  • 74. types vs. classes types extensions other sorts of classes compare: ‘natural kinds’ 74
  • 75. types vs. classes types populations, ... the class of all diabetic patients in Leipzig on 4 June 1952 75
  • 76. OWL is a good representation of classes • F16s • sibling of Finnish spy • member of Abba aged > 50 years 76
  • 77. types, classes, concepts types classes ‘concepts’ ? 77
  • 78. types < classes < ‘concepts’ ? Cases of ‘concepts’ which, some people say, do not correspond to classes: ‘Cancelled oophorectomy’ ‘Absent nipple’ ‘Unlocalized ligand’ A cancelled oophorectomy is not a special kind of conceptual oophorectomy Use: Information Artifact Ontology (IAO) 78
  • 79. Principle of Low Hanging Fruit Include even absolutely trivial assertions (assertions you know to be universally true) pneumococcal virus is_a virus Computers need to be led by the hand 79
  • 80. Example: MeSH MeSH Descriptors Index Medicus Descriptor Anthropology, Education, Sociology and Social Phenomena (MeSH Category) Social Sciences Political Systems National Socialism National Socialism is_a Political Systems National Socialism is_a Anthropology ... 80
  • 81. Principle of Singular Nouns Terms in ontologies represent types Goal: Each term in an ontology should represent exactly one type Thus every term should be a singular noun 81
  • 82. Principle: do not commit the use-mention confusion mouse =def. common name for the species mus musculus swimming is healthy and has eight letters 82
  • 83. Principle: do not commit the use-mention confusion Avoid confusing words with things Avoid confusing concepts in our minds with entities in reality Recommendation: avoid the word ‘concept’ entirely 83
  • 84. Trialbank ‘information’ = def. ‘a written or spoken designation of a concept’ 84
  • 85. Trialbank ‘Heparin therapy’ is an instance of ‘written or spoken designation of a concept’ What are the problems here? 1. misuse of quotation marks 2. confusion of instances and types 3. confusion of concept and reality 85
  • 86. Principle: beware of terminological baggage For the sake of interoperability with other ontologies, do not give special meanings to terms with established general meanings (Don’t use ‘cell’ when you mean ‘plant cell’) 86
  • 87. ICNP: International Classification of Nursing Procedures (old version) water =def. a type of Nursing Phenomenon of Physical Environment with the specific characteristics: clear liquid compound of hydrogen and oxygen that is essential for most plant and animal life influencing life and development of human beings. 87
  • 88. Principle of definitions Supply definitions for every term 1. human-understandable natural language definition 2. an equivalent formal definition 88
  • 89. Principle: definitions must be unique Each term should have exactly one definition it may have both natural-language and formal versions (issue with ontologies which exist with different levels of expressivity) 89
  • 90. The Problem of Circularity A Person =def. A person with an identity document Hemolysis =def. The causes of hemolysis 90
  • 91. Principle of non-circularity The term defined should not appear in its own definition 91
  • 92. Example: HL7 ‘stopping a medication’ = def. change of state in the record of a Substance Administration Act from Active to Aborted 92
  • 93. Principle of Increase in Understandability A definition should use only terms which are easier to understand than the term defined Definitions should not make simple things more difficult than they are 93
  • 94. Generalized Tarski principle (a good, general constraint on a theory of meaning) For each linguistic expression ‘E’ ‘E’ means E ‘snow’ means: snow ‘pneumonia’ means: pneumonia 94
  • 95. HL7 Reference Information Model ‘medication’ does not mean: medication rather it means: the record of medication in an information system ‘disease’ does not mean: disease rather it means: the observation of a disease 95
  • 96. Principle of Acknowledging Primitives In every ontology some terms and some relations are primitive = they cannot be defined (on pain of infinite regress) Examples of primitive relations: identity instance_of 96
  • 97. Principle of Aristotelian Definitions Use Aristotelian definitions An A is a B which C’s. A human being is an animal which is rational 97
  • 98. Rules for Formulating Terms Avoid abbreviations even when it is clear in context what they mean (‘breast’ for ‘breast tumor’) Avoid acronyms Avoid mass terms (‘tissue’, ‘brain mapping’, ‘clinical research’ ...) Treat each term ‘A’ in an ontology as shorthand for a term of the form ‘the type A’ 98
  • 99. Univocity Terms should have the same meanings on every occasion of use. (= They should refer to the same types) Basic ontological relations such as is_a and part_of should be used in the same way by all ontologies 99
  • 100. Universality Ontologies are made of relational assertions They should include only those which hold universally 100
  • 101. Universality Often, order will matter: We can assert adult transformation_of child but not child transforms_into adult 101
  • 102. Universality viral pneumonia caused by virus but not virus causes pneumonia pneumococcal virus causes pneumonia 102
  • 103. Principle of Universality results analysis later_than protocol-design but not protocol-design earlier_than results analysis 103
  • 104. Principle of Positivity Complements of types are not themselves types. Terms such as non-mammal non-membrane other metalworker in New Zealand do not designate types in reality 104
  • 105. Generalized Anti-Boolean Principle There are no conjunctive and disjunctive types: anatomic structure, system, or substance musculoskeletal and connective tissue disorder 105
  • 106. Objectivity Which types exist in reality is not a function of our knowledge. Terms such as unknown unclassified unlocalized arthropathies not otherwise specified do not designate types in reality. 106
  • 107. Keep Epistemology Separate from Ontology If you want to say that We do not know where A’s are located do not invent a new class of A’s with unknown locations (A well-constructed ontology should grow linearly; it should not need to delete classes or relations because of increases in knowledge) 107
  • 108. Keep Sentences Separate from Terms If you want to say I surmise that this is a case of pneumonia do not invent a new class of surmised pneumonias Confusion of ‘findings’ in medical terminologies 108
  • 109. Single Inheritance No kind in a classificatory hierarchy should be asserted to have more than one is_a parent on the immediate higher level 109
  • 110. Multiple Inheritance thing blue thing car is_a is_a blue car 110
  • 111. Multiple Inheritance is a source of errors encourages laziness serves as obstacle to integration with neighboring ontologies hampers use of Aristotelian methodology for defining terms hampers use of statistical search tools 111
  • 112. Multiple Inheritance thing blue thing car is_a1 is_a2 blue car 112
  • 113. Principle of asserted single inheritance Each reference ontology module should be built as an asserted monohierarchy (a hierarchy in which each term has at most one parent) Asserted hierarchy vs. inferred hierarchy 113
  • 114. Principle of normalization Polyhierarchies should be decomposable into homogeneous disjoint monohierarchies 114
  • 115. Principle of instantiability A term should be included in an ontology only if there is evidence that instances to which that term refers exist or have existed or can exist in reality. Fist Crowd 115
  • 116. Avoid mass nouns Count nouns = an organism, a planet, a handshake Mass nouns = tissue, information, discourse Mass nouns almost always go hand in hand with ontological confusion 116
  • 117. is_a Overloading The success of ontology alignment demands that ontological relations (is_a, part_of, ...) have the same meanings in the different ontologies to be aligned. 117
  • 118. Multiple Inheritance thing blue thing car is_a1 is_a2 blue car 118
  • 119. How to solve this problem Create two ontologies: of cars of colors Link the two together via cross-products (= factoring, normalization, modularization) 119
  • 120. Compositionality The meanings of compound terms should be determined 1. by the meanings of component terms together with 2. the rules governing syntax 120
  • 121. User feedback principle An ontology should evolve on the basis of feedback derived from those who are using the ontology for example for purposes in annotation. 121
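
Several of the principles in the transcript above (non-circularity, asserted single inheritance, positivity, objectivity) lend themselves to mechanical checks. The following is a minimal sketch, not a tool used in the tutorial: the toy `ONTOLOGY` dictionary, the function names and the list of suspect prefixes are all assumptions made for illustration, and only the 'blue car', 'person', 'non-mammal' and 'unlocalized' examples echo the slides.

```python
import re

# Hypothetical toy ontology: term -> (asserted is_a parents, definition).
# The structure and most definitions are illustrative assumptions.
ONTOLOGY = {
    "thing": ([], ""),  # root term, treated here as primitive (no definition)
    "car": (["thing"], "a thing which is a motor vehicle with wheels"),
    "blue thing": (["thing"], "a thing which is blue"),
    "blue car": (["car", "blue thing"], "a car which is blue"),
    "person": (["thing"], "a person with an identity document"),
    "non-mammal": (["thing"], "a thing which is not a mammal"),
    "unlocalized ligand": (["thing"], "a ligand whose location is unknown"),
}

# Label prefixes that suggest a complement of a type or a state of our
# knowledge rather than a type in reality (assumed list, not exhaustive).
SUSPECT_PREFIXES = ("non-", "unknown", "unclassified", "unlocalized", "other ")


def multiple_parents(ontology):
    """Terms asserted to have more than one is_a parent."""
    return sorted(t for t, (parents, _) in ontology.items() if len(parents) > 1)


def circular_definitions(ontology):
    """Terms whose own label appears as a whole phrase in their definition."""
    hits = []
    for term, (_, definition) in ontology.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, definition, flags=re.IGNORECASE):
            hits.append(term)
    return sorted(hits)


def suspect_labels(ontology):
    """Terms whose label starts with a negative or epistemic prefix."""
    return sorted(t for t in ontology if t.lower().startswith(SUSPECT_PREFIXES))


if __name__ == "__main__":
    print("multiple is_a parents:", multiple_parents(ONTOLOGY))
    print("circular definitions:", circular_definitions(ONTOLOGY))
    print("negative or epistemic labels:", suspect_labels(ONTOLOGY))
```

A label-based check like this only flags candidates for human review; it cannot decide whether a flagged term really fails to designate a type.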

Editor's notes

  1. dir.niehs.nih.gov/microarray/datamining/
  2. dir.niehs.nih.gov/microarray/datamining/
  3. http://www.ags.gov.ab.ca/GRAPHICS/uranium/athabasca_group_map_with_legend.jpg
  4. dir.niehs.nih.gov/microarray/datamining/
  5. http://www.ags.gov.ab.ca/GRAPHICS/uranium/athabasca_group_map_with_legend.jpg
  6. with thanks to Bill Hogan
  7. (with thanks to Bill Hogan)
  8. http://yuri.lbl.gov/amigo/ct?
  9. Problem example: ‘chromosome’ in Sequence Ontology and in Cell Component Ontology means different things Current solution: two distinct terms involved (qualified by respective namespace)
  10. There is no species called ‘non-rabbit’
  11. There is no biological species: unknown rabbit. See discussion below.
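
Note 9 describes the same label, 'chromosome', carrying different meanings in two ontologies, with the namespace used to keep the two terms apart. A rough sketch of that idea follows, assuming a hypothetical list of `(namespace, label, definition)` records; the namespace strings and placeholder definitions are invented for illustration and are not taken from either ontology.

```python
from collections import defaultdict

# Hypothetical term records: (namespace, label, definition).
# Definitions are placeholders, not the ontologies' actual wording.
TERMS = [
    ("sequence_ontology", "chromosome", "placeholder: sequence-level reading"),
    ("cell_component_ontology", "chromosome", "placeholder: cellular-structure reading"),
    ("cell_component_ontology", "cell membrane", "placeholder definition"),
]


def ambiguous_labels(terms):
    """Map each label used in more than one namespace to those namespaces."""
    namespaces = defaultdict(set)
    for namespace, label, _ in terms:
        namespaces[label].add(namespace)
    return {label: sorted(ns) for label, ns in namespaces.items() if len(ns) > 1}


def resolve(terms, namespace, label):
    """Return the single definition for a namespace-qualified label."""
    matches = [d for ns, lbl, d in terms if ns == namespace and lbl == label]
    if len(matches) != 1:
        raise KeyError(f"{namespace}:{label} has {len(matches)} definitions")
    return matches[0]


if __name__ == "__main__":
    print(ambiguous_labels(TERMS))
    print(resolve(TERMS, "sequence_ontology", "chromosome"))
```

Requiring a namespace in `resolve` keeps each qualified term univocal, while `ambiguous_labels` lists the labels that would clash if the ontologies were merged without qualification.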