- Ontology engineering aims to apply principles of engineering such as predictability, reproducibility, and strict semantics to the development of ontologies.
- Current ontology development relies heavily on craft and individual expertise rather than established engineering processes.
- For ontology engineering to be established, methods are needed to standardize development practices, evaluate ontologies, and demonstrate that independent groups can engineer ontologies to meet requirements in a consistent manner. The field is still in its early stages of applying engineering rigor.
1. Can there be such a thing as Ontology Engineering?
Robert Stevens
BioHealth Informatics Group
University of Manchester
2. Introduction
A bit of ontology introduction if required;
What is engineering?
Predictability in ontology engineering
The application of deterministic principles
The role of strict semantics
The role of philosophy
Acquiring some level of reproducibility.
3. A World of Instances
The world (of information) is made up of things and lots of them
Instances, individuals, objects, tokens, particulars.
The Earth is a kind of Planet
Robert Stevens (NE 67 41 58 A) is a Person
All the individual Alpha Haemoglobins in my many Instances of Red Blood
Cell
Each cell instance in my Body has copies of some 30,000 Genes
A Word, language, idea, etc.
This Table, those Chairs,
Any Thing with “A”, “The”, “That”, etc. before it….
4. We Put things into Categories
All these instances hang about making our world
Putting these things into categories is a fundamental part of human
cognition
Psychologists study this as concept formation
The same instances are put into a category
The capitalised and italicised words in the slide before last
5. We have Labels for the Categories and their Instances
We label categories with symbols: Words
“Lion” is a category of big cat with big teeth
Gene, Protein, Cell, Person, Hydrolase Activity, etc.
…and, as we’ve already seen, each category can have many labels and
any particular label can refer to more than one category
Semantic Heterogeneity
“A lion” is an instance in that category
Does the category “Lion” exist?
Lions exist, but the category could just be a human way of talking about
lions
… we like putting things into categories
6. A Controlled Vocabulary
A specified set of words and phrases for the
categories in which we place instances
Natural language definitions for those words and
phrases
A glossary defines, but doesn’t control
The Uniprot keywords define and control
Control is placed upon which labels are used to
represent the categories (concepts) we’ve used to
describe the instances in the world
…, but there is nothing about how things in these
categories are related
Biopolymer
DNA
Enzyme
Nucleic acid
mRNA
Polypeptide
snRNA
tRNA
7. We also like to Relate Things Together
Categories have subcategories
Instances in one category can be related
in some way to instances in another
Can relate instances to each other in
many different ways
Is-a, part-of, develops-from, etc.
We can use these relationships to classify
categories
Things in category A are part is
If all instances in category A are also in
category B then As are kinds of Bs
Biopolymer
  Nucleic Acid
    DNA
    RNA
      tRNA
      mRNA
      snRNA
  Polypeptide
    Enzyme
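The "all instances in A are also in B" rule can be sketched in Python by treating each category as a plain set of instances. This is only an illustration: the instance names are invented, and a real ontology asserts the relationship for all possible instances rather than checking the ones that happen to be listed.

```python
# Toy extensional view: a category is just the set of its known instances.
# All instance names below are made up for illustration.
categories = {
    "Biopolymer": {"dna_1", "trna_1", "mrna_1", "enzyme_1"},
    "Nucleic Acid": {"dna_1", "trna_1", "mrna_1"},
    "RNA": {"trna_1", "mrna_1"},
    "Polypeptide": {"enzyme_1"},
}


def is_kind_of(sub: str, sup: str) -> bool:
    """True if every listed instance of `sub` is also an instance of `sup`."""
    return categories[sub] <= categories[sup]


print(is_kind_of("RNA", "Nucleic Acid"))  # True
print(is_kind_of("Polypeptide", "RNA"))   # False
```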
9. Describing Category Membership
We can make conditions that any instance must fulfil in order to be a
member of a particular category
A Phosphatase must have a phosphatase catalytic domain
A Receptor must have a transmembrane domain
A codon has three nucleotide residues
A limb has part that is a joint
A man has a Y chromosome and an X chromosome
A woman has only an X chromosome
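A minimal sketch of such membership conditions, assuming an individual is described only by the parts it has. The `Individual` class and the `may_belong_to` helper are hypothetical names for this example. Passing a necessary condition means an individual is not ruled out. It does not prove membership.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    """A hypothetical individual described by the parts it has."""
    name: str
    parts: set = field(default_factory=set)


# Necessary conditions from the slide, phrased as "must have part X".
NECESSARY_PARTS = {
    "Phosphatase": {"phosphatase catalytic domain"},
    "Receptor": {"transmembrane domain"},
}


def may_belong_to(ind: Individual, category: str) -> bool:
    """True if `ind` fulfils every necessary condition of `category`."""
    return NECESSARY_PARTS[category] <= ind.parts


protein = Individual("protein_1", {"transmembrane domain", "signal peptide"})
print(may_belong_to(protein, "Receptor"))     # True
print(may_belong_to(protein, "Phosphatase"))  # False
```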
10. Relationships
These conditions are made from a property and a successor relationship
isPartOf, hasPart
isDerivedFrom
DevelopsFrom
isHomologousTo
…and many, many more
11. A Structured Controlled Vocabulary
Not only can we agree on the
labels we give categories
Can also agree on how the
instances of categories are
related
And agree on the labels we give
the relations
Structure aids querying and
captures knowledge with greater
fidelity
Biopolymer
  Nucleic Acid
    DNA
    RNA
      tRNA
      mRNA
      snRNA
  Polypeptide
    Enzyme
Gene
Relations: regionOf, transcribedFrom, translatedFrom
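To illustrate how structure aids querying, here is a small Python sketch over the is-a hierarchy from the diagram. The exact parent of each term is an assumption read off the slide layout, and the helper names `ancestors` and `descendants` are invented for this example.

```python
# Assumed is-a edges (child -> parent), read off the slide's diagram.
IS_A = {
    "Nucleic Acid": "Biopolymer",
    "Polypeptide": "Biopolymer",
    "Enzyme": "Polypeptide",
    "DNA": "Nucleic Acid",
    "RNA": "Nucleic Acid",
    "tRNA": "RNA",
    "mRNA": "RNA",
    "snRNA": "RNA",
}


def ancestors(term: str) -> list:
    """Walk up the is-a chain from `term` to the root."""
    out = []
    while term in IS_A:
        term = IS_A[term]
        out.append(term)
    return out


def descendants(term: str) -> list:
    """Every term that has `term` somewhere above it, sorted by name."""
    return sorted(t for t in IS_A if term in ancestors(t))


print(ancestors("mRNA"))            # ['RNA', 'Nucleic Acid', 'Biopolymer']
print(descendants("Nucleic Acid"))  # ['DNA', 'RNA', 'mRNA', 'snRNA', 'tRNA']
```

A flat controlled vocabulary cannot answer "give me everything that is a nucleic acid". The structured one can, because the is-a links are explicit.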
12. Manchester Mercury
January 1st 1754
Executed 18
Found Dead 34
Frighted 2
Kill'd by falls and other accidents 55
Kill'd themselves 36
Murdered 3
Overlaid 40
Poisoned 1
Scalded 5
Smothered 1
Stabbed 1
Starved 7
Suffocated 5
Aged 1456
Consumption 3915
Convulsion 5977
Dropsy 794
Fevers 2292
Smallpox 774
Teeth 961
Bit by mad dogs 3
Broken Limbs 5
Bruised 5
Burnt 9
Drowned 86
Excessive Drinking 15
List of diseases &
casualties this year
19276 burials
15444 christenings
Deaths by centile
14. What is engineering?
American Engineers' Council for Professional
Development defines "engineering" as:
“The creative application of scientific principles to design
or develop structures, machines, apparatus, or
manufacturing processes, or works utilizing them singly
or in combination; or to construct or operate the same
with full cognizance of their design; or to forecast their
behavior under specific operating conditions; all as
respects an intended function, economics of operation
and safety to life and property.”
Taken from http://en.wikipedia.org/wiki/Engineering
15. What Type of Artefact? The Rise of the Computer Science Ontology
A term borrowed from philosophy
Not supposed to be the same thing, but…
Meant to deliver formal, computational semantics to
applications and humans
Necessarily involves consensus
17. Where are we in the Development of Ontology Engineering?
At about 1975…
There’s a lot of craft involved;
Too much reliance on gurus
Could two independent sets of ontologists develop two
ontologies for the same domain with the same utility?
Can we cost ontology building?
Do we know when we have success?
19. Something a bit more agile
06/27/14
Requirements, scoping,
Competency questions
Knowledge acquisition
Conceptualisation, pattern forming
Axiomatization
Testing / evaluation?
Repeated, small iterations
Users always involved
20. Four Broad Areas of Ontology Engineering
1. Technical aspects: Code repositories, issue trackers,
editors, and so on
2. Coding styles and naming conventions, etc.
3. Choosing a class, placing it in a hierarchy and choosing
relationships and entities by which it is described.
4. The rhetoric behind how (2) and (3) are done. One can
have philosophical justification for any decision, or it can
just be practically useful….
21. Getting the Requirements Right
Truth and beauty is an easy requirement to state
Just model the world as it is and all else will flow from this;
Not necessarily helpful;
Have to set a scope;
Have to set priorities – what do we most need to represent?
Competency questions – what do I need to be able to answer?
Separating “what the ontology must answer” and “what the ontology
must enable to be answered”;
Requirements change; keeping it “agile”
Setting priorities.
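One way to make competency questions concrete is to write them as checks that can be re-run at every iteration. The sketch below is an assumption about how that might look, not a method from these slides. The statements, the questions and the `unmet_questions` helper are all hypothetical.

```python
# Toy ontology as (subject, relation, object) statements. All hypothetical.
STATEMENTS = {
    ("Kidney Cell", "is_a", "Cell"),
    ("Kidney Cell", "part_of", "Kidney"),
    ("Kidney", "is_a", "Organ"),
}

# Each competency question: (question text, statement to look up, expected answer).
COMPETENCY_QUESTIONS = [
    ("Is a kidney cell a kind of cell?",
     ("Kidney Cell", "is_a", "Cell"), True),
    ("Is a kidney cell part of the kidney?",
     ("Kidney Cell", "part_of", "Kidney"), True),
    ("Is the kidney part of the urinary system?",
     ("Kidney", "part_of", "Urinary System"), True),
]


def unmet_questions() -> list:
    """Questions whose answer from the ontology differs from the expected one."""
    return [
        question
        for question, statement, expected in COMPETENCY_QUESTIONS
        if (statement in STATEMENTS) != expected
    ]


print(unmet_questions())  # ['Is the kidney part of the urinary system?']
```

The unmet question shows a gap between the requirements and the current ontology, which is what sets the priorities for the next iteration.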
22. Strict Semantics
Languages such as OWL have a strict semantics;
Statements have a precise and interpretable meaning;
Deductions can follow from a series of statements;
Can be used to aid development and use of the ontology
23. Correct, but Wrong…
An automated reasoner for OWL can make sure all your
axioms are coherent;
One can make sure the ontology is structurally robust
The statements in the ontology can still be rubbish
though…
A strict semantics lends some kind of predictability to an
ontology;
A pure description logic approach of all defined classes
has some appeal…
24. Total Definition
In OWL a defined class can find its own place in the hierarchy
A parent is any person that has a child;
A mother is any woman that has a child;
As a woman is a kind of person, we can infer a mother to be a kind
of parent;
Do this for all classes; press the button and you have an ontology
Definition is hard (but that may be a good thing) and the tools may
be lacking
Requires discipline from the authors
…and it all grounds out to a primitive somewhere along the line…
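The parent/mother inference can be imitated with a very small structural check in Python. This is a drastic simplification under stated assumptions, not how an OWL reasoner works: each defined class is reduced to a genus plus a set of (property, filler) conditions, and the names `DEFINED`, `PRIMITIVE_IS_A` and `inferred_subclass` are invented for this sketch.

```python
# Asserted primitive hierarchy (child -> parent). This is where it "grounds out".
PRIMITIVE_IS_A = {"Woman": "Person"}

# Defined classes: name -> (genus, set of (property, filler) conditions).
DEFINED = {
    "Parent": ("Person", {("hasChild", "Person")}),
    "Mother": ("Woman", {("hasChild", "Person")}),
}


def primitive_subsumes(sup: str, sub: str) -> bool:
    """True if `sub` equals `sup` or sits below it in the primitive tree."""
    while sub is not None:
        if sub == sup:
            return True
        sub = PRIMITIVE_IS_A.get(sub)
    return False


def inferred_subclass(sub: str, sup: str) -> bool:
    """`sub` is a kind of `sup` if its genus is at least as specific
    and it carries every condition that `sup` requires."""
    sub_genus, sub_conds = DEFINED[sub]
    sup_genus, sup_conds = DEFINED[sup]
    return primitive_subsumes(sup_genus, sub_genus) and sup_conds <= sub_conds


print(inferred_subclass("Mother", "Parent"))  # True
print(inferred_subclass("Parent", "Mother"))  # False
```

Nobody asserted that a mother is a parent. It follows from the two definitions and the single primitive statement that a woman is a person.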
25. Normalisation
An “engineering” method to manage polyhierarchies in
ontology through reasoning;
Make a strict tree of primitive classes using one criterion;
Put all other criteria as restrictions upon those classes;
Re-establish the polyhierarchy through defined classes
with the “other” criteria….
http://ontogenesis.knowledgeblog.org/49
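The normalisation steps can be sketched in Python as follows. This is a toy under stated assumptions: the classes, the `hasFunction` property and the helper names are hypothetical, restrictions are not inherited down the tree, and the matching is a plain set comparison rather than description logic reasoning.

```python
# Step 1: a strict tree of primitive classes built on one criterion
# (here, what kind of molecule something is). Child -> single parent.
PRIMITIVE_PARENT = {
    "Protein": "Molecule",
    "RNA": "Molecule",
    "Trypsin": "Protein",
    "Insulin": "Protein",
    "Ribozyme": "RNA",
}

# Step 2: every other criterion is a restriction on those classes.
RESTRICTIONS = {
    "Trypsin": {("hasFunction", "Catalysis")},
    "Insulin": {("hasFunction", "Signalling")},
    "Ribozyme": {("hasFunction", "Catalysis")},
}

# Step 3: defined classes that use the "other" criteria.
DEFINED = {
    "Catalyst": {("hasFunction", "Catalysis")},
    "Signal": {("hasFunction", "Signalling")},
}


def inferred_parents(cls: str) -> list:
    """Defined classes whose conditions are all met by `cls`."""
    conds = RESTRICTIONS.get(cls, set())
    return sorted(name for name, needed in DEFINED.items() if needed <= conds)


def all_parents(cls: str) -> list:
    """The one asserted parent plus any inferred ones: the polyhierarchy."""
    asserted = [PRIMITIVE_PARENT[cls]] if cls in PRIMITIVE_PARENT else []
    return asserted + inferred_parents(cls)


print(all_parents("Trypsin"))   # ['Protein', 'Catalyst']
print(all_parents("Ribozyme"))  # ['RNA', 'Catalyst']
```

Authors maintain only the tree and the restrictions. The multiple parents are recomputed, so they cannot drift out of step with the definitions.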
26. Authoring Tools
These are really just axiom editors
Support for the surrounding processes is nascent
Lots of “hand-crafting” of even large ontologies
Knowledge gathering tools; organising tools; axiom
generation tools; checking and validation tools; …
28. Patterns and Components
Software Design Patterns: Accepted design solutions to
common problems;
Application building at the level of components;
Design pattern analogy in ontologies;
Patterns or regularities that are not ODP;
Ontologies tend to be repetitious and humans tend to be
bad at repetition – tedium kicks in….
Calls for automation
31. Ontology Pre-Processor Language
OPPL Script:
?cell:CLASS,
?anatomyPart:CLASS,
?anatomy:CLASS = (CL:0000000 part_of some ?anatomyPart)
BEGIN
ADD ?cell equivalentTo ?anatomy
END;
Pattern: A cell type is equivalent to a cell type that is part of some anatomy
Variable mapper:
?cell -> ‘Kidney Cell’[CL:0003523]
?anatomyPart -> ‘Kidney’[FMA:629093]
32. Resulting OWL axioms
Generated OWL (Manchester Syntax):
Class: CL:0003523
    Annotations:
        rdfs:label "Kidney Cell"
    EquivalentTo:
        CL:0000000 and OBO_REL:part_of some FMA:629093
Example: A ‘Kidney Cell’ is equivalent to a cell that is part of the ‘Kidney’
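The idea of filling one pattern from a table of variable bindings can be imitated with ordinary string templating in Python. This is not OPPL and does not use any OWL library. It is a sketch that assumes the output is just Manchester Syntax text. The first row uses the identifiers from the slide, and the second row uses placeholder `EX:` identifiers that are purely hypothetical.

```python
# One frame template, mirroring the pattern on the slide.
TEMPLATE = (
    "Class: {cell_id}\n"
    "    Annotations:\n"
    '        rdfs:label "{cell_label}"\n'
    "    EquivalentTo:\n"
    "        CL:0000000 and OBO_REL:part_of some {anatomy_id}\n"
)

# Variable bindings, one dict per class to generate.
ROWS = [
    # Identifiers taken from the slide.
    {"cell_id": "CL:0003523", "cell_label": "Kidney Cell",
     "anatomy_id": "FMA:629093"},
    # Placeholder identifiers, invented for illustration only.
    {"cell_id": "EX:0000001", "cell_label": "Example Cell",
     "anatomy_id": "EX:0000002"},
]


def expand(rows: list) -> list:
    """Apply the same pattern to every row, so every frame has the same shape."""
    return [TEMPLATE.format(**row) for row in rows]


for frame in expand(ROWS):
    print(frame)
```

Changing the pattern means editing `TEMPLATE` once and regenerating, rather than hand-editing every class.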
33. Automation
Moving from hand-crafting to production line
Can try things out and then re-model (as long as the
entities involved don’t change)
Documents what has been done;
Ruthlessly consistent;
Also need support in repetitious knowledge gathering as
well as axiom generation.
34. Populous
Generic tool for populating ontology templates
Spreadsheet style interface
Supports validation at the point of data entry
Expressive Pattern language for OWL Ontology generation
http://www.e-lico.eu/populous
35. Evaluation
A big “can of worms”
Closely linked to requirements
Closely linked to what one believes an ontology to be…;
“Just do what I say and it will be OK” isn’t an evaluation
strategy;
Nor is saying “just model reality” and that’s all you need
to evaluate;
No really convincing way of doing it.
36. The Role of philosophy
Biology
Computer Science
Philosophy
39. Can we have Ontology Engineering?
Probably, but you’ll have to wait;
Not much predictability, except to say “it’s hard” and “people will
disagree with you”
So, much like software engineering;
Much to learn from SE and it should be quicker;
Programming is not software engineering
Axiom authoring is not ontology engineering;
At the moment we’re writing axioms, but realise we need to
engineer;
Once we can demonstrate, with predictability, that two independent
groups can take a method and each produce an ontology that
meets some needs then I’ll begin to relax.