2. Overview of this talk
• Semantic Web: the digital heritage case
• Knowledge-engineering principles
• Challenges for Web KE
3. My journey
knowledge engineering
• design patterns for
problem solving
• methodology for
knowledge systems
• models of domain
knowledge
• ontology
engineering
14. The myth of a unified vocabulary
• In large virtual collections there are always multiple
vocabularies
– In multiple languages
• Every vocabulary has its own perspective
– You can’t just merge them
• But you can use vocabularies jointly by defining a
limited set of links
– “Vocabulary alignment”
• It is surprising what you can do with just a few links
15. Example use of vocabulary
alignment
• Search term: “Tokugawa”
• SVCN period: Edo
– SVCN is a local in-house ethnology thesaurus
• AAT style/period: Edo (Japanese period); Tokugawa
– AAT is Getty’s Art & Architecture Thesaurus
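A minimal sketch of how one alignment link could let a search for “Tokugawa” reach the SVCN concept “Edo”. The concept identifiers (`aat:edo`, `svcn:edo`), the dictionary layout, and the reading of the link as an equivalence are assumptions made for illustration; only the labels come from the slide.

```python
# Labels per concept. The identifiers are hypothetical; the labels are from the slide.
LABELS = {
    "aat:edo": {"Edo (Japanese period)", "Tokugawa"},
    "svcn:edo": {"Edo"},
}

# One alignment link between the two vocabularies (assumed to mean equivalence).
ALIGNMENT = {("svcn:edo", "aat:edo")}


def concepts_for(term):
    """Return concepts with a label equal to `term`, plus concepts aligned to them."""
    term = term.casefold()
    hits = {
        concept
        for concept, labels in LABELS.items()
        if any(term == label.casefold() for label in labels)
    }
    # Follow each alignment link once, in either direction.
    for left, right in ALIGNMENT:
        if left in hits or right in hits:
            hits |= {left, right}
    return hits


print(sorted(concepts_for("Tokugawa")))  # ['aat:edo', 'svcn:edo']
```

Neither vocabulary is changed or merged here: “Tokugawa” is only a label in AAT, and the single link is what makes objects annotated with SVCN “Edo” findable.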
24. Principle 1: Be modest!
• Ontology engineers should refrain from
developing their own idiosyncratic
ontologies
• Instead, they should make the existing
rich vocabularies, thesauri and
databases available in an interoperable
(web) format
• Initially, only add the originally intended
semantics
25. Principle 2: Think large!
"Once you have a truly massive amount of
information integrated as knowledge, then the
human-software system will be superhuman, in
the same sense that mankind with writing is
superhuman compared to mankind before
writing."
Doug Lenat
26. Principle 3: Develop and use
patterns!
• Don’t try to be (too) creative
• Ontology engineering should not be an
art but a discipline
• Patterns play a key role in methodology
for ontology engineering
• See for example patterns developed by
the W3C Semantic Web Best Practices
group
http://www.w3.org/2001/sw/BestPractices/
27. Principle 4: Don’t recreate, but
enrich and align
• Techniques:
– Learning ontology relations/mappings
– Semantic analysis, e.g. OntoClean
– Processing of scope notes in thesauri
29. Principle 6: Writing in an ontology
language doesn’t make it an ontology!
• An ontology is a vehicle for sharing
• Papers about your own idiosyncratic
“university ontology” should be rejected
at conferences
• The quality of an ontology does not
depend on the number of, for example,
OWL constructs used
30. Principle 7: Required level of formal
semantics depends on the domain!
• In our semantic search we use three
OWL constructs:
– owl:sameAs, owl:TransitiveProperty,
owl:SymmetricProperty
• But cultural heritage is very different
from medicine and bioinformatics
– Don’t over-generalize on requirements for
e.g. OWL
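To make the three constructs concrete, here is a rough sketch of the inferences they license over a small set of triples. It is not the search engine described in the talk: the `infer` function, the `ex:` properties, and the sample data are hypothetical, and the sketch covers only the rules for these three constructs.

```python
SAME = "owl:sameAs"


def infer(triples, transitive=frozenset(), symmetric=frozenset()):
    """Forward-chain until no new triples appear.

    `transitive` and `symmetric` hold the property names assumed to be declared
    as owl:TransitiveProperty and owl:SymmetricProperty respectively.
    """
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            # owl:sameAs and symmetric properties hold in both directions.
            if p == SAME or p in symmetric:
                new.add((o, p, s))
            # owl:sameAs and transitive properties chain: s p o, o p x => s p x.
            if p == SAME or p in transitive:
                new.update((s, p, o2) for s2, p2, o2 in facts if p2 == p and s2 == o)
            # owl:sameAs: statements about s also hold for o.
            if p == SAME:
                new.update((o, p2, o2) for s2, p2, o2 in facts if s2 == s and p2 != SAME)
                new.update((s2, p2, o) for s2, p2, o2 in facts if o2 == s and p2 != SAME)
        if new <= facts:
            return facts
        facts |= new


# Hypothetical data: identifiers and properties are made up for illustration.
TRIPLES = [
    ("svcn:edo", SAME, "aat:edo"),
    ("aat:edo", "ex:broader", "ex:japanese-periods"),
    ("ex:japanese-periods", "ex:broader", "ex:asian-periods"),
    ("ex:object1", "ex:period", "svcn:edo"),
    ("ex:styleA", "ex:related", "ex:styleB"),
]

facts = infer(TRIPLES, transitive={"ex:broader"}, symmetric={"ex:related"})
```

After inference, `ex:object1` is also linked to `aat:edo`, and `svcn:edo` inherits the broader chain up to `ex:asian-periods`, so a search on either vocabulary reaches the same objects.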
35. Challenge: vocabulary
alignment methodology
• Multitude of alignment techniques
available
– Direct syntactic match
– Lexical manipulation
– Structured, ….
• Precision & recall vary
• Large evaluation initiative
– OAEI http://oaei.ontologymatching.org/
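The first two techniques, and the way precision and recall shift between them, can be sketched as below. The toy vocabularies, the reference alignment, and the function names are invented for illustration; this is not the OAEI evaluation procedure, and the normalization rules are just one possible choice of lexical manipulation.

```python
import re


def normalize(label):
    """Lexical manipulation: lowercase, drop parenthetical qualifiers and punctuation."""
    label = re.sub(r"\([^)]*\)", " ", label.casefold())
    label = re.sub(r"[^\w\s]", " ", label)
    return " ".join(label.split())


def align(source, target, key=lambda label: label):
    """Link concepts whose labels are equal after applying `key` to both sides."""
    index = {}
    for concept, labels in target.items():
        for label in labels:
            index.setdefault(key(label), set()).add(concept)
    return {
        (s, t)
        for s, labels in source.items()
        for label in labels
        for t in index.get(key(label), ())
    }


def precision_recall(found, reference):
    """Precision and recall of `found` links against a reference alignment."""
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Toy data, made up for this sketch.
source = {"svcn:edo": ["Edo"], "svcn:mask": ["masks"]}
target = {
    "aat:edo": ["Edo (Japanese period)", "Tokugawa"],
    "aat:edo-city": ["Edo (city)"],
    "aat:masks": ["Masks"],
}
reference = {("svcn:edo", "aat:edo"), ("svcn:mask", "aat:masks")}

exact = align(source, target)                    # direct syntactic match
lexical = align(source, target, key=normalize)   # match after lexical manipulation

print(precision_recall(exact, reference))
print(precision_recall(lexical, reference))
```

On this toy data the direct match finds nothing, while the normalized match recovers both reference links but also pairs “Edo” with the city concept, so recall rises at the cost of precision.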
36. Limitations of categorical
thinking
• The set theory on which ontology languages are
built is inadequate for modelling how people
think about categories (Lakoff)
– Category boundaries are not hard: cf. art styles
– People think of prototypes; some examples are
very prototypical, others less
• We also need to make meta-distinctions explicit
– organizing class: “furniture”
– base-level class: “chair”
– domain-specific: “Windsor chair”
41. Challenge: data trust issues
• How can a museum trust annotations of
outsiders?
• Need to adapt techniques from closed
world to open world
• Ongoing case studies examine reputation
assessment, use of probability theories,
….
47. We need to study the Web as a
phenomenon
• Web dynamics
• Collective intelligence
• Privacy, trust and
security
• Linked open data
• Universal access
49. Acknowledgements
• Long list of people
• Projects: MIA, MultimediaN E-Culture,
CHOICE, MunCH, CHIP, Agora,
PrestoPrime, NoTube,
EuropeanaConnect, Poseidon