SlideShare une entreprise Scribd logo
1  sur  54
Télécharger pour lire hors ligne
Knowledge engineering
and the Web
Guus Schreiber
VU University Amsterdam
Computer Science, Network Institute
Overview of this talk
•  Web data representation
– a meta view
•  Knowledge for the Web: categories
– key sources, alignment, conversion
•  Using knowledge: visualization and
search
My journey
knowledge engineering
•  design patterns for
problem solving
•  methodology for
knowledge systems
•  models of domain
knowledge
•  ontology
engineering
My journey
access to digital heritage
My journey
Web standards
Chair of
•  RDF 1.1
•  OWL Web Ontology Language 1.0
•  SKOS model for publishing vocabularies
on the Web
•  Deployment & best practices
A few words about
Web standardization
•  Key success factor!
•  Consensus process actually works
– Some of the time at least
•  Public review
– Taking every comment seriously
•  The danger of over-designing
– Principle of minimal commitment
Example: W3C RDF 1.1 group
•  8K group messages (publicly visible)
•  2K messages about external comments
•  125+ teleconferences
•  200 issues resolved
WEB DATA REPRESENTATION
Caution
•  Representation languages are there for
you
•  And not the other way around ….
HTML5: a leap forward
Rationale
•  Consistent separation
of content and
presentation
•  Semantics of the
structure of
information
Typical new elements
<article>!
<section>!
<aside>!
<header>
<footer>!
RDF: triples and graphs
RDF is simply labeling resources and links
RDF: multiple graphs
www.example.org/bob
RDF syntaxes
•  Human-readable: Turtle/TriG
•  Line-based: N-Triples/N-Quads
•  HTML embedding: RDFa
•  JSON-based: JSON-LD
•  XML-based: RDF XML
Data modeling on the Web
RDF
•  Class hierarchy
•  Property hierarchy
•  Domain and range
restrictions
•  Data types
•  OWL
•  Property characteristics
–  E.g., inverse, functional,
transitive, …..
•  Identify management
–  E.g., same as, equivalent
class
•  ……..
I prefer a pick-and-choose approach
Writing in an ontology language
does not make it an ontology!
•  Ontology is vehicle for sharing
•  Papers about your own idiosyncratic
“university ontology” should be rejected
at conferences
•  The quality of an ontology does not
depend on the number of, for example,
OWL constructs used
Rationale
•  A vocabulary represents
distilled knowledge of a
community
•  Typically product of a
consensus process over
longer period of time
Use
•  200+ vocabularies
published
•  E.g.: Library of
Congress Subject
Headings
•  Mainly in library field
SKOS: making existing
vocabularies Web accessible
The strength of SKOS lies
in its simplicity
Baker et al: Key choices
in the design of SKOS
Beware of ontological over-
commitment
•  We have the understandable tendency
to use semantic modeling constructs
whenever we can
•  Better is to limit any Web model to the
absolute minimum
KNOWLEDGE ON THE WEB:
CATEGORIES
The	
  concept	
  triad
	
  
Source: http://www.jamesodell.com/Ontology_White_Paper_2011-07-15.pdf.
Categorization
•  OWL (Description logic) takes an
extensional view of classes
– A set is completely defined by its members
•  This puts the emphasis on specifying
class boundaries
•  Work of Rosch et al. takes a different
view
21
Categories (Rosch)
•  Help us to organize the world
•  Tools for perception
•  Basic-level categories
– Are the prime categories used by people
– Have the highest number of common and
distinctive attributes
– What those basic-level categories are may
depend on context
22
Basic-level categories
23
24
FOAF: Friend of a Friend
25
Dublin Core: metadata of Web resources
Iconclass
categorizing image scene
schema.org
categories for TV programs
schema.org issues
•  Top-down versus bottom-up
•  Ownership and control
•  Who can update/extend?
•  Does use for general search bias the
vocabulary?
The myth of a unified
vocabulary
•  In large virtual collections there are always
multiple vocabularies
–  In multiple languages
•  Every vocabulary has its own perspective
–  You can’t just merge them
•  But you can use vocabularies jointly by
defining a limited set of links
–  “Vocabulary alignment”
Category alignment vs.
identity disambiguation
•  Alignment concerns finding links
between (similar) categories, which
typically have no identity in the real world
•  Identity disambiguation is finding out
whether two or more IDs point to the
same object in the real world (e.g.,
person, building, ship)
•  The distinction is more subtle that “class
versus instance”
Alignment techniques
•  Syntax: comparison of characters of the terms
–  Measures of syntactic distance
–  Language processing
•  E.g. Tokenization, single/plural,
•  Relate to lexical resource
–  Relate terms to place in WordNet hierarchy
•  Taxonomy comparison
–  Look for common parents/children in taxonomy
•  Instance based mapping
–  Two classes are similar if their instances are similar.
Alignment evaluation
Limitations of categorical
thinking
Be modest! Don’t recreate,
but enrich and align
•  Knowledge engineers should refrain
from developing their own idiosyncratic
ontologies
•  Instead, they should make the available
rich vocabularies, thesauri and
databases available in an interoperable
(web) format
•  Techniques: learning, alignment
35	
  
Issues	
  in	
  conversion	
  to	
  Linked	
  Data	
  
•  Keep	
  original	
  info	
  as	
  much	
  as	
  possible	
  
•  Ensure	
  standard	
  datatypes	
  for	
  value	
  
conversion	
  
•  Choice	
  of	
  concept	
  URL	
  
•  Vocabulary	
  URL:	
  our	
  strategy	
  
–  Large	
  organizaCons:	
  use	
  their	
  own	
  
–  Small	
  organizaCons:	
  purl.org	
  URL	
  	
  
36	
  
Winner EuroITV Competition
Best Archives on the Web Award
http://waisda.nl
38	
  
http://waisda.nl
http://www.prestoprime.org/
@waisda
Improve video search for (1) fragment
retrieval & (2) within video navigation
by using crowdsourcing for (1) including
time-based annotations & (2) bridging the
vocabulary gap of searcher & cataloguer
Nichesourcing:	
  finding	
  the	
  right	
  
minority	
  vote	
  
“the	
  frog	
  stands	
  
for	
  the	
  rejected	
  
lover”	
  
“The	
  frog	
  indicates	
  the	
  	
  
profession	
  of	
  the	
  women”	
  
“Sexy	
  ladies!”	
  
annota;on	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  quality	
  level	
  
40	
  
“women”	
  
http://sealincmedia.nl
USING KNOWLEDGE:
VISUALIZATION AND SEARCH
Mobile museum tour
Visualising piracy events
Extracting piracy events
from piracy reports & Web sources
Enriching description of
search results
Sample graph search algorithm
From search term (literal) to art work
•  Find resources with matching label
•  Find path from resource to art work
– Cost of each step (step when above cost
threshold)
– Special treatment of semantics: sameAs,
inverseOf, transitive, ….
•  Cluster results based on path similarities
Graph search
Example of path clustering
Issues:
•  number of clusters
•  path length
Using	
  alignment	
  in	
  search	
  
“Tokugawa”
SVCN period
Edo
SVCN is local in-house
ethnology thesaurus
AAT style/period
Edo (Japanese period)
Tokugawa
AAT is Getty’s
Art & Architecture Thesaurus
Loca;on-­‐based	
  search:	
  
Moulin	
  de	
  la	
  GaleCe	
  
relatively easy
Relation search:
Picasso, Matisse & Braque
Acknowledgements
•  Long list of people
•  Projects: COMMIT, Agora, PrestoPrime,
EuropeanaConnect, Poseidon,
BiographyNet, Multimedian E-Culture

Contenu connexe

Tendances

Principles for knowledge engineering on the Web
Principles for knowledge engineering on the WebPrinciples for knowledge engineering on the Web
Principles for knowledge engineering on the WebGuus Schreiber
 
Mdst3703 2013-10-01-hypertext-and-history
Mdst3703 2013-10-01-hypertext-and-historyMdst3703 2013-10-01-hypertext-and-history
Mdst3703 2013-10-01-hypertext-and-historyRafael Alvarado
 
Mdst3703 2013-09-24-hypertext
Mdst3703 2013-09-24-hypertextMdst3703 2013-09-24-hypertext
Mdst3703 2013-09-24-hypertextRafael Alvarado
 
UVA MDST 3703 Thematic Research Collections 2012-09-18
UVA MDST 3703 Thematic Research Collections 2012-09-18UVA MDST 3703 Thematic Research Collections 2012-09-18
UVA MDST 3703 Thematic Research Collections 2012-09-18Rafael Alvarado
 
Mdst3703 2013-10-08-thematic-research-collections
Mdst3703 2013-10-08-thematic-research-collectionsMdst3703 2013-10-08-thematic-research-collections
Mdst3703 2013-10-08-thematic-research-collectionsRafael Alvarado
 
Comm 1130 technical_communication_march2012-alcock
Comm 1130 technical_communication_march2012-alcockComm 1130 technical_communication_march2012-alcock
Comm 1130 technical_communication_march2012-alcockMelanie Parlette-Stewart
 
The more things change, the more they stay the same...”: Why digital journals...
The more things change, the more they stay the same...”: Why digital journals...The more things change, the more they stay the same...”: Why digital journals...
The more things change, the more they stay the same...”: Why digital journals...Pratt_Symposium
 
Recommending Semantic Nearest Neighbors Using Storm and Dato
Recommending Semantic Nearest Neighbors Using Storm and DatoRecommending Semantic Nearest Neighbors Using Storm and Dato
Recommending Semantic Nearest Neighbors Using Storm and DatoAshok Venkatesan
 
Near Real-time Web-Page Recs Using Content Features
Near Real-time Web-Page Recs Using Content FeaturesNear Real-time Web-Page Recs Using Content Features
Near Real-time Web-Page Recs Using Content FeaturesAshok Venkatesan
 
Sharing an Open Methodology for Building Domain-specific Corpora for EAP
Sharing an Open Methodology for Building Domain-specific Corpora for EAP Sharing an Open Methodology for Building Domain-specific Corpora for EAP
Sharing an Open Methodology for Building Domain-specific Corpora for EAP Alannah Fitzgerald
 
Plagiarism in the Academy
Plagiarism in the AcademyPlagiarism in the Academy
Plagiarism in the AcademyCrossref
 
The network reconfigures the catalog
The network reconfigures the catalogThe network reconfigures the catalog
The network reconfigures the cataloglisld
 
Collection directions - towards collective collections
Collection directions - towards collective collectionsCollection directions - towards collective collections
Collection directions - towards collective collectionslisld
 

Tendances (13)

Principles for knowledge engineering on the Web
Principles for knowledge engineering on the WebPrinciples for knowledge engineering on the Web
Principles for knowledge engineering on the Web
 
Mdst3703 2013-10-01-hypertext-and-history
Mdst3703 2013-10-01-hypertext-and-historyMdst3703 2013-10-01-hypertext-and-history
Mdst3703 2013-10-01-hypertext-and-history
 
Mdst3703 2013-09-24-hypertext
Mdst3703 2013-09-24-hypertextMdst3703 2013-09-24-hypertext
Mdst3703 2013-09-24-hypertext
 
UVA MDST 3703 Thematic Research Collections 2012-09-18
UVA MDST 3703 Thematic Research Collections 2012-09-18UVA MDST 3703 Thematic Research Collections 2012-09-18
UVA MDST 3703 Thematic Research Collections 2012-09-18
 
Mdst3703 2013-10-08-thematic-research-collections
Mdst3703 2013-10-08-thematic-research-collectionsMdst3703 2013-10-08-thematic-research-collections
Mdst3703 2013-10-08-thematic-research-collections
 
Comm 1130 technical_communication_march2012-alcock
Comm 1130 technical_communication_march2012-alcockComm 1130 technical_communication_march2012-alcock
Comm 1130 technical_communication_march2012-alcock
 
The more things change, the more they stay the same...”: Why digital journals...
The more things change, the more they stay the same...”: Why digital journals...The more things change, the more they stay the same...”: Why digital journals...
The more things change, the more they stay the same...”: Why digital journals...
 
Recommending Semantic Nearest Neighbors Using Storm and Dato
Recommending Semantic Nearest Neighbors Using Storm and DatoRecommending Semantic Nearest Neighbors Using Storm and Dato
Recommending Semantic Nearest Neighbors Using Storm and Dato
 
Near Real-time Web-Page Recs Using Content Features
Near Real-time Web-Page Recs Using Content FeaturesNear Real-time Web-Page Recs Using Content Features
Near Real-time Web-Page Recs Using Content Features
 
Sharing an Open Methodology for Building Domain-specific Corpora for EAP
Sharing an Open Methodology for Building Domain-specific Corpora for EAP Sharing an Open Methodology for Building Domain-specific Corpora for EAP
Sharing an Open Methodology for Building Domain-specific Corpora for EAP
 
Plagiarism in the Academy
Plagiarism in the AcademyPlagiarism in the Academy
Plagiarism in the Academy
 
The network reconfigures the catalog
The network reconfigures the catalogThe network reconfigures the catalog
The network reconfigures the catalog
 
Collection directions - towards collective collections
Collection directions - towards collective collectionsCollection directions - towards collective collections
Collection directions - towards collective collections
 

En vedette

ESWC SS 2013 - Tuesday Tutorial 1 Maribel Acosta and Barry Norton: Providing ...
ESWC SS 2013 - Tuesday Tutorial 1 Maribel Acosta and Barry Norton: Providing ...ESWC SS 2013 - Tuesday Tutorial 1 Maribel Acosta and Barry Norton: Providing ...
ESWC SS 2013 - Tuesday Tutorial 1 Maribel Acosta and Barry Norton: Providing ...eswcsummerschool
 
Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02eswcsummerschool
 
Mon fundulaki tut_querying linked data
Mon fundulaki tut_querying linked dataMon fundulaki tut_querying linked data
Mon fundulaki tut_querying linked dataeswcsummerschool
 
Semantic Aquarium - ESWC SSchool 14 - Student project
Semantic Aquarium - ESWC SSchool 14 - Student projectSemantic Aquarium - ESWC SSchool 14 - Student project
Semantic Aquarium - ESWC SSchool 14 - Student projecteswcsummerschool
 
Arabic Sentiment Lexicon - ESWC SSchool 14 - Student project
Arabic Sentiment Lexicon - ESWC SSchool 14 - Student projectArabic Sentiment Lexicon - ESWC SSchool 14 - Student project
Arabic Sentiment Lexicon - ESWC SSchool 14 - Student projecteswcsummerschool
 
Syrtaki - ESWC SSchool 14 - Student project
Syrtaki  - ESWC SSchool 14 - Student projectSyrtaki  - ESWC SSchool 14 - Student project
Syrtaki - ESWC SSchool 14 - Student projecteswcsummerschool
 
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data TutorialESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorialeswcsummerschool
 
ESWC SS 2013 - Thursday Keynote Vassilis Christophides: Preserving linked data
ESWC SS 2013 - Thursday Keynote Vassilis Christophides: Preserving linked dataESWC SS 2013 - Thursday Keynote Vassilis Christophides: Preserving linked data
ESWC SS 2013 - Thursday Keynote Vassilis Christophides: Preserving linked dataeswcsummerschool
 

En vedette (8)

ESWC SS 2013 - Tuesday Tutorial 1 Maribel Acosta and Barry Norton: Providing ...
ESWC SS 2013 - Tuesday Tutorial 1 Maribel Acosta and Barry Norton: Providing ...ESWC SS 2013 - Tuesday Tutorial 1 Maribel Acosta and Barry Norton: Providing ...
ESWC SS 2013 - Tuesday Tutorial 1 Maribel Acosta and Barry Norton: Providing ...
 
Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02Mon norton tut_queryinglinkeddata02
Mon norton tut_queryinglinkeddata02
 
Mon fundulaki tut_querying linked data
Mon fundulaki tut_querying linked dataMon fundulaki tut_querying linked data
Mon fundulaki tut_querying linked data
 
Semantic Aquarium - ESWC SSchool 14 - Student project
Semantic Aquarium - ESWC SSchool 14 - Student projectSemantic Aquarium - ESWC SSchool 14 - Student project
Semantic Aquarium - ESWC SSchool 14 - Student project
 
Arabic Sentiment Lexicon - ESWC SSchool 14 - Student project
Arabic Sentiment Lexicon - ESWC SSchool 14 - Student projectArabic Sentiment Lexicon - ESWC SSchool 14 - Student project
Arabic Sentiment Lexicon - ESWC SSchool 14 - Student project
 
Syrtaki - ESWC SSchool 14 - Student project
Syrtaki  - ESWC SSchool 14 - Student projectSyrtaki  - ESWC SSchool 14 - Student project
Syrtaki - ESWC SSchool 14 - Student project
 
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data TutorialESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
ESWC SS 2012 - Friday Keynote Marko Grobelnik: Big Data Tutorial
 
ESWC SS 2013 - Thursday Keynote Vassilis Christophides: Preserving linked data
ESWC SS 2013 - Thursday Keynote Vassilis Christophides: Preserving linked dataESWC SS 2013 - Thursday Keynote Vassilis Christophides: Preserving linked data
ESWC SS 2013 - Thursday Keynote Vassilis Christophides: Preserving linked data
 

Similaire à Fri schreiber key_knowledge engineering

EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...
EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...
EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...Keith.May
 
Ontology Engineering: Introduction
Ontology Engineering: IntroductionOntology Engineering: Introduction
Ontology Engineering: IntroductionGuus Schreiber
 
Principles and pragmatics of a Semantic Culture Web
 Principles and pragmatics of a Semantic Culture Web Principles and pragmatics of a Semantic Culture Web
Principles and pragmatics of a Semantic Culture WebGuus Schreiber
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research ObjectsCarole Goble
 
Taxonomy, ontology, folksonomies & SKOS.
Taxonomy, ontology, folksonomies & SKOS.Taxonomy, ontology, folksonomies & SKOS.
Taxonomy, ontology, folksonomies & SKOS.Janet Leu
 
TSS 2017: Terminology and Knowledge Organization Systems
TSS 2017: Terminology and Knowledge Organization SystemsTSS 2017: Terminology and Knowledge Organization Systems
TSS 2017: Terminology and Knowledge Organization SystemsMichael Wetzel
 
Tutorial: Building and using ontologies - E.Simperl - ESWC SS 2014
 Tutorial: Building and using ontologies -  E.Simperl - ESWC SS 2014 Tutorial: Building and using ontologies -  E.Simperl - ESWC SS 2014
Tutorial: Building and using ontologies - E.Simperl - ESWC SS 2014eswcsummerschool
 
Building and using ontologies
Building and using ontologies Building and using ontologies
Building and using ontologies Elena Simperl
 
Web 3 final(1)
Web 3 final(1)Web 3 final(1)
Web 3 final(1)Venky Dood
 
Innovative methods for data integration: Linked Data and NLP
Innovative methods for data integration: Linked Data and NLPInnovative methods for data integration: Linked Data and NLP
Innovative methods for data integration: Linked Data and NLPariadnenetwork
 
Porting Library Vocabularies to the Semantic Web - IFLA 2010
Porting Library Vocabularies to the Semantic Web - IFLA 2010Porting Library Vocabularies to the Semantic Web - IFLA 2010
Porting Library Vocabularies to the Semantic Web - IFLA 2010Bernard Vatant
 
Leveraging Exhibitions as a Needs-Based Skill Development Program in Librarie...
Leveraging Exhibitions as a Needs-Based Skill Development Program in Librarie...Leveraging Exhibitions as a Needs-Based Skill Development Program in Librarie...
Leveraging Exhibitions as a Needs-Based Skill Development Program in Librarie...Sara Sterkenburg
 
OER for repository managers
OER for repository managersOER for repository managers
OER for repository managersNick Sheppard
 
Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge GraphsPeter Haase
 
LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch a...
LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch a...LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch a...
LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch a...locloud
 
CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...
CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...
CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...Keith.May
 
Reaching the researcher
Reaching the researcherReaching the researcher
Reaching the researcherLIBER Europe
 

Similaire à Fri schreiber key_knowledge engineering (20)

Ontologies Fmi 042010
Ontologies Fmi 042010Ontologies Fmi 042010
Ontologies Fmi 042010
 
EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...
EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...
EAA2014 Istanbul - Barriers and Opportunities for Linked Open Data use in Arc...
 
Ontology Engineering: Introduction
Ontology Engineering: IntroductionOntology Engineering: Introduction
Ontology Engineering: Introduction
 
Principles and pragmatics of a Semantic Culture Web
 Principles and pragmatics of a Semantic Culture Web Principles and pragmatics of a Semantic Culture Web
Principles and pragmatics of a Semantic Culture Web
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research Objects
 
Taxonomy, ontology, folksonomies & SKOS.
Taxonomy, ontology, folksonomies & SKOS.Taxonomy, ontology, folksonomies & SKOS.
Taxonomy, ontology, folksonomies & SKOS.
 
TSS 2017: Terminology and Knowledge Organization Systems
TSS 2017: Terminology and Knowledge Organization SystemsTSS 2017: Terminology and Knowledge Organization Systems
TSS 2017: Terminology and Knowledge Organization Systems
 
Tutorial: Building and using ontologies - E.Simperl - ESWC SS 2014
 Tutorial: Building and using ontologies -  E.Simperl - ESWC SS 2014 Tutorial: Building and using ontologies -  E.Simperl - ESWC SS 2014
Tutorial: Building and using ontologies - E.Simperl - ESWC SS 2014
 
Building and using ontologies
Building and using ontologies Building and using ontologies
Building and using ontologies
 
Web 3 final(1)
Web 3 final(1)Web 3 final(1)
Web 3 final(1)
 
Innovative methods for data integration: Linked Data and NLP
Innovative methods for data integration: Linked Data and NLPInnovative methods for data integration: Linked Data and NLP
Innovative methods for data integration: Linked Data and NLP
 
Porting Library Vocabularies to the Semantic Web - IFLA 2010
Porting Library Vocabularies to the Semantic Web - IFLA 2010Porting Library Vocabularies to the Semantic Web - IFLA 2010
Porting Library Vocabularies to the Semantic Web - IFLA 2010
 
EDS for JIBS
EDS for JIBSEDS for JIBS
EDS for JIBS
 
Leveraging Exhibitions as a Needs-Based Skill Development Program in Librarie...
Leveraging Exhibitions as a Needs-Based Skill Development Program in Librarie...Leveraging Exhibitions as a Needs-Based Skill Development Program in Librarie...
Leveraging Exhibitions as a Needs-Based Skill Development Program in Librarie...
 
OER for repository managers
OER for repository managersOER for repository managers
OER for repository managers
 
Getting Started with Knowledge Graphs
Getting Started with Knowledge GraphsGetting Started with Knowledge Graphs
Getting Started with Knowledge Graphs
 
LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch a...
LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch a...LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch a...
LoCloud Vocabulary Services: Thesaurus management introduction, Walter Koch a...
 
From ontology to wiki
From ontology to wikiFrom ontology to wiki
From ontology to wiki
 
CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...
CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...
CAA 2014 - To Boldly or Bravely Go? Experiences of using Semantic Technologie...
 
Reaching the researcher
Reaching the researcherReaching the researcher
Reaching the researcher
 

Plus de eswcsummerschool

Keep fit (a bit) - ESWC SSchool 14 - Student project
Keep fit (a bit)  - ESWC SSchool 14 - Student projectKeep fit (a bit)  - ESWC SSchool 14 - Student project
Keep fit (a bit) - ESWC SSchool 14 - Student projecteswcsummerschool
 
FIT-8BIT An activity music assistant - ESWC SSchool 14 - Student project
FIT-8BIT An activity music assistant - ESWC SSchool 14 - Student projectFIT-8BIT An activity music assistant - ESWC SSchool 14 - Student project
FIT-8BIT An activity music assistant - ESWC SSchool 14 - Student projecteswcsummerschool
 
Personal Tours at the British Museum - ESWC SSchool 14 - Student project
Personal Tours at the British Museum  - ESWC SSchool 14 - Student projectPersonal Tours at the British Museum  - ESWC SSchool 14 - Student project
Personal Tours at the British Museum - ESWC SSchool 14 - Student projecteswcsummerschool
 
Exhibition recommendation using British Museum data and Event Registry - ESWC...
Exhibition recommendation using British Museum data and Event Registry - ESWC...Exhibition recommendation using British Museum data and Event Registry - ESWC...
Exhibition recommendation using British Museum data and Event Registry - ESWC...eswcsummerschool
 
Empowering fishing business using Linked Data - ESWC SSchool 14 - Student pro...
Empowering fishing business using Linked Data - ESWC SSchool 14 - Student pro...Empowering fishing business using Linked Data - ESWC SSchool 14 - Student pro...
Empowering fishing business using Linked Data - ESWC SSchool 14 - Student pro...eswcsummerschool
 
Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014
Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014 Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014
Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014 eswcsummerschool
 
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014eswcsummerschool
 
Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014
Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014 Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014
Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014 eswcsummerschool
 
Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC...
Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC...Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC...
Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC...eswcsummerschool
 
Mon norton tut_publishing01
Mon norton tut_publishing01Mon norton tut_publishing01
Mon norton tut_publishing01eswcsummerschool
 
Mon domingue introduction to the school
Mon domingue introduction to the schoolMon domingue introduction to the school
Mon domingue introduction to the schooleswcsummerschool
 
Mon norton tut_querying cultural heritage data
Mon norton tut_querying cultural heritage dataMon norton tut_querying cultural heritage data
Mon norton tut_querying cultural heritage dataeswcsummerschool
 
Tue acosta hands_on_providinglinkeddata
Tue acosta hands_on_providinglinkeddataTue acosta hands_on_providinglinkeddata
Tue acosta hands_on_providinglinkeddataeswcsummerschool
 
Thu bernstein key_warp_speed
Thu bernstein key_warp_speedThu bernstein key_warp_speed
Thu bernstein key_warp_speedeswcsummerschool
 
Mon domingue key_introduction to semantic
Mon domingue key_introduction to semanticMon domingue key_introduction to semantic
Mon domingue key_introduction to semanticeswcsummerschool
 
Tue acosta tut_providing_linkeddata
Tue acosta tut_providing_linkeddataTue acosta tut_providing_linkeddata
Tue acosta tut_providing_linkeddataeswcsummerschool
 
Wed batsakis tut_challenges of preservations
Wed batsakis tut_challenges of preservationsWed batsakis tut_challenges of preservations
Wed batsakis tut_challenges of preservationseswcsummerschool
 
Wed garcia hands_on_d_bpedia preservation
Wed garcia hands_on_d_bpedia preservationWed garcia hands_on_d_bpedia preservation
Wed garcia hands_on_d_bpedia preservationeswcsummerschool
 
Wed van horik_handson_research data management
Wed van horik_handson_research data managementWed van horik_handson_research data management
Wed van horik_handson_research data managementeswcsummerschool
 
Wed roman tut_open_datapub
Wed roman tut_open_datapubWed roman tut_open_datapub
Wed roman tut_open_datapubeswcsummerschool
 

Plus de eswcsummerschool (20)

Keep fit (a bit) - ESWC SSchool 14 - Student project
Keep fit (a bit)  - ESWC SSchool 14 - Student projectKeep fit (a bit)  - ESWC SSchool 14 - Student project
Keep fit (a bit) - ESWC SSchool 14 - Student project
 
FIT-8BIT An activity music assistant - ESWC SSchool 14 - Student project
FIT-8BIT An activity music assistant - ESWC SSchool 14 - Student projectFIT-8BIT An activity music assistant - ESWC SSchool 14 - Student project
FIT-8BIT An activity music assistant - ESWC SSchool 14 - Student project
 
Personal Tours at the British Museum - ESWC SSchool 14 - Student project
Personal Tours at the British Museum  - ESWC SSchool 14 - Student projectPersonal Tours at the British Museum  - ESWC SSchool 14 - Student project
Personal Tours at the British Museum - ESWC SSchool 14 - Student project
 
Exhibition recommendation using British Museum data and Event Registry - ESWC...
Exhibition recommendation using British Museum data and Event Registry - ESWC...Exhibition recommendation using British Museum data and Event Registry - ESWC...
Exhibition recommendation using British Museum data and Event Registry - ESWC...
 
Empowering fishing business using Linked Data - ESWC SSchool 14 - Student pro...
Empowering fishing business using Linked Data - ESWC SSchool 14 - Student pro...Empowering fishing business using Linked Data - ESWC SSchool 14 - Student pro...
Empowering fishing business using Linked Data - ESWC SSchool 14 - Student pro...
 
Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014
Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014 Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014
Tutorial: Social Semantic Web and Crowdsourcing - E. Simperl - ESWC SS 2014
 
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
Keynote: Global Media Monitoring - M. Grobelnik - ESWC SS 2014
 
Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014
Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014 Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014
Hands On: Amazon Mechanical Turk - M. Acosta - ESWC SS 2014
 
Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC...
Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC...Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC...
Tutorial: Querying a Marine Data Warehouse Using SPARQL - I. Fundulaki - ESWC...
 
Mon norton tut_publishing01
Mon norton tut_publishing01Mon norton tut_publishing01
Mon norton tut_publishing01
 
Mon domingue introduction to the school
Mon domingue introduction to the schoolMon domingue introduction to the school
Mon domingue introduction to the school
 
Mon norton tut_querying cultural heritage data
Mon norton tut_querying cultural heritage dataMon norton tut_querying cultural heritage data
Mon norton tut_querying cultural heritage data
 
Tue acosta hands_on_providinglinkeddata
Tue acosta hands_on_providinglinkeddataTue acosta hands_on_providinglinkeddata
Tue acosta hands_on_providinglinkeddata
 
Thu bernstein key_warp_speed
Thu bernstein key_warp_speedThu bernstein key_warp_speed
Thu bernstein key_warp_speed
 
Mon domingue key_introduction to semantic
Mon domingue key_introduction to semanticMon domingue key_introduction to semantic
Mon domingue key_introduction to semantic
 
Tue acosta tut_providing_linkeddata
Tue acosta tut_providing_linkeddataTue acosta tut_providing_linkeddata
Tue acosta tut_providing_linkeddata
 
Wed batsakis tut_challenges of preservations
Wed batsakis tut_challenges of preservationsWed batsakis tut_challenges of preservations
Wed batsakis tut_challenges of preservations
 
Wed garcia hands_on_d_bpedia preservation
Wed garcia hands_on_d_bpedia preservationWed garcia hands_on_d_bpedia preservation
Wed garcia hands_on_d_bpedia preservation
 
Wed van horik_handson_research data management
Wed van horik_handson_research data managementWed van horik_handson_research data management
Wed van horik_handson_research data management
 
Wed roman tut_open_datapub
Wed roman tut_open_datapubWed roman tut_open_datapub
Wed roman tut_open_datapub
 

Fri schreiber key_knowledge engineering

  • 1. Knowledge engineering and the Web Guus Schreiber VU University Amsterdam Computer Science, Network Institute
  • 2. Overview of this talk •  Web data representation – a meta view •  Knowledge for the Web: categories – key sources, alignment, conversion •  Using knowledge: visualization and search
  • 3. My journey knowledge engineering •  design patterns for problem solving •  methodology for knowledge systems •  models of domain knowledge •  ontology engineering
  • 4. My journey access to digital heritage
  • 5. My journey Web standards Chair of •  RDF 1.1 •  OWL Web Ontology Language 1.0 •  SKOS model for publishing vocabularies on the Web •  Deployment & best practices
  • 6. A few words about Web standardization •  Key success factor! •  Consensus process actually works – Some of the time at least •  Public review – Taking every comment seriously •  The danger of over-designing – Principle of minimal commitment
  • 7. Example: W3C RDF 1.1 group •  8K group messages (publicly visible) •  2K messages about external comments •  125+ teleconferences •  200 issues resolved
  • 9. Caution •  Representation languages are there for you •  And not the other way around ….
  • 10. HTML5: a leap forward Rationale •  Consistent separation of content and presentation •  Semantics of the structure of information Typical new elements <article>! <section>! <aside>! <header> <footer>!
  • 11. RDF: triples and graphs RDF is simply labeling resources and links
  • 13. RDF syntaxes •  Human-readable: Turtle/TriG •  Line-based: N-Triples/N-Quads •  HTML embedding: RDFa •  JSON-based: JSON-LD •  XML-based: RDF XML
  • 14. Data modeling on the Web RDF •  Class hierarchy •  Property hierarchy •  Domain and range restrictions •  Data types •  OWL •  Property characteristics –  E.g., inverse, functional, transitive, ….. •  Identify management –  E.g., same as, equivalent class •  …….. I prefer a pick-and-choose approach
  • 15. Writing in an ontology language does not make it an ontology! •  Ontology is vehicle for sharing •  Papers about your own idiosyncratic “university ontology” should be rejected at conferences •  The quality of an ontology does not depend on the number of, for example, OWL constructs used
  • 16. Rationale •  A vocabulary represents distilled knowledge of a community •  Typically product of a consensus process over longer period of time Use •  200+ vocabularies published •  E.g.: Library of Congress Subject Headings •  Mainly in library field SKOS: making existing vocabularies Web accessible
  • 17. The strength of SKOS lies in its simplicity Baker et al: Key choices in the design of SKOS
  • 18. Beware of ontological over- commitment •  We have the understandable tendency to use semantic modeling constructs whenever we can •  Better is to limit any Web model to the absolute minimum
  • 19. KNOWLEDGE ON THE WEB: CATEGORIES
  • 20. The  concept  triad   Source: http://www.jamesodell.com/Ontology_White_Paper_2011-07-15.pdf.
  • 21. Categorization •  OWL (Description logic) takes an extensional view of classes – A set is completely defined by its members •  This puts the emphasis on specifying class boundaries •  Work of Rosch et al. takes a different view 21
  • 22. Categories (Rosch) •  Help us to organize the world •  Tools for perception •  Basic-level categories – Are the prime categories used by people – Have the highest number of common and distinctive attributes – What those basic-level categories are may depend on context 22
  • 24. 24 FOAF: Friend of a Friend
  • 25. 25 Dublin Core: metadata of Web resources
  • 28. schema.org issues •  Top-down versus bottom-up •  Ownership and control •  Who can update/extend? •  Does use for general search bias the vocabulary?
  • 29. The myth of a unified vocabulary •  In large virtual collections there are always multiple vocabularies –  In multiple languages •  Every vocabulary has its own perspective –  You can’t just merge them •  But you can use vocabularies jointly by defining a limited set of links –  “Vocabulary alignment”
  • 30. Category alignment vs. identity disambiguation •  Alignment concerns finding links between (similar) categories, which typically have no identity in the real world •  Identity disambiguation is finding out whether two or more IDs point to the same object in the real world (e.g., person, building, ship) •  The distinction is more subtle that “class versus instance”
  • 31. Alignment techniques •  Syntax: comparison of characters of the terms –  Measures of syntactic distance –  Language processing •  E.g. Tokenization, single/plural, •  Relate to lexical resource –  Relate terms to place in WordNet hierarchy •  Taxonomy comparison –  Look for common parents/children in taxonomy •  Instance based mapping –  Two classes are similar if their instances are similar.
  • 34. Be modest! Don’t recreate, but enrich and align •  Knowledge engineers should refrain from developing their own idiosyncratic ontologies •  Instead, they should make the available rich vocabularies, thesauri and databases available in an interoperable (web) format •  Techniques: learning, alignment
  • 35. 35  
  • 36. Issues  in  conversion  to  Linked  Data   •  Keep  original  info  as  much  as  possible   •  Ensure  standard  datatypes  for  value   conversion   •  Choice  of  concept  URL   •  Vocabulary  URL:  our  strategy   –  Large  organizaCons:  use  their  own   –  Small  organizaCons:  purl.org  URL     36  
  • 37.
  • 38. Winner EuroITV Competition Best Archives on the Web Award http://waisda.nl 38  
  • 39. http://waisda.nl http://www.prestoprime.org/ @waisda Improve video search for (1) fragment retrieval & (2) within video navigation by using crowdsourcing for (1) including time-based annotations & (2) bridging the vocabulary gap of searcher & cataloguer
  • 40. Nichesourcing:  finding  the  right   minority  vote   “the  frog  stands   for  the  rejected   lover”   “The  frog  indicates  the     profession  of  the  women”   “Sexy  ladies!”   annota;on                                        quality  level   40   “women”  
  • 45. Extracting piracy events from piracy reports & Web sources
  • 47. Sample graph search algorithm From search term (literal) to art work •  Find resources with matching label •  Find path from resource to art work – Cost of each step (step when above cost threshold) – Special treatment of semantics: sameAs, inverseOf, transitive, …. •  Cluster results based on path similarities
  • 49. Example of path clustering Issues: •  number of clusters •  path length
  • 50. Using  alignment  in  search   “Tokugawa” SVCN period Edo SVCN is local in-house ethnology thesaurus AAT style/period Edo (Japanese period) Tokugawa AAT is Getty’s Art & Architecture Thesaurus
  • 51. Loca;on-­‐based  search:   Moulin  de  la  GaleCe   relatively easy
  • 53.
  • 54. Acknowledgements •  Long list of people •  Projects: COMMIT, Agora, PrestoPrime, EuropeanaConnect, Poseidon, BiographyNet, Multimedian E-Culture