QOD 2019 – 2nd Workshop on Quality of Open Data
June 2019
Technical usability of Wikidata’s linked data:
evaluation of machine interoperability and data interpretability
Nuno Freire, Antoine Isaac
CC BY-SA
Outline
● About Europeana
● Europeana and linked data
● Why investigate Wikidata and linked data for Europeana?
● Use cases from data applications in our study
● Study setup and system
● Results
● Conclusions, ongoing and future work
Europeana
The Platform for Europe’s Digital Cultural Heritage
● Aggregates and makes available data:
• From all EU countries
• From ~3,700 galleries, libraries,
archives and museums
• Under a CC0 licence
• More than 58M objects
• In about 50 languages
We aggregate metadata:
• From all EU countries
• ~3,700 galleries, libraries,
archives and museums
• More than 58M objects
• In more than 40 languages
• A large number of references
to places, agents, concepts,
time periods
Europeana
The Platform for Europe’s Digital Cultural Heritage
Data aggregation focused on metadata, with cultural objects as the main entity
Europeana Linked Data Strategy
Our lines of work
● The Europeana Data Model (EDM) offers a base for linking
metadata
● We apply automatic enrichment to link object metadata to
reference datasets
● We encourage data providers to contribute their own links to
vocabularies
● We encourage alignment activities between domain
vocabularies
Why Wikidata?
Complies with all of Europeana’s criteria for selecting a vocabulary:
● Properly documented and supported by a community
● Technically available on the web according to the Linked Data best practices and
recipes
● Available under an open licence
● Multilingual (Wikidata offers labels in about 124 languages, of which 48 match the languages
that Europeana supports)
● Applies the best practices and standards for the representation, structure and
description of vocabularies
● Well-connected internally and externally to other vocabularies (works great as a “pivot”
vocabulary)
Additionally…
● It gives fairly complete and accurate descriptive metadata about
entities
Currently, Wikidata is a data source for enrichment of metadata in
Europeana
Motivation for evaluating Wikidata
and linked data
● Wikidata can be a data source of cultural heritage objects
• Increasing interest from cultural heritage institutions in sharing
descriptions of their digital objects
● We are investigating linked data for innovating the process of
aggregation of metadata:
• Aggregation of linked data has been the subject of a case study
• Schema.org is suitable for describing cultural heritage resources
Motivation for machine interoperability
and data interpretability
● Linked data sources of cultural data are numerous but data is
heterogeneous across them
● Effective and sustainable usage of these sources must be supported
by automatic means
• A minimum level of compliance with the Semantic Web is
necessary
Use cases of linked data
consumption addressed in this
study
Our study setup (1/3)
Our study setup (2/3)
Our study setup (3/3)
The linked data system
Results
● Wikidata’s RDF presents some difficulties for cross-domain
applications
● Wikidata is using a very limited number of general data processing
properties
○ most of the properties in use are labels for human users
● Wikidata has chosen to use properties from its own ontology instead
of equivalent RDF, RDF-Schema, OWL or SKOS properties
○ Without human support, applications are unable to interpret
Wikidata’s properties
The other namespaces in use in
Wikidata’s RDF output
Occurrences in the 11,798 Wikidata resources of our sample
Results - general linked data
processing
● Wikidata makes limited use of rdf:type
○ It is used just to state that the RDF resource is an Item from the
Wikibase ontology (http://wikiba.se/ontology#Item)
○ For further types, the property wdt:P31 is used.
● Not all Wikidata RDF predicate URIs are resolvable
○ In the case of property wdt:P31, it is stated in the data as
http://www.wikidata.org/prop/direct/P31, which is not resolvable.
The resolvable corresponding URI is
http://www.wikidata.org/entity/P31
○ These unresolvable URIs limit machine interpretation of the
predicates
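The relation between the two URI forms quoted above can be sketched in a few lines. This is a minimal illustration, assuming that every direct-property URI follows the same pattern as the wdt:P31 example; the function name `to_resolvable_uri` is hypothetical and not part of the study's system.

```python
# Namespace prefixes taken from the two URIs quoted on the slide.
DIRECT_PREFIX = "http://www.wikidata.org/prop/direct/"
ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def to_resolvable_uri(predicate_uri: str) -> str:
    """Map a direct-property URI to its entity URI; return other URIs unchanged.

    Assumption: the local name (e.g. "P31") is identical in both namespaces.
    """
    if predicate_uri.startswith(DIRECT_PREFIX):
        return ENTITY_PREFIX + predicate_uri[len(DIRECT_PREFIX):]
    return predicate_uri


if __name__ == "__main__":
    print(to_resolvable_uri("http://www.wikidata.org/prop/direct/P31"))
```

A crawler could apply such a rewrite before dereferencing a predicate, but that is exactly the kind of source-specific knowledge a generic linked data application does not have.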
Results - general linked data
processing
● In order to proceed with the experiment, we manually added
alignment statements in our knowledge base
● In fact, most of the alignments are already recorded in Wikidata, but
they are expressed using predicates from Wikidata’s namespaces
○ … limiting the interpretation by machines
(The alignments are presented in a later slide)
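One way to picture the manually added alignment statements is as a lookup table from Wikidata predicates to standard ones. The sketch below is illustrative only: the specific pairs are assumptions based on the published meaning of the Wikidata properties P31 (instance of), P279 (subclass of), P1647 (subproperty of), P1709 (equivalent class) and P1628 (equivalent property), not a copy of the alignments used in the study, and the names `ASSUMED_ALIGNMENTS` and `apply_alignments` are hypothetical.

```python
WDT = "http://www.wikidata.org/prop/direct/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Assumed alignments (hypothetical table, not taken from the slides).
ASSUMED_ALIGNMENTS = {
    WDT + "P31": RDF + "type",
    WDT + "P279": RDFS + "subClassOf",
    WDT + "P1647": RDFS + "subPropertyOf",
    WDT + "P1709": OWL + "equivalentClass",
    WDT + "P1628": OWL + "equivalentProperty",
}


def apply_alignments(triples):
    """Return the input triples plus, for each aligned predicate,
    a copy of the triple restated with the standard predicate."""
    result = []
    for subject, predicate, obj in triples:
        result.append((subject, predicate, obj))
        if predicate in ASSUMED_ALIGNMENTS:
            result.append((subject, ASSUMED_ALIGNMENTS[predicate], obj))
    return result


if __name__ == "__main__":
    # Q42 "instance of" Q5, as published by Wikidata.
    sample = [("http://www.wikidata.org/entity/Q42",
               WDT + "P31",
               "http://www.wikidata.org/entity/Q5")]
    for triple in apply_alignments(sample):
        print(triple)
```

With the extra rdf:type triple in place, a generic application can find the type of the resource without knowing anything about Wikidata's own ontology.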
Results - acquiring Schema.org
semantics from Wikidata
● Equivalence and specialisation relations between classes and properties are
used to derive mappings, direct or inferred, between Wikidata and Schema.org
● We came across two obstacles:
○ For finding alignments, we again faced the non-resolvable URIs
○ For navigating Wikidata’s class and property hierarchy, we had to
manually add alignment statements in our knowledge base
■ Wikidata data properties are used to state the class and property
hierarchy
● Adding further alignments to our knowledge base was necessary
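The distinction between direct and inherited mappings can be sketched as a lookup that first checks a class's own equivalence statement and then walks up its superclass hierarchy. This is a minimal sketch under stated assumptions: the function name, the dictionaries and the `ex:` class identifiers are hypothetical toy data, not the study's implementation or results.

```python
from collections import deque


def find_schema_org_alignment(cls, equivalents, superclasses):
    """Return (schema_org_class, kind) with kind "direct" or "inherited",
    or None when neither the class nor any ancestor has an alignment.

    equivalents:  maps a class to its equivalent Schema.org class.
    superclasses: maps a class to a list of its direct superclasses.
    """
    if cls in equivalents:
        return equivalents[cls], "direct"
    seen = {cls}
    queue = deque(superclasses.get(cls, []))
    while queue:  # breadth-first: the nearest aligned ancestor wins
        current = queue.popleft()
        if current in seen:
            continue  # guards against cycles in the hierarchy
        seen.add(current)
        if current in equivalents:
            return equivalents[current], "inherited"
        queue.extend(superclasses.get(current, []))
    return None


# Hypothetical toy data for illustration only.
EQUIVALENTS = {
    "ex:Painting": "schema:Painting",
    "ex:CreativeWork": "schema:CreativeWork",
}
SUPERCLASSES = {
    "ex:Fresco": ["ex:Mural"],
    "ex:Mural": ["ex:CreativeWork"],
}

if __name__ == "__main__":
    for name in ("ex:Painting", "ex:Fresco", "ex:Unknown"):
        print(name, find_schema_org_alignment(name, EQUIVALENTS, SUPERCLASSES))
```

Both input dictionaries depend on being able to read the equivalence and hierarchy statements in the first place, which is where the two obstacles named above come in.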
Alignments added for enabling
automatic data processing and
ontology reasoning
Results - acquiring Schema.org
semantics from Wikidata
● For classes, we found 102 distinct ones, 57% of which had alignments to
Schema.org
○ 49% are direct alignments and 7.9% are alignments inherited from
superclasses
● For properties, we found 266 distinct ones, 44% of which had alignments to
Schema.org
○ Only direct alignments were found for properties.
Results - acquiring Schema.org
semantics from Wikidata
These results are a good indicator that many applications would be able
to make use of the structured data.
The listing of the individual alignments found may be consulted online:
https://github.com/nfreire/data-aggregation-lab/blob/master/data-aggregation-casestudies/documentation/wikidata/SchemaOrg-ontology-alignments-listing.md
Conclusions
● Currently, a human operator must assist linked data applications to
interpret Wikidata’s RDF
○ This requires training staff on Wikidata’s data model and its
representation in RDF
○ The use of predicates from Wikidata’s own ontology makes the data
uninterpretable for data crawlers that rely on the Semantic Web’s
general data processing properties
● Another difficulty is the use of namespaces that are not resolvable for
Wikidata’s properties
● ...but Wikidata contains enough alignment data to RDF, RDFS, OWL,
SKOS and Schema.org:
Machine interpretation of Wikidata is just a few steps away
Ongoing and future work
● At this time, we are analyzing the results of evaluating the
domain-specific use case, which uses Wikidata data as input to
Europeana's cultural heritage metadata. Our first indications are that
Wikidata provides high enough quality for this case
● In future work, we expect to evaluate the linked data published by
data providers of Europeana in terms of machine processing
Thank you for your attention
nuno.freire@tecnico.ulisboa.pt
Netherlands, Public Domain
1660 - 1625, Rijksmuseum
Anonymous
Arrival of a Portuguese ship
Acknowledgments
Fundação para a Ciência e a Tecnologia (FCT): UID/CEC/50021/2013
European Commission contract number 30-CE-0885387/00-80.
