Semantic Interoperability at
Europeana
Antoine Isaac
with slides from Hugo Manguinhas, Valentine Charles, Nuno Freire,
Juliane Stiller
Workshop on Semantic Interoperability for Multilingual DSIs
Brussels, 18 October 2018
CC BY-SA
Europeana is a rather big digital
culture effort
58 million digitized objects, from 3,700 institutions in 44 countries
The data Europeana holds
● Descriptive and technical metadata
● Thumbnails
As a rule, content is still served from our data partners
● Some content for specific projects
● newspapers text and images
● user-generated content (Europeana 1914-1918)
A network of data partners
● Data providers: Cultural heritage institutions providing content and metadata
to Europeana
● "Intermediate" aggregators:
organizations or projects gathering
metadata and content for institutions
from a specific country, sector, or on a
specific domain (music, archaeology,
theater…) and making it available for
Europeana and other data consumers
Quality issues
Europeana is diverse
58 million digitized objects, from 3,700 institutions in 44
countries
● Many different themes and types of objects
Books, newspapers, journals, letters, diaries, archival papers, paintings, maps, drawings, photographs,
music, spoken word, radio broadcasts, film, newsreels, television, fashion, sculpture, 3D objects, and
more
● Libraries, archives, museums have different ways to describe
objects. Even within a sector, big differences can be observed
● Heterogeneity makes quality issues worse
Multilingualism
● Officially we get metadata in 44 languages
● But there are more languages used in individual
metadata fields
Work by Péter Kiraly (Göttingen Research alliance)
http://144.76.218.178/europeana-qa/languages.php?collectionId=all&field=aggregated
Multilingualism
● Officially we get metadata in 44 languages
● But there are more languages used in individual
metadata fields
• Over 400 language codes
• E.g., 6 values use x-aramaic-latn, which is not a valid code
• But the most common case is lack of language information!
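As a rough illustration of the three situations listed above (plausible codes, suspicious codes, and missing language information), the following sketch tallies language tags over a handful of values. The sample data and function names are hypothetical, and the check is a simple shape heuristic (a primary subtag of 2 or 3 letters), not a full language-tag validator and not the analysis used for the figures quoted on the slides.

```python
from collections import Counter


def primary_subtag(tag):
    """Return the lower-cased primary subtag of a language tag, or None if absent."""
    if not tag:
        return None
    return tag.strip().lower().split("-")[0] or None


def looks_like_language_code(tag):
    """Heuristic only: the primary subtag has the shape of an ISO 639 code (2 or 3 letters)."""
    primary = primary_subtag(tag)
    return (
        primary is not None
        and primary.isascii()
        and primary.isalpha()
        and len(primary) in (2, 3)
    )


def summarize_language_tags(values):
    """Count (text, tag) pairs as 'missing', 'plausible' or 'suspicious'."""
    summary = Counter()
    for _text, tag in values:
        if not tag or not tag.strip():
            summary["missing"] += 1
        elif looks_like_language_code(tag):
            summary["plausible"] += 1
        else:
            summary["suspicious"] += 1
    return summary


# Hypothetical sample of (literal, language tag) pairs.
sample = [
    ("Tournoi royal de motos à Londres", "fr"),
    ("hourglasses", "en"),
    ("a title without language information", None),
    ("another untagged title", ""),
    ("a transliterated value", "x-aramaic-latn"),
]

summary = summarize_language_tags(sample)
for category in ("plausible", "suspicious", "missing"):
    print(category, summary[category])
```

A real audit would check tags against a registry of language codes; the shape test here only separates obviously odd tags from ordinary-looking ones.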
France, Public Domain
1932, National Library of France
Agence de presse Mondial Photo-Presse.
Tournoi royal de motos à Londres :
changement d'une roue de side-car en
marche
How to make these
data work together?
1. Data Modeling for
interoperability and
richer data
Metadata conversion flows
● Mappings of metadata: the metadata comes to Europeana after one or two
(expert-crafted) mappings to "interoperability formats".
Following the Linked Open Data principles
http://vimeo.com/36752317
Data on the Web Best Practices Working Group
https://www.w3.org/2013/dwbp/
• To develop the open data ecosystem, facilitating better communication between developers and publishers;
• To provide guidance to publishers, promoting the re-use of data;
• To foster trust in the data among developers
BP 15: Reuse vocabularies, preferably standardized ones
• Use terms from shared vocabularies, preferably standardized ones
• Check that classes, properties, terms, elements or attributes used to represent a dataset do not replicate those defined by vocabularies used for other datasets.
• Or if you have to replicate, indicate mappings clearly
BP 16: Choose the right formalization level
• Accept that precise specs can enable automated reasoning but that complex vocabularies require more effort to produce and hamper reuse of data
• Minimize ontological commitment of your vocabulary, or seek to minimize the commitment of others' vocabularies
• Check examples of "softer" specs, e.g. Schema.org or SKOS
The Europeana Data Model (EDM)
An RDF-based model that reuses many vocabularies:
• DC
• SKOS
• OAI-ORE
• Web Annotation
• RDA
• FOAF
• WGS84
• ccRel
• ODRL/POE
• CIDOC CRM
• EBUcore
• DOAP
• SVCS
• DCAT
• ADMS…
W3C Data on the Web BP – Data Vocabularies
Complete list of elements at
https://github.com/europeana/corelib/wiki/EDMObjectTemplatesEuropeana
A basic EDM example
Clavecin, Bartolomeo Cristofori
Cite de la Musique,
MIMO - Musical Instruments Museums Online|CC BY-NC-SA
Europeana Data Model example
A community driven model
• Involving experts from libraries, archives, museums and academics
• Adopting a collaborative, softer form of standardization
• Documenting the base model, referring to community extensions
http://pro.europeana.eu/europeana-tech
Europeana Assembly General Meeting, Rijksmuseum,
Amsterdam, 2015
Extension in DM2E project (Digital Manuscripts to Europeana)
http://onto.dm2e.eu/schemas/dm2e
EDM enables specialization of classes
and properties.
This allows partners to define
extensions answering the needs of
specific communities.
Different semantic grains
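One way to picture how specialization keeps extended data usable by generic consumers is the sketch below: specialized properties are declared as sub-properties, and a consumer that only knows the base properties walks up the hierarchy. The `ext:` property names are invented for illustration and are not actual DM2E terms; the data structure is an assumption, not how Europeana implements it.

```python
# Hypothetical sub-property declarations (in the spirit of rdfs:subPropertyOf),
# mapping each specialized property to its more general parent.
SUB_PROPERTY_OF = {
    "ext:scribe": "dc:contributor",
    "ext:illuminator": "dc:contributor",
    "ext:principalScribe": "ext:scribe",
}


def generalize(prop, known_properties):
    """Walk up the sub-property chain until reaching a property the consumer knows.

    Returns None if no known ancestor exists (or if the declarations loop).
    """
    seen = set()
    while prop not in known_properties:
        if prop in seen or prop not in SUB_PROPERTY_OF:
            return None
        seen.add(prop)
        prop = SUB_PROPERTY_OF[prop]
    return prop


def generalize_triples(triples, known_properties):
    """Rewrite triples to the coarser grain understood by a generic consumer."""
    result = []
    for subject, predicate, obj in triples:
        general = generalize(predicate, known_properties)
        if general is not None:
            result.append((subject, general, obj))
    return result


# Hypothetical fine-grained statement from a community extension.
fine_grained = [("urn:example:manuscript1", "ext:principalScribe", "urn:example:person42")]
coarse = generalize_triples(fine_grained, known_properties={"dc:contributor"})
print(coarse)
```

The fine-grained statement remains available to the community that defined it, while the generalized one serves consumers working at the base-model grain.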
(Some of) what it takes
• Re-using is easier when one has a cool-head approach to semantics
• Flexibility is required: we sometimes changed definitions because we
had some semantic overcommitment
Semantic interoperability also
requires a general effort on quality
We have set up a Data Quality Committee working on
recommendations for the community on:
○ Mandatory metadata elements for ingestion of EDM data
○ Metadata checking and normalization
○ Meaningful metadata values (in the context of use)
○ Coordination with other quality-related initiatives
http://pro.europeana.eu/get-involved/europeana-tech/data-quality-committee
How to make these data
work together?
2. Enriching metadata
France, Public Domain
1914, National Library of France
Agence de presse Meurisse
Concours de cycles nautiques sur le lac
d’Enghien : Berregent piloté par Austerling
Europeana Linked Data Strategy
Our lines of work
• The Europeana Data Model (EDM) offers a base for linking
metadata
• Aims at providing data as resources (with URIs!), not only strings
• Enables the development of a multilingual data environment
• We apply automatic enrichment to link object metadata to
reference datasets
• We encourage data providers to contribute their own links to
vocabularies
Designing a Multilingual Knowledge Graph as a Service for Cultural Heritage
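The idea of linking metadata strings to reference resources can be sketched as a label lookup against a small multilingual entity table. Everything in this example is an assumption made for illustration (the `example.org` URIs, the labels, the exact-match rule); it is not a description of Europeana's enrichment pipeline.

```python
# Hypothetical reference entities: URI -> {language: preferred label}.
ENTITIES = {
    "http://example.org/place/paris": {"en": "Paris", "fr": "Paris", "it": "Parigi"},
    "http://example.org/place/london": {"en": "London", "fr": "Londres", "nl": "Londen"},
}


def build_index(entities):
    """Index (language, case-folded label) -> set of entity URIs."""
    index = {}
    for uri, labels in entities.items():
        for lang, label in labels.items():
            index.setdefault((lang, label.casefold()), set()).add(uri)
    return index


def enrich(value, lang, index):
    """Return an entity URI when the value matches exactly one label in its language."""
    if lang is None:
        # This sketch refuses to guess when the language of the value is unknown.
        return None
    matches = index.get((lang, value.strip().casefold()), set())
    return next(iter(matches)) if len(matches) == 1 else None


def labels_for(uri, entities):
    """Once linked, the object gains access to labels in every available language."""
    return entities.get(uri, {})


index = build_index(ENTITIES)
uri = enrich("Londres", "fr", index)
print(uri, labels_for(uri, ENTITIES))
```

Replacing the string with a URI is what makes the multilingual labels of the reference dataset available to the object, which is the point of providing resources rather than only strings.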
[Figure labels: Thumbnail; Descriptive metadata; Link to data provider; Rights]
Links to contextual entities
Warning: multilingual enrichment is not easy
Poisonous India or the Importance of a Semantic and Multilingual Enrichment Strategy
Marlies Olensky, Juliane Stiller, Evelyn Dröge, MTSR 2012
http://link.springer.com/chapter/10.1007%2F978-3-642-35233-1_25
Building a network of contextual
information
Europeana grows a “Semantic Layer” linking to contextual resources (e.g.
concepts, persons, places).
Diagram by Stefan Gradmann
Contextual entities in EDM
Representing (real-world) entities related to an object as fully fledged resources, not just strings
● edm:Agent: foaf:name, skos:altLabel, rdaGr2:biographicalInformation, rdaGr2:dateOfBirth
● skos:Concept: skos:prefLabel, skos:altLabel, skos:broader, skos:related, skos:definition…
● edm:TimeSpan: skos:prefLabel, dcterms:isPartOf, edm:begin, edm:end…
● edm:Place: wgs84_pos:lat, wgs84_pos:long, skos:prefLabel, skos:note, dcterms:isPartOf…
Example: a concept from a
specialized thesaurus (MIMO)
Clavecin, Bartolomeo Cristofori
Cite de la Musique,
MIMO - Musical Instruments Museums Online|CC BY-NC-SA
Example: an AAT concept in EDM
● edm:ProvidedCHO "Hourglass" (urn:imss:instrument:401058)
  dc:type → http://vocab.getty.edu/aat/300198626
● skos:Concept http://vocab.getty.edu/aat/300198626
  skos:prefLabel "hourglasses"@en
  skos:prefLabel "uurglazen"@nl
  skos:prefLabel "reloj de las horas"@es
  skos:broader → http://vocab.getty.edu/aat/300206197 (= sandglasses)
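The hourglass example can be written down as a small set of triples to show what the link buys: the object's type can be displayed in any language the concept has a label for. This is a minimal sketch using plain Python tuples rather than an RDF library; the helper names are hypothetical, and only the identifiers and labels shown on the slide are used.

```python
CHO = "urn:imss:instrument:401058"
AAT_HOURGLASSES = "http://vocab.getty.edu/aat/300198626"
AAT_SANDGLASSES = "http://vocab.getty.edu/aat/300206197"

# (subject, predicate, object); literals are (text, language) tuples.
TRIPLES = [
    (CHO, "rdf:type", "edm:ProvidedCHO"),
    (CHO, "dc:type", AAT_HOURGLASSES),
    (AAT_HOURGLASSES, "rdf:type", "skos:Concept"),
    (AAT_HOURGLASSES, "skos:prefLabel", ("hourglasses", "en")),
    (AAT_HOURGLASSES, "skos:prefLabel", ("uurglazen", "nl")),
    (AAT_HOURGLASSES, "skos:prefLabel", ("reloj de las horas", "es")),
    (AAT_HOURGLASSES, "skos:broader", AAT_SANDGLASSES),
]


def objects(subject, predicate):
    """Return all objects of triples with the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def type_labels(cho, lang):
    """Labels, in the requested language, of the concepts the object is typed with."""
    labels = []
    for concept in objects(cho, "dc:type"):
        for text, label_lang in objects(concept, "skos:prefLabel"):
            if label_lang == lang:
                labels.append(text)
    return labels


for lang in ("en", "nl", "es"):
    print(lang, type_labels(CHO, lang))
```

The object record itself carries only the concept URI; the labels and the broader link come from the vocabulary.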
Europeana Linked Data Strategy
LOD Vocabularies currently recognized by Europeana in providers'
metadata
Vocabulary URL
MIMO Concepts http://www.mimo-db.eu/
MIMO Instrument makers http://www.mimo-db.eu/
The Getty - Art & Architecture Thesaurus (AAT) http://vocab.getty.edu/
The Getty - Union List of Artist Names (ULAN) http://vocab.getty.edu/
Virtual International Authority File (VIAF) http://viaf.org/viaf/
Geonames http://sws.geonames.org/
IconClass http://iconclass.org/
Gemeinsame Normdatei (GND) http://d-nb.info/gnd
Israel Museum Jerusalem Concepts http://www.imj.org.il/imagine/thesaurus/objects/
Partage Plus concepts http://partage.vocnet.org/
data.europeana.eu WWI Concepts from Library of Congress Subject Headings (LCSH) http://data.europeana.eu/concept/loc
Europeana Sounds Genres http://data.europeana.eu/concept/soundgenres/
EAGLE Material & Object Type http://www.eagle-network.eu/voc/
DISMARC Formats & Genres http://purl.org/dismarc/ns/
UDC http://udcdata.info/rdf/
UNESCO Thesaurus http://vocabularies.unesco.org/thesaurus/
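Recognizing a vocabulary in providers' metadata can be approximated by matching a value's URI against known namespace prefixes. The sketch below uses a subset of the URLs from the list above; the prefix-matching approach and the function name are assumptions for illustration, not a description of how Europeana's ingestion actually detects vocabularies.

```python
# Subset of the namespaces listed above, mapped to a display name.
RECOGNIZED_VOCABULARIES = {
    "http://vocab.getty.edu/": "The Getty (AAT, ULAN)",
    "http://viaf.org/viaf/": "Virtual International Authority File (VIAF)",
    "http://sws.geonames.org/": "Geonames",
    "http://iconclass.org/": "IconClass",
    "http://d-nb.info/gnd": "Gemeinsame Normdatei (GND)",
    "http://data.europeana.eu/concept/soundgenres/": "Europeana Sounds Genres",
    "http://vocabularies.unesco.org/thesaurus/": "UNESCO Thesaurus",
}


def recognize_vocabulary(uri):
    """Return the name of the vocabulary whose namespace prefixes the URI, else None."""
    # Try longer prefixes first so that a more specific namespace wins.
    for prefix in sorted(RECOGNIZED_VOCABULARIES, key=len, reverse=True):
        if uri.startswith(prefix):
            return RECOGNIZED_VOCABULARIES[prefix]
    return None


print(recognize_vocabulary("http://vocab.getty.edu/aat/300198626"))
print(recognize_vocabulary("http://example.org/unknown/123"))
```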
Europeana Linked Data Strategy
Our lines of work
• The Europeana Data Model (EDM) offers a base for linking
metadata
• We apply automatic enrichment to link object metadata to
reference datasets
• We encourage data providers to contribute their own links to
vocabularies
• We encourage alignment activities between domain
vocabularies
Encouraging (semi-) automatic vocabulary alignment
http://cultuurlink.beeldengeluid.nl
The Europeana Entity
Collection and API
Netherlands, Public Domain
1660 - 1625, Rijksmuseum
Anonymous
Arrival of a Portuguese ship
Europeana Linked Data Strategy
A strategy for Entities
We are building an "Entity Collection"
• A service that acts as a centralized point of reference and access to
data about contextual entities: places, agents (persons and
organizations), concepts...
• Caching and curating data from the wider Linked Open Data cloud
• A sort of Europeana "knowledge graph" with an API
• A service that can be re-used by everyone in our community
Use cases for the Entity Collection (1/2)
Improve user experience on Europeana services
● Findability: users can search with and for people, places and subjects, not only objects. In many
more languages, and with less ambiguity
● Contextualization: users see contextual information about cultural heritage objects. Entity Pages
group and present all assertions about an entity
● Exploration: Browsing along relationships between objects and entities and between entities
Semantic auto-completion
Entity Pages
Entity-based facets
Europeana Food & Drink Project
Use cases for the Entity Collection (2/2)
Crowdsourcing
● Objects can be annotated with references to
entities of their context
Automatic enrichment of providers' metadata
● A controlled vocabulary to help recognize references to entities
Republication for reuse
● Entities can be republished as an open data source for the community
Semantic and metadata annotations
Pundit Annotation Client from Digital Manuscripts to Europeana (DM2E)
Data currently in the Entity Collection
Mostly corresponding to a selection made for Europeana's Semantic
Enrichment
• Places
a subset of Geonames, corresponding to places which are part of European
countries and of some specific feature classes.
• Agents
a subset of DBpedia corresponding to most of the instances of dbp:Artist
with some exceptions, and integrated from 49 DBpedia language editions.
• Concepts
a subset of DBpedia corresponding to a selection of concepts matching the
needs from Europeana Collections (e.g., WWI battles).
Europeana Sounds music genres (obtained from Wikidata)
Photo Consortium's photography vocabulary
• Organizations
Extracted from Europeana's CRM and aligned to Wikidata when possible
216,302 resources
1,572 resources
165,005 resources
1,077 resources
The Entity Collection
Contribution to multilingual coverage
Entities effectively used to enrich Europeana Objects
Entities present in the Entity Collection
Selecting data sources
An intellectual effort by data experts, leveraging the following criteria:
• Availability and access: open license, published on the web as linked
data
• Granularity, size and coverage: multilingual data, helping to answer key
user needs for Europeana's CH collections. Too generic or large datasets
can create too much ambiguity for the simple processes we have (e.g.
enrichment)
• Quality: intrinsic aspects like correctness of representation (data
structures)
• Connectivity: good data sources are well-connected internally and
externally to other datasets
The Entity Collection and API
DBpedia resource for “Mozart” in our data
Coreference links to 6 other datasets (e.g. Freebase, Wikidata)
Inter-linking information
Preferred labels for 48 languages
Entity API - suggest method
/entities/suggest.json?text=neo
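A client would typically build such a request by URL-encoding the text the user has typed so far. The sketch below only constructs the request URL; the base URL is a placeholder (the slide shows only the path and the `text` parameter), and any authentication or further parameters the real API may require are deliberately left out.

```python
from urllib.parse import urlencode, urljoin

# Placeholder host: an assumption, since only the path appears on the slide.
BASE_URL = "https://api.example.org"
SUGGEST_PATH = "/entities/suggest.json"


def suggest_url(text, base_url=BASE_URL):
    """Build the URL of a suggest request for the text typed so far."""
    query = urlencode({"text": text})
    return urljoin(base_url, SUGGEST_PATH) + "?" + query


print(suggest_url("neo"))
print(suggest_url("art nouveau"))
```

Encoding the query with `urlencode` keeps multi-word or non-ASCII input safe in the URL, which matters for a multilingual auto-completion service.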
The Europeana Entity Collection – Where we stand
• We've made enough progress to release a first version of the
Entity Collection and its API, used in Europeana's production
services.
• But there are still challenges and decisions to ensure
consistency and relevance over time:
• Expand data coverage with new data sources for, e.g., events
• Employ the EC to better enrich Europeana object metadata
• Enhance discoverability, especially for search engines, e.g. via
Schema.org publication
antoine.isaac@europeana.eu
@antoine_isaac

Contenu connexe

Tendances

Tendances (20)

AAC Education Session
AAC Education Session AAC Education Session
AAC Education Session
 
Europeana - American Art Collaborative LOD Meeting
Europeana - American Art Collaborative LOD MeetingEuropeana - American Art Collaborative LOD Meeting
Europeana - American Art Collaborative LOD Meeting
 
W3C Library Linked Data Incubator Group - 2011
W3C Library Linked Data Incubator Group  - 2011W3C Library Linked Data Incubator Group  - 2011
W3C Library Linked Data Incubator Group - 2011
 
EuropeanaTech update - Europeana AGM 2015
EuropeanaTech update - Europeana AGM 2015EuropeanaTech update - Europeana AGM 2015
EuropeanaTech update - Europeana AGM 2015
 
Europeana and the relevance of the DM2E results
Europeana and the relevance of the DM2E resultsEuropeana and the relevance of the DM2E results
Europeana and the relevance of the DM2E results
 
Data modelling at Europeana and DM2E - SMW13
Data modelling at Europeana and DM2E - SMW13Data modelling at Europeana and DM2E - SMW13
Data modelling at Europeana and DM2E - SMW13
 
Semantic Web, Linked Data: the Europeana case(s)
Semantic Web, Linked Data: the Europeana case(s)Semantic Web, Linked Data: the Europeana case(s)
Semantic Web, Linked Data: the Europeana case(s)
 
European databases in cultural heritage: making connections
European databases in cultural heritage: making connectionsEuropean databases in cultural heritage: making connections
European databases in cultural heritage: making connections
 
Data scale and diversity issues at Europeana
Data scale and diversity issues at EuropeanaData scale and diversity issues at Europeana
Data scale and diversity issues at Europeana
 
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
 
EDM - American Art Collaborative LOD Meeting
EDM - American Art Collaborative LOD MeetingEDM - American Art Collaborative LOD Meeting
EDM - American Art Collaborative LOD Meeting
 
Linked Data for EuropeanaCultural Heritage: the Europeana approach
Linked Data for EuropeanaCultural Heritage: the Europeana approachLinked Data for EuropeanaCultural Heritage: the Europeana approach
Linked Data for EuropeanaCultural Heritage: the Europeana approach
 
Modelling and exchanging annotations
Modelling and exchanging annotationsModelling and exchanging annotations
Modelling and exchanging annotations
 
Europeana Research Panel DH Benelux 2017
Europeana Research Panel DH Benelux 2017Europeana Research Panel DH Benelux 2017
Europeana Research Panel DH Benelux 2017
 
Introduction to CARARE
Introduction to CARAREIntroduction to CARARE
Introduction to CARARE
 
LDBC 19 November 2013
LDBC 19 November 2013  LDBC 19 November 2013
LDBC 19 November 2013
 
Local content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providersLocal content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providers
 
Achieving Interoperability between the CARARE Schema for Monuments and Sites ...
Achieving Interoperability between the CARARE Schema for Monuments and Sites ...Achieving Interoperability between the CARARE Schema for Monuments and Sites ...
Achieving Interoperability between the CARARE Schema for Monuments and Sites ...
 
Validation of Europeana data: application profile, OWL ontology, or else?
Validation of Europeana data: application profile, OWL ontology, or else?Validation of Europeana data: application profile, OWL ontology, or else?
Validation of Europeana data: application profile, OWL ontology, or else?
 
Fondly Collisions: Archival hierarchy and the Europeana Data Model
Fondly Collisions: Archival hierarchy and the Europeana Data Model   Fondly Collisions: Archival hierarchy and the Europeana Data Model
Fondly Collisions: Archival hierarchy and the Europeana Data Model
 

Similaire à Semantic Interoperability at Europeana - MultilingualDSIs2018

Evaluation of Schema.org for Aggregation of Cultural Heritage Metadata
Evaluation of Schema.org for Aggregation of Cultural Heritage MetadataEvaluation of Schema.org for Aggregation of Cultural Heritage Metadata
Evaluation of Schema.org for Aggregation of Cultural Heritage Metadata
Nuno Freire
 
Aggregation of Linked Data A case study in the cultural heritage domain
Aggregation of Linked Data A case study in the cultural heritage domainAggregation of Linked Data A case study in the cultural heritage domain
Aggregation of Linked Data A case study in the cultural heritage domain
Nuno Freire
 

Similaire à Semantic Interoperability at Europeana - MultilingualDSIs2018 (20)

Europeana as a Linked Data (Quality) case
Europeana as a Linked Data (Quality) caseEuropeana as a Linked Data (Quality) case
Europeana as a Linked Data (Quality) case
 
The Europeana Data Model - TPDL2018
The Europeana Data Model - TPDL2018The Europeana Data Model - TPDL2018
The Europeana Data Model - TPDL2018
 
Building a Framework for Semantic Cultural Heritage Data
Building a Framework for Semantic Cultural Heritage DataBuilding a Framework for Semantic Cultural Heritage Data
Building a Framework for Semantic Cultural Heritage Data
 
Evaluation of Schema.org for Aggregation of Cultural Heritage Metadata
Evaluation of Schema.org for Aggregation of Cultural Heritage MetadataEvaluation of Schema.org for Aggregation of Cultural Heritage Metadata
Evaluation of Schema.org for Aggregation of Cultural Heritage Metadata
 
Next Generation Research with Europeana: the Humanities and Cultural Heritage...
Next Generation Research with Europeana: the Humanities and Cultural Heritage...Next Generation Research with Europeana: the Humanities and Cultural Heritage...
Next Generation Research with Europeana: the Humanities and Cultural Heritage...
 
Harvesting&Metadata Enrich Project EVA 2009
Harvesting&Metadata Enrich Project   EVA 2009Harvesting&Metadata Enrich Project   EVA 2009
Harvesting&Metadata Enrich Project EVA 2009
 
The Europeana Community: Semantics and Cultural Heritage Data
The Europeana Community: Semantics and Cultural Heritage DataThe Europeana Community: Semantics and Cultural Heritage Data
The Europeana Community: Semantics and Cultural Heritage Data
 
Metadata aggregation of IIIF Resources at Europeana: status and plans
Metadata aggregation of IIIF Resources at Europeana: status and plansMetadata aggregation of IIIF Resources at Europeana: status and plans
Metadata aggregation of IIIF Resources at Europeana: status and plans
 
Lapsi warsaw oct_2011
Lapsi warsaw oct_2011Lapsi warsaw oct_2011
Lapsi warsaw oct_2011
 
UKSG webinar: Introduction to metadata quality – the approach of Europeana Co...
UKSG webinar: Introduction to metadata quality – the approach of Europeana Co...UKSG webinar: Introduction to metadata quality – the approach of Europeana Co...
UKSG webinar: Introduction to metadata quality – the approach of Europeana Co...
 
A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)A Semantic Multimedia Web (Part 3)
A Semantic Multimedia Web (Part 3)
 
Aggregation of Linked Data A case study in the cultural heritage domain
Aggregation of Linked Data A case study in the cultural heritage domainAggregation of Linked Data A case study in the cultural heritage domain
Aggregation of Linked Data A case study in the cultural heritage domain
 
Building an ecosystem of networked references
Building an ecosystem of networked referencesBuilding an ecosystem of networked references
Building an ecosystem of networked references
 
Everything you need to know about Europeana
Everything you need to know about EuropeanaEverything you need to know about Europeana
Everything you need to know about Europeana
 
A Cultural Heritage Repository as Source for Learning Materials
A Cultural Heritage Repository as Source for Learning MaterialsA Cultural Heritage Repository as Source for Learning Materials
A Cultural Heritage Repository as Source for Learning Materials
 
Open for Business - Open Archives, OpenURL, RSS and the Dublin Core
Open for Business - Open Archives, OpenURL, RSS and the Dublin CoreOpen for Business - Open Archives, OpenURL, RSS and the Dublin Core
Open for Business - Open Archives, OpenURL, RSS and the Dublin Core
 
A portrait of Europeana as a Linked Open Data case
A portrait of Europeana as a Linked Open Data caseA portrait of Europeana as a Linked Open Data case
A portrait of Europeana as a Linked Open Data case
 
Open Data Masterclass - Europeana and LOD
Open Data Masterclass - Europeana and LODOpen Data Masterclass - Europeana and LOD
Open Data Masterclass - Europeana and LOD
 
Sharing 3D Cultural Heritage
Sharing 3D Cultural HeritageSharing 3D Cultural Heritage
Sharing 3D Cultural Heritage
 
Opening Digitized Newspapers Corpora: Europeana’s Full-text Data Interoperabi...
Opening Digitized Newspapers Corpora: Europeana’s Full-text Data Interoperabi...Opening Digitized Newspapers Corpora: Europeana’s Full-text Data Interoperabi...
Opening Digitized Newspapers Corpora: Europeana’s Full-text Data Interoperabi...
 

Plus de Antoine Isaac

Plus de Antoine Isaac (14)

Addressing multilingual challenges at Europeana: An update - DCMI 2021
Addressing multilingual challenges at Europeana: An update - DCMI 2021Addressing multilingual challenges at Europeana: An update - DCMI 2021
Addressing multilingual challenges at Europeana: An update - DCMI 2021
 
Le Cadre de publication d'Europeana
Le Cadre de publication d'EuropeanaLe Cadre de publication d'Europeana
Le Cadre de publication d'Europeana
 
IIIF and the Europeana mission
IIIF and the Europeana missionIIIF and the Europeana mission
IIIF and the Europeana mission
 
Lightweight rights modeling and linked data publication for online cultural h...
Lightweight rights modeling and linked data publication for online cultural h...Lightweight rights modeling and linked data publication for online cultural h...
Lightweight rights modeling and linked data publication for online cultural h...
 
Europeana et IIIF
Europeana et IIIFEuropeana et IIIF
Europeana et IIIF
 
Isaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data VocabulariesIsaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data Vocabularies
 
Europeana APIs
Europeana APIsEuropeana APIs
Europeana APIs
 
Modelling annotations for Europeana and related projects - DARIAH-EU WS
Modelling annotations for Europeana and related projects - DARIAH-EU WSModelling annotations for Europeana and related projects - DARIAH-EU WS
Modelling annotations for Europeana and related projects - DARIAH-EU WS
 
Classification schemes, thesauri and other Knowledge Organization Systems - a...
Classification schemes, thesauri and other Knowledge Organization Systems - a...Classification schemes, thesauri and other Knowledge Organization Systems - a...
Classification schemes, thesauri and other Knowledge Organization Systems - a...
 
Multilingual challenges for accessing digitized culture online - Riga Summit 15
Multilingual challenges for accessing digitized culture online - Riga Summit 15Multilingual challenges for accessing digitized culture online - Riga Summit 15
Multilingual challenges for accessing digitized culture online - Riga Summit 15
 
Europeana DSI - LT-Accelerate 14
Europeana DSI -  LT-Accelerate 14Europeana DSI -  LT-Accelerate 14
Europeana DSI - LT-Accelerate 14
 
EIFL 2014 - Linked Open Data
EIFL 2014 - Linked Open DataEIFL 2014 - Linked Open Data
EIFL 2014 - Linked Open Data
 
Enrichment and Europeana
Enrichment and EuropeanaEnrichment and Europeana
Enrichment and Europeana
 
Challenges for the Language Technology Industry
Challenges for the Language Technology IndustryChallenges for the Language Technology Industry
Challenges for the Language Technology Industry
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Dernier (20)

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

Semantic Interoperability at Europeana - MultilingualDSIs2018

  • 1. Semantic Interoperability at Europeana Antoine Isaac with slides from Hugo Manguinhas, Valentine Charles, Nuno Freire, Juliane Stiller Workshop on Semantic Interoperability for Multilingual DSIs Brussels, 18 October 2018
  • 2. Europeana is a rather big digital culture effort: 58 million digitized objects, from 3,700 institutions in 44 countries (CC BY-SA)
  • 3. The data Europeana holds
      ● Descriptive and technical metadata
      ● Thumbnails
      As a rule, content is still served from our data partners
      ● Some content for specific projects
          ○ newspaper text and images
          ○ user-generated content (Europeana 1914-1918)
  • 4. A network of data partners
      ● Data providers: cultural heritage institutions providing content and metadata to Europeana
      ● "Intermediate" aggregators: organizations or projects gathering metadata and content for institutions from a specific country, sector, or on a specific domain (music, archaeology, theater…) and making it available for Europeana and other data consumers
  • 5. Quality issues
  • 6. Europeana is diverse: 58 million digitized objects, from 3,700 institutions in 44 countries
      ● Many different themes and types of objects: books, newspapers, journals, letters, diaries, archival papers, paintings, maps, drawings, photographs, music, spoken word, radio broadcasts, film, newsreels, television, fashion, sculpture, 3D objects, and more
      ● Libraries, archives, museums have different ways to describe objects. Even within a sector, big differences can be observed
      ● Heterogeneity makes quality issues worse
  • 7. Multilingualism
      ● Officially we get metadata in 44 languages
      ● But there are more languages used in individual metadata fields
  • 8. Work by Péter Kiraly (Göttingen Research alliance): http://144.76.218.178/europeana-qa/languages.php?collectionId=all&field=aggregated
  • 9. Work by Péter Kiraly (Göttingen Research alliance): http://144.76.218.178/europeana-qa/languages.php?collectionId=all&field=aggregated
  • 10. Multilingualism
      ● Officially we get metadata in 44 languages
      ● But there are more languages used in individual metadata fields
          • Over 400 language codes
          • E.g., 6 values in x-aramaic-latn (not a valid code, by the way)
          • But the most common case is lack of language information!
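
The kind of language-coverage check described on this slide can be sketched as follows. This is a minimal illustration only: the sample values and the `language_coverage` helper are hypothetical and are not taken from Europeana's tooling or from the analysis credited above. It tallies values per language tag and groups missing or empty tags together, since the slide names missing language information as the most common case.

```python
from collections import Counter

# Hypothetical sample: (field value, language tag or None) pairs,
# standing in for metadata field values with optional language tags.
SAMPLE_VALUES = [
    ("Clavecin", "fr"),
    ("Harpsichord", "en"),
    ("Hourglass", None),
    ("uurglazen", "nl"),
    ("Tournoi royal de motos", None),
    ("reloj de las horas", "es"),
    ("Untitled", ""),
]


def language_coverage(values):
    """Count values per language tag; missing or empty tags are grouped under None."""
    counts = Counter()
    for _text, lang in values:
        # Normalize the tag; treat None, "" and whitespace-only tags as missing.
        key = lang.strip().lower() if lang and lang.strip() else None
        counts[key] += 1
    return counts


counts = language_coverage(SAMPLE_VALUES)
untagged_share = counts[None] / sum(counts.values())
print(counts)
print(f"share of values without language information: {untagged_share:.0%}")
```

Deciding whether a tag such as x-aramaic-latn counts as valid would need a lookup against whatever code list is chosen as the reference, which this sketch deliberately leaves out.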
  • 11. How to make these data work together? 1. Data modeling for interoperability and richer data. Image: Agence de presse Mondial Photo-Presse, "Tournoi royal de motos à Londres : changement d'une roue de side-car en marche", 1932, National Library of France, France, Public Domain
  • 12. Metadata conversion flows
      ● Mappings of metadata: the metadata comes to Europeana after one or two (expert-crafted) mappings to "interoperability formats".
  • 13. Following the Linked Open Data principles: http://vimeo.com/36752317
  • 14. Data on the Web Best Practices Working Group (https://www.w3.org/2013/dwbp/)
      • To develop the open data ecosystem, facilitating better communication between developers and publishers
      • To provide guidance to publishers, promoting the re-use of data
      • To foster trust in the data among developers
  • 15. BP 15: Reuse vocabularies, preferably standardized ones
      • Use terms from shared vocabularies, preferably standardized ones
      • Check that classes, properties, terms, elements or attributes used to represent a dataset do not replicate those defined by vocabularies used for other datasets
      • Or if you have to replicate, indicate mappings clearly
  • 16. BP 16: Choose the right formalization level
      • Accept that precise specs can enable automated reasoning but that complex vocabularies require more effort to produce and hamper reuse of data
      • Minimize ontological commitment of your vocabulary, or seek to minimize the commitment of others' vocabularies
      • Check examples of "softer" specs, e.g. Schema.org or SKOS
  • 17. The Europeana Data Model (EDM): an RDF-based model that reuses many vocabularies (W3C Data on the Web BP: Data Vocabularies)
      • DC, SKOS, OAI-ORE, Web Annotation, RDA, FOAF, WGS84, ccRel, ODRL/POE, CIDOC CRM, EBUcore, DOAP, SVCS, DCAT, ADMS…
      • Complete list of elements at https://github.com/europeana/corelib/wiki/EDMObjectTemplatesEuropeana
  • 18. A basic EDM example. Image: Clavecin, Bartolomeo Cristofori, Cité de la Musique, MIMO - Musical Instruments Museums Online | CC BY-NC-SA
  • 19. A community driven model (http://pro.europeana.eu/europeana-tech)
      • Involving experts from libraries, archives, museums and academics
      • Adopting a collaborative, softer form of standardization
      • Documenting the base model, referring to community extensions
      Image: Europeana Assembly General Meeting, Rijksmuseum, Amsterdam, 2015
  • 20. Different semantic grains. EDM enables specialization of classes and properties. This allows partners to define extensions answering the needs of specific communities. Example: extension in the DM2E project (Digital Manuscripts to Europeana), http://onto.dm2e.eu/schemas/dm2e
  • 21. (Some of) what it takes
      • Re-using is easier when one has a cool-headed approach to semantics
      • Flexibility is required: we sometimes changed definitions because we had some semantic overcommitment
  • 22. Semantic interoperability also requires a general effort on quality. We have set up a Data Quality Committee (http://pro.europeana.eu/get-involved/europeana-tech/data-quality-committee) working on recommendations for the community on:
      ○ Mandatory metadata elements for ingestion of EDM data
      ○ Metadata checking and normalization
      ○ Meaningful metadata values (in the context of use)
      ○ Coordination with other quality-related initiatives
  • 23. How to make these data work together? 2. Enriching metadata. Image: Agence de presse Meurisse, "Concours de cycles nautiques sur le lac d'Enghien : Berregent piloté par Austerling", 1914, National Library of France, France, Public Domain
  • 24. Europeana Linked Data Strategy: our lines of work
      • The Europeana Data Model (EDM) offers a base for linking metadata
          ○ Aims at providing data as resources (with URIs!), not only strings
          ○ Enables the development of a multilingual data environment
      • We apply automatic enrichment to link object metadata to reference datasets
      • We encourage data providers to contribute their own links to vocabularies
      Designing a Multilingual Knowledge Graph as a Service for Cultural Heritage
  • 25. Thumbnail; Descriptive Metadata; Link to data provider; Rights
  • 26. Links to contextual entities
  • 27. Warning: multilingual enrichment is not easy. "Poisonous India or the Importance of a Semantic and Multilingual Enrichment Strategy", Marlies Olensky, Juliane Stiller, Evelyn Dröge, MTSR 2012, http://link.springer.com/chapter/10.1007%2F978-3-642-35233-1_25
  • 28. Building a network of contextual information. Europeana grows a "Semantic Layer" linking to contextual resources (e.g. concepts, persons, places). Diagram by Stefan Gradmann
  • 29. Contextual entities in EDM: representing (real-world) entities related to an object as fully fledged resources, not just strings
      ● edm:Agent: foaf:name, skos:altLabel, rdaGr2:biographicalInformation, rdaGr2:dateOfBirth
      ● skos:Concept: skos:prefLabel, skos:altLabel, skos:broader, skos:related, skos:definition…
      ● edm:TimeSpan: skos:prefLabel, dcterms:isPartOf, edm:begin, edm:end…
      ● edm:Place: wgs84_pos:lat, wgs84_pos:long, skos:prefLabel, skos:note, dcterms:isPartOf…
  • 30. Example: a concept from a specialized thesaurus (MIMO). Image: Clavecin, Bartolomeo Cristofori, Cité de la Musique, MIMO - Musical Instruments Museums Online | CC BY-NC-SA
  • 31. Example: an AAT concept in EDM
      ● edm:ProvidedCHO "Hourglass" (urn:imss:instrument:401058), whose dc:type points to the concept below
      ● skos:Concept http://vocab.getty.edu/aat/300198626
          • skos:prefLabel: hourglasses@en, uurglazen@nl, reloj de las horas@es
          • skos:broader: http://vocab.getty.edu/aat/300206197 (= sandglasses)
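
To make the benefit of linking to a concept rather than a string concrete, here is a minimal sketch using plain Python dictionaries. The dictionary layout and the `display_label` helper are assumptions made for illustration, not EDM serialization or Europeana code. The identifiers and labels are the ones on the slide, with the AAT URI read as http://vocab.getty.edu/aat/300198626.

```python
# Illustrative in-memory representation (assumed layout, not an EDM serialization).
AAT_HOURGLASSES = {
    "id": "http://vocab.getty.edu/aat/300198626",
    "type": "skos:Concept",
    "prefLabel": {
        "en": "hourglasses",
        "nl": "uurglazen",
        "es": "reloj de las horas",
    },
    "broader": ["http://vocab.getty.edu/aat/300206197"],  # sandglasses
}

PROVIDED_CHO = {
    "id": "urn:imss:instrument:401058",
    "type": "edm:ProvidedCHO",
    "title": "Hourglass",
    # The object points to the concept URI instead of carrying a plain string.
    "dc:type": AAT_HOURGLASSES["id"],
}


def display_label(concept, preferred_langs, fallback="en"):
    """Return the first available preferred label, trying the user's languages first."""
    labels = concept.get("prefLabel", {})
    for lang in list(preferred_langs) + [fallback]:
        if lang in labels:
            return labels[lang]
    # No usable label: fall back to the concept identifier.
    return concept["id"]


print(display_label(AAT_HOURGLASSES, ["nl"]))  # label for a Dutch-speaking user
print(display_label(AAT_HOURGLASSES, ["de"]))  # no German label, falls back to English
```

Because the object only stores the concept URI, every label the vocabulary provides becomes available for display and search without touching the object record.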
  • 32. Europeana Linked Data Strategy: LOD vocabularies currently recognized by Europeana in providers' metadata (vocabulary: URL)
      ● MIMO Concepts: http://www.mimo-db.eu/
      ● MIMO Instrument makers: http://www.mimo-db.eu/
      ● The Getty - Art & Architecture Thesaurus (AAT): http://vocab.getty.edu/
      ● The Getty - Union List of Artist Names (ULAN): http://vocab.getty.edu/
      ● Virtual International Authority File (VIAF): http://viaf.org/viaf/
      ● Geonames: http://sws.geonames.org/
      ● IconClass: http://iconclass.org/
      ● Gemeinsame Normdatei (GND): http://d-nb.info/gnd
      ● Israel Museum Jerusalem Concepts: http://www.imj.org.il/imagine/thesaurus/objects/
      ● Partage Plus concepts: http://partage.vocnet.org/
      ● data.europeana.eu WWI Concepts from Library of Congress Subject Headings (LCSH): http://data.europeana.eu/concept/loc
      ● Europeana Sounds Genres: http://data.europeana.eu/concept/soundgenres/
      ● EAGLE Material & Object Type: http://www.eagle-network.eu/voc/
      ● DISMARC Formats & Genres: http://purl.org/dismarc/ns/
      ● UDC: http://udcdata.info/rdf/
      ● UNESCO Thesaurus: http://vocabularies.unesco.org/thesaurus/
  • 33. Europeana Linked Data Strategy: our lines of work
      • The Europeana Data Model (EDM) offers a base for linking metadata
      • We apply automatic enrichment to link object metadata to reference datasets
      • We encourage data providers to contribute their own links to vocabularies
      • We encourage alignment activities between domain vocabularies
  • 34. Encouraging (semi-)automatic vocabulary alignment: http://cultuurlink.beeldengeluid.nl
  • 35. The Europeana Entity Collection and API. Image: Anonymous, "Arrival of a Portuguese ship", 1660 - 1625, Rijksmuseum, Netherlands, Public Domain
  • 36. Europeana Linked Data Strategy: a strategy for entities. We are building an "Entity Collection":
      • A service that acts as a centralized point of reference and access to data about contextual entities: places, agents (persons and organizations), concepts...
      • Caching and curating data from the wider Linked Open Data cloud
      • A sort of Europeana "knowledge graph" with an API
      • A service that can be re-used by everyone in our community
  • 37. Use cases for the Entity Collection (1/2): improve user experience on Europeana services
      ● Findability: users can search with and for people, places and subjects, not only objects. In many more languages, and with less ambiguity
      ● Contextualization: users see contextual information about cultural heritage objects. Entity Pages group and present all assertions about an entity
      ● Exploration: browsing along relationships between objects and entities and between entities
      Shown on the slide: semantic auto-completion; Entity Pages; entity-based facets; Europeana Food & Drink Project
  • 38. Use cases for the Entity Collection (2/2)
      ● Crowdsourcing: objects can be annotated with references to entities of their context
      ● Automatic enrichment of providers' metadata: a controlled vocabulary to help recognize references to entities
      ● Republication for reuse: entities can be republished as an open source to the community
      Shown on the slide: semantic and metadata annotations; Pundit Annotation Client from Digital Manuscripts to Europeana (DM2E)
  • 39. Data currently in the Entity Collection: mostly corresponding to a selection made for Europeana's Semantic Enrichment
      ● Places: a subset of Geonames, corresponding to places which are part of European countries and of some specific feature classes
      ● Agents: a subset of DBpedia corresponding to most of the instances of dbp:Artist with some exceptions, and integrated from 49 DBpedia language editions
      ● Concepts: a subset of DBpedia corresponding to a selection of concepts matching the needs from Europeana Collections (e.g., WWI battles); Europeana Sounds music genres (obtained from Wikidata); Photo Consortium's photography vocabulary
      ● Organizations: extracted from Europeana's CRM and aligned to Wikidata when possible
      Resource counts as listed on the slide: 216,302 resources; 1,572 resources; 165,005 resources; 1,077 resources
  • 40. The Entity Collection: contribution to multilingual coverage. [Chart: entities effectively used to enrich Europeana objects; entities present in the Entity Collection]
  • 41. Selecting data sources: an intellectual effort by data experts, leveraging the following criteria:
      • Availability and access: open license, published on the web as linked data
      • Granularity, size and coverage: multilingual data, helping to answer key user needs for Europeana's CH collections. Too generic or large datasets can create too much ambiguity for the simple processes we have (e.g. enrichment)
      • Quality: intrinsic aspects like correctness of representation (data structures)
      • Connectivity: good data sources are well-connected internally and externally to other datasets
  • 42. The Entity Collection and API: DBpedia resource for “Mozart” in our data
      • Coreference links to 6 other datasets (e.g. Freebase, Wikidata)
      • Inter-linking information
      • Preferred labels for 48 languages
  • 43. Entity API - suggest method: /entities/suggest.json?text=neo
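
As a hedged illustration of how a client might build a call to the suggest method, the sketch below constructs the request URL with the standard library. Only the path `/entities/suggest.json` and the `text` parameter come from the slide. The host `api.example.org` is a placeholder, the `suggest_url` helper is hypothetical, and any other parameters the service may require (for instance an API key) are not shown on the slide and are not guessed here. No network request is made.

```python
from urllib.parse import urlencode

# Placeholder host: the slide gives only the path and the `text` parameter.
BASE_URL = "https://api.example.org"


def suggest_url(text, base_url=BASE_URL, **extra_params):
    """Build a URL for the suggest method; extra_params are passed through as-is."""
    params = {"text": text, **extra_params}
    # urlencode takes care of escaping spaces and non-ASCII characters.
    return f"{base_url.rstrip('/')}/entities/suggest.json?{urlencode(params)}"


print(suggest_url("neo"))
print(suggest_url("van gogh"))
```

Encoding the query through `urlencode` matters for auto-completion input, which often contains spaces or accented characters typed by users in many languages.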
  • 44. The Europeana Entity Collection: where we stand
      • We've made enough progress to release a first version of the Entity Collection and its API, used in Europeana's production services.
      • But there are still challenges and decisions to ensure consistency and relevance over time:
          ○ Expand data coverage with new data sources for, e.g., events
          ○ Employ the EC to better enrich Europeana object metadata
          ○ Enhance discoverability, especially for search engines, e.g. via Schema.org publication
  • 45. antoine.isaac@europeana.eu @antoine_isaac