Linking subject labels in Cultural Heritage
Metadata to MIMO vocabulary using CultuurLink
Hugo Manguinhas, Valentine Charles, Antoine Isaac| Europeana Foundation
Tom Miles| The British Library
Aude Lima| The Centre de Recherche en Ethnomusicologie
Ariane Néroulidis, Véronique Ginouvès| The Maison Méditerranéenne des Sciences de l'Homme
Dimitra Atsidis, Maarten Brinkerink| Netherlands Institute for Sound and Vision
Michiel Hildebrand| Spinque B.V.
Sergiu Gordea| Austrian Institute of Technology
What is Europeana?
NKOS16 - Linking subject labels in Cultural Heritage
Metadata to MIMO vocabulary using CultuurLink
CC BY-SA
We aggregate metadata:
• From all EU countries
• ~3,500 galleries, libraries, archives and museums
• More than 52M objects
• In about 50 languages
• A huge number of references to places, agents, concepts, time
Europeana aggregation infrastructure
Europeana| CC BY-SA
The Platform for Europe’s Digital Cultural Heritage
The Europeana Sounds project
Europeana Sounds aims to increase the amount of audio content available via Europeana
• also improving geographical and thematic coverage
Apart from aggregation, it improves discovery and use of audio content by enriching metadata through innovative methods
The scope of the experiment
• Evaluate the use of a semi-automatic tool like CultuurLink
for a concrete vocabulary alignment case, and
• Assess the coverage of the MIMO vocabulary for enriching
Europeana Sounds datasets
About the MIMO vocabulary
• A multilingual controlled vocabulary of musical instruments
• Developed within the Musical Instrument Museums Online project, which gathered some of Europe's most important musical instrument museums
Why MIMO?
• A significant part of the subjects present in Europeana Sounds collections refers to musical instruments
• Good coverage
  • gathers a total of 3,121 musical instruments, plus classifications used by professionals such as Hornbostel-Sachs (641)
  • contains terms in 8 different languages (English, French, Polish, Catalan, Dutch, Italian, Swedish, German)
• Technically available on the Web
  • follows Linked Data best practices and recipes (RDF, SKOS, content negotiation)
• Openly available (CC0)
Overview of MIMO language coverage
What is CultuurLink?
• Semi-automatic Vocabulary Alignment Tool
• Successor to EuropeanaConnect's Amalgame
http://cultuurlink.beeldengeluid.nl
Why CultuurLink?
• Freely available
  • as an online open service that any user can use
• Users can design and experiment with different alignment strategies
  • helps with the task of discovering new alignments between two vocabularies
  • users can define and combine strategies that apply different techniques or parameterizations
• Manual control
  • alignments are identified automatically, but strategies are designed by users
  • users can decide which alignments are correct and can assign a specific meaning (e.g. skos:exactMatch, skos:related, skos:broadMatch)
• User friendly
  • allows non-technical users to easily perform fairly complex tasks
The participants and their collections
• The British Library (BL) participated with 3 collections:
  • a selection of Asian instruments (1,099 records) from the “Colin Huehns Asia Collection”
  • a selection from the “Peter Cooke Uganda Collection” (1,312 records)
  • the “Keith Summers English Folk Music Collection” (1,326 records)
• The Centre de Recherche en Ethnomusicologie (CREM)
  • participated with a test collection of 36 records published on the CD “Musical Instruments of the World”
• The Maison Méditerranéenne des Sciences de l'Homme (MMSH)
  • participated with a collection of 25 records about folk music
• The Netherlands Institute for Sound and Vision (NISV)
  • participated with a collection of 6,608 records of commercial 78 rpm records (Handelsplaten) from different genres such as light music, classical music and opera
Alignment of vocabulary terms
We decided to focus on the vocabulary terms within the subject fields of the metadata, as opposed to aligning the full vocabulary used by the providing institution, because:
• the full vocabulary was not available for use outside the organization and/or in a data structure that suits a vocabulary alignment tool
• we preferred to report on alignments for the subjects used in the source datasets and not on all possible subjects
What have we done?
For each collection we:
• Extracted a SKOS vocabulary out of the subject terms found in the object metadata
• Set up a permanent session on CultuurLink
• Asked providers to perform the alignments
• Collected and assessed the alignments and feedback obtained from the Data Providers
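The extraction step above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script used in the experiment: the input shape (pairs of record URI and subject terms), the function name `build_skos_vocabulary` and the example URIs are all hypothetical. It mirrors the structure of the concept definitions in this deck: one skos:Concept per distinct subject term, with the record URIs kept as skos:note values.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import quote

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def build_skos_vocabulary(records, base_uri):
    """Return RDF/XML text with one skos:Concept per distinct subject term.

    `records` is an iterable of (record_uri, [subject terms]) pairs; this
    input shape is an assumption made for the sketch.
    """
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)

    # Group the record URIs under each distinct (trimmed) subject term.
    records_by_term = defaultdict(list)
    for record_uri, subjects in records:
        for term in subjects:
            term = term.strip()
            if term:
                records_by_term[term].append(record_uri)

    scheme_uri = f"{base_uri}#ConceptScheme"
    root = ET.Element(f"{{{RDF}}}RDF")
    ET.SubElement(root, f"{{{SKOS}}}ConceptScheme", {f"{{{RDF}}}about": scheme_uri})

    for term in sorted(records_by_term):
        # Percent-encode the term so that spaces or accents yield a usable URI.
        concept_uri = f"{base_uri}#{quote(term)}"
        concept = ET.SubElement(root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": concept_uri})
        ET.SubElement(concept, f"{{{SKOS}}}inScheme", {f"{{{RDF}}}resource": scheme_uri})
        ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = term
        for record_uri in records_by_term[term]:
            ET.SubElement(concept, f"{{{SKOS}}}note", {f"{{{RDF}}}resource": record_uri})

    return ET.tostring(root, encoding="unicode")


# Hypothetical sample input: two records sharing the subject "grelot".
sample_records = [
    ("http://example.org/record/1", ["grelot", "violon"]),
    ("http://example.org/record/2", ["grelot "]),
]
vocabulary_xml = build_skos_vocabulary(sample_records, "http://example.org/concepts")
```

Grouping by term before writing keeps the vocabulary limited to subjects that actually occur in the dataset, which is the scope chosen for the experiment.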
Concept definition obtained from the MMSH dataset
<skos:ConceptScheme rdf:about="http://www.europeanasounds.eu/data/mmsh/concepts#ConceptScheme">
</skos:ConceptScheme>
<skos:Concept rdf:ID="grelot">
  <skos:inScheme rdf:resource="#ConceptScheme"/>
  <skos:prefLabel>grelot</skos:prefLabel>
  <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9800"/>
  <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9775"/>
  <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9801"/>
  <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9768"/>
  <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9798"/>
  <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9788"/>
</skos:Concept>
URIs of the records are kept as skos:notes
Text found in dc:subject
The alignments obtained from CultuurLink
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:skos="http://www.w3.org/2004/02/skos/core#" >
<rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#guitare">
<skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/3237"/>
<owl:differentFrom rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/5137"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#flûte">
<skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/3955"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#grelot">
<skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/2873"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#ban">
<owl:differentFrom rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/2498"/>
</rdf:Description>
<rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#violon">
<skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/3573"/>
</rdf:Description>
</rdf:RDF>
MIMO concept
Subject term
Alignments identified by the data provider for this subject
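To assess an export like the one above, the alignments can be read back and tallied per relation. The following is a minimal sketch using only the Python standard library; the function name `read_alignments` is hypothetical, and the embedded sample repeats two of the descriptions shown on this slide.

```python
import xml.etree.ElementTree as ET
from collections import Counter

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Two descriptions copied from the alignment export shown on this slide.
ALIGNMENT_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#guitare">
    <skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/3237"/>
    <owl:differentFrom rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/5137"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#ban">
    <owl:differentFrom rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/2498"/>
  </rdf:Description>
</rdf:RDF>"""


def read_alignments(rdf_xml):
    """Return (subject term URI, relation local name, MIMO concept URI) triples."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for prop in description:
            # ElementTree tags look like "{namespace}localname".
            _, _, local_name = prop.tag[1:].partition("}")
            triples.append((subject, local_name, prop.get(f"{{{RDF}}}resource")))
    return triples


alignments = read_alignments(ALIGNMENT_RDF)
relation_counts = Counter(relation for _, relation, _ in alignments)
```

Counting owl:differentFrom separately from skos:exactMatch makes it possible to report rejected candidates next to accepted alignments.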
The quantitative results of the experiment
Findings identified when aligning with CultuurLink
(1/2)
• Exact string matching on preferred labels is sufficient to align ~50% of the subject terms
• Incorrect alignments were also identified, caused by polysemy
  • e.g. “ban”, or “zang”, which means singing or song but matched the instrument “zang”, a sort of cymbals or clapper bells
• Matching against labels in any language turned out to be very successful at finding matches based on vernacular terms
  • but it also increased the number of irrelevant alignments
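The trade-off described above can be illustrated with a small sketch of exact label matching, with and without a language restriction. This is not CultuurLink's implementation; the concept identifiers, the label table and its language tags are invented for the example.

```python
# Hypothetical target vocabulary: concept id -> {language tag: label}.
# Identifiers and language tags are illustrative only.
TARGET_LABELS = {
    "ex:violin": {"en": "violin", "fr": "violon", "nl": "viool"},
    "ex:zang": {"en": "zang"},
}


def exact_matches(term, target_labels, languages=None):
    """Return ids of concepts having a label equal to `term` (case-insensitive).

    If `languages` is given, only labels in those languages are compared;
    otherwise labels in any language are considered.
    """
    needle = term.strip().casefold()
    hits = []
    for concept_id, labels in target_labels.items():
        for language, label in labels.items():
            if languages is not None and language not in languages:
                continue
            if label.casefold() == needle:
                hits.append(concept_id)
                break  # one matching label per concept is enough
    return hits


# A French subject term is only found when French labels are allowed.
french_only = exact_matches("Violon", TARGET_LABELS, languages={"fr"})
english_only = exact_matches("Violon", TARGET_LABELS, languages={"en"})

# Matching in any language also surfaces the polysemous "zang", which a
# human reviewer still has to accept or reject.
any_language = exact_matches("zang", TARGET_LABELS)
```

String equality cannot tell the Dutch subject “zang” (singing) from the instrument of the same name, which is why the manual review step remains necessary.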
Findings identified when aligning with CultuurLink
(2/2)
More elaborate strategies were found very helpful for discovering more alignments:
• using a less restrictive string matching function such as “contains” or “startsWith”, to surface “broader” or “narrower” relations
• activating stemming
  • e.g. “Trompet” was aligned with “Trompetten” and “Accordeon” with “Accordeons”, both in Dutch
• applying fuzzy matching, with a maximum distance of both 1 and 2
• the “NOT A” functionality was found crucial for iteratively refining the strategy
Using such strategies also revealed some quality issues in the source metadata, such as:
• misspellings and unrecognizable/doubtful terms
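As a rough illustration of the looser strategies listed above, the sketch below combines exact matching, fuzzy matching with a maximum edit distance, and a “startsWith” test. It assumes Levenshtein distance as the fuzzy measure, and the prefix test stands in for the stemmer (it is not real stemming). The function names are hypothetical and this is not CultuurLink's code.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def candidate_relation(term, label, max_distance=1):
    """Classify how a subject term relates to a target label, or return None."""
    t, l = term.casefold(), label.casefold()
    if t == l:
        return "exact"
    if levenshtein(t, l) <= max_distance:
        return "fuzzy"
    if l.startswith(t) or t.startswith(l):
        return "startsWith"
    return None


# The Dutch examples from the slide.
accordeon = candidate_relation("Accordeon", "Accordeons", max_distance=1)
trompet = candidate_relation("Trompet", "Trompetten", max_distance=2)
```

“Accordeon”/“Accordeons” differ by one edit and are caught by fuzzy matching, whereas “Trompet”/“Trompetten” differ by three and are caught here only by the prefix test. Raising the maximum distance finds more candidates, including misspellings, but each candidate still needs review.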
What about MIMO?
In general, the Data Providers found that MIMO offers:
• Good coverage of musical instruments
• Good language coverage compared to their local vocabulary
• A simplified hierarchy that makes it understandable and practical for non-musicologists
• Updated families covering both electronic instruments and tools that are present in contemporary music
• Helpful concept definitions
However, it:
• Lacks concepts to describe voice (texture, mechanism, etc.)
  • but may be enriched by the DOREMUS project with vocal terms from the IAML mediums of performance thesaurus
• Is centred on the structure of occidental classical music
Quick demo
http://cultuurlink.beeldengeluid.nl
Thank you!

Contenu connexe

Tendances

LIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers ProjectLIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers ProjectEuropeana Newspapers
 
Europeana Network Association Members Council Meeting, Copenhagen by Max Kaiser
Europeana Network Association Members Council Meeting, Copenhagen by Max KaiserEuropeana Network Association Members Council Meeting, Copenhagen by Max Kaiser
Europeana Network Association Members Council Meeting, Copenhagen by Max KaiserEuropeana
 
Eun lre brussels_winer20100616
Eun lre brussels_winer20100616Eun lre brussels_winer20100616
Eun lre brussels_winer20100616Dov Winer
 
Alastair Dunning, Europeana Cloud: The Project and the Challenges of Assessin...
Alastair Dunning, Europeana Cloud: The Project and the Challenges of Assessin...Alastair Dunning, Europeana Cloud: The Project and the Challenges of Assessin...
Alastair Dunning, Europeana Cloud: The Project and the Challenges of Assessin...The European Library
 
IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspap...
IMPACT Final Event 26-06-2012  - Use of IMPACT tools in the Europeana Newspap...IMPACT Final Event 26-06-2012  - Use of IMPACT tools in the Europeana Newspap...
IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspap...IMPACT Centre of Competence
 
The Europeana Newspapers Presentation - Cyberspace 2012
The Europeana Newspapers Presentation - Cyberspace 2012The Europeana Newspapers Presentation - Cyberspace 2012
The Europeana Newspapers Presentation - Cyberspace 2012Europeana Newspapers
 
The Europeana Newspapers Project at IMPACT Final Event
The Europeana Newspapers Project at IMPACT Final EventThe Europeana Newspapers Project at IMPACT Final Event
The Europeana Newspapers Project at IMPACT Final EventEuropeana Newspapers
 
LDBC 19 November 2013
LDBC 19 November 2013  LDBC 19 November 2013
LDBC 19 November 2013 Europeana
 
Ariadne Booklet: The Way Forward to Digital Archaeology in Europe
Ariadne Booklet: The Way Forward to Digital Archaeology in EuropeAriadne Booklet: The Way Forward to Digital Archaeology in Europe
Ariadne Booklet: The Way Forward to Digital Archaeology in Europeariadnenetwork
 
The European(a) Newspapers Project
The European(a) Newspapers ProjectThe European(a) Newspapers Project
The European(a) Newspapers ProjectEuropeana Newspapers
 

Tendances (13)

LIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers ProjectLIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers Project
 
Europeana Network Association Members Council Meeting, Copenhagen by Max Kaiser
Europeana Network Association Members Council Meeting, Copenhagen by Max KaiserEuropeana Network Association Members Council Meeting, Copenhagen by Max Kaiser
Europeana Network Association Members Council Meeting, Copenhagen by Max Kaiser
 
Eun lre brussels_winer20100616
Eun lre brussels_winer20100616Eun lre brussels_winer20100616
Eun lre brussels_winer20100616
 
Alastair Dunning, Europeana Cloud: The Project and the Challenges of Assessin...
Alastair Dunning, Europeana Cloud: The Project and the Challenges of Assessin...Alastair Dunning, Europeana Cloud: The Project and the Challenges of Assessin...
Alastair Dunning, Europeana Cloud: The Project and the Challenges of Assessin...
 
IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspap...
IMPACT Final Event 26-06-2012  - Use of IMPACT tools in the Europeana Newspap...IMPACT Final Event 26-06-2012  - Use of IMPACT tools in the Europeana Newspap...
IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspap...
 
The Europeana Newspapers Presentation - Cyberspace 2012
The Europeana Newspapers Presentation - Cyberspace 2012The Europeana Newspapers Presentation - Cyberspace 2012
The Europeana Newspapers Presentation - Cyberspace 2012
 
The Europeana Newspapers Project at IMPACT Final Event
The Europeana Newspapers Project at IMPACT Final EventThe Europeana Newspapers Project at IMPACT Final Event
The Europeana Newspapers Project at IMPACT Final Event
 
Europeana datainaction nov2012
Europeana datainaction nov2012Europeana datainaction nov2012
Europeana datainaction nov2012
 
LDBC 19 November 2013
LDBC 19 November 2013  LDBC 19 November 2013
LDBC 19 November 2013
 
co:op-READ-Convention Marburg - Günter Mühlberger
co:op-READ-Convention Marburg - Günter Mühlbergerco:op-READ-Convention Marburg - Günter Mühlberger
co:op-READ-Convention Marburg - Günter Mühlberger
 
Ariadne Booklet: The Way Forward to Digital Archaeology in Europe
Ariadne Booklet: The Way Forward to Digital Archaeology in EuropeAriadne Booklet: The Way Forward to Digital Archaeology in Europe
Ariadne Booklet: The Way Forward to Digital Archaeology in Europe
 
The European(a) Newspapers Project
The European(a) Newspapers ProjectThe European(a) Newspapers Project
The European(a) Newspapers Project
 
eBooks & more
eBooks & moreeBooks & more
eBooks & more
 

Similaire à Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using CultuurLink

Europeana as a Linked Data (Quality) case
Europeana as a Linked Data (Quality) caseEuropeana as a Linked Data (Quality) case
Europeana as a Linked Data (Quality) caseAntoine Isaac
 
Sharing Cultural Heritage Online with LoCloud: workshop
Sharing Cultural Heritage Online with LoCloud: workshopSharing Cultural Heritage Online with LoCloud: workshop
Sharing Cultural Heritage Online with LoCloud: workshoplocloud
 
When Semantics support Multilingual Access to Digital Cultural Heritage - the...
When Semantics support Multilingual Access to Digital Cultural Heritage - the...When Semantics support Multilingual Access to Digital Cultural Heritage - the...
When Semantics support Multilingual Access to Digital Cultural Heritage - the...Valentine Charles
 
Copyright challenges and policy choices in European heritage projects Tools, ...
Copyright challenges and policy choices in European heritage projects Tools, ...Copyright challenges and policy choices in European heritage projects Tools, ...
Copyright challenges and policy choices in European heritage projects Tools, ...Phonothèque MMSH
 
Building an ecosystem of networked references
Building an ecosystem of networked referencesBuilding an ecosystem of networked references
Building an ecosystem of networked referencesHugo Manguinhas
 
Multilingual challenges and ongoing work to tackle them at Europeana
Multilingual challenges and ongoing work to tackle them at EuropeanaMultilingual challenges and ongoing work to tackle them at Europeana
Multilingual challenges and ongoing work to tackle them at EuropeanaAntoine Isaac
 
Data scale and diversity issues at Europeana
Data scale and diversity issues at EuropeanaData scale and diversity issues at Europeana
Data scale and diversity issues at EuropeanaAntoine Isaac
 
Europeana creative sound archives and social networks @ British and Irish S...
Europeana creative  sound archives and social networks  @ British and Irish S...Europeana creative  sound archives and social networks  @ British and Irish S...
Europeana creative sound archives and social networks @ British and Irish S...Lizzy Komen
 
Introduction to LoCloud
Introduction to LoCloudIntroduction to LoCloud
Introduction to LoCloudlocloud
 
G00 holy locloud_introduction
G00 holy locloud_introductionG00 holy locloud_introduction
G00 holy locloud_introductionevaminerva
 
G00 holy locloud_introduction
G00 holy locloud_introductionG00 holy locloud_introduction
G00 holy locloud_introductionevaminerva
 
Data modelling at Europeana and DM2E - SMW13
Data modelling at Europeana and DM2E - SMW13Data modelling at Europeana and DM2E - SMW13
Data modelling at Europeana and DM2E - SMW13Antoine Isaac
 
Links, languages and semantics: linked data approaches in The European Libra...
Links, languages and semantics: linked data approaches in The European Libra...Links, languages and semantics: linked data approaches in The European Libra...
Links, languages and semantics: linked data approaches in The European Libra...Valentine Charles
 
Linked Data for EuropeanaCultural Heritage: the Europeana approach
Linked Data for EuropeanaCultural Heritage: the Europeana approachLinked Data for EuropeanaCultural Heritage: the Europeana approach
Linked Data for EuropeanaCultural Heritage: the Europeana approachValentine Charles
 
Digital Cultural Heritage: Experiences from British Library
Digital Cultural Heritage: Experiences from British LibraryDigital Cultural Heritage: Experiences from British Library
Digital Cultural Heritage: Experiences from British LibraryNora McGregor
 

Similaire à Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using CultuurLink (20)

Europeana as a Linked Data (Quality) case
Europeana as a Linked Data (Quality) caseEuropeana as a Linked Data (Quality) case
Europeana as a Linked Data (Quality) case
 
Sharing Cultural Heritage Online with LoCloud: workshop
Sharing Cultural Heritage Online with LoCloud: workshopSharing Cultural Heritage Online with LoCloud: workshop
Sharing Cultural Heritage Online with LoCloud: workshop
 
When Semantics support Multilingual Access to Digital Cultural Heritage - the...
When Semantics support Multilingual Access to Digital Cultural Heritage - the...When Semantics support Multilingual Access to Digital Cultural Heritage - the...
When Semantics support Multilingual Access to Digital Cultural Heritage - the...
 
Copyright challenges and policy choices in European heritage projects Tools, ...
Copyright challenges and policy choices in European heritage projects Tools, ...Copyright challenges and policy choices in European heritage projects Tools, ...
Copyright challenges and policy choices in European heritage projects Tools, ...
 
Europeana and Researchers
Europeana and ResearchersEuropeana and Researchers
Europeana and Researchers
 
Museums and Europeana
Museums and EuropeanaMuseums and Europeana
Museums and Europeana
 
Building an ecosystem of networked references
Building an ecosystem of networked referencesBuilding an ecosystem of networked references
Building an ecosystem of networked references
 
Multilingual challenges and ongoing work to tackle them at Europeana
Multilingual challenges and ongoing work to tackle them at EuropeanaMultilingual challenges and ongoing work to tackle them at Europeana
Multilingual challenges and ongoing work to tackle them at Europeana
 
Digital musicology by Amelie Roper
Digital musicology by Amelie Roper Digital musicology by Amelie Roper
Digital musicology by Amelie Roper
 
Data scale and diversity issues at Europeana
Data scale and diversity issues at EuropeanaData scale and diversity issues at Europeana
Data scale and diversity issues at Europeana
 
Europeana creative sound archives and social networks @ British and Irish S...
Europeana creative  sound archives and social networks  @ British and Irish S...Europeana creative  sound archives and social networks  @ British and Irish S...
Europeana creative sound archives and social networks @ British and Irish S...
 
Itapa2010 custodea
Itapa2010 custodeaItapa2010 custodea
Itapa2010 custodea
 
Introduction to LoCloud
Introduction to LoCloudIntroduction to LoCloud
Introduction to LoCloud
 
G00 holy locloud_introduction
G00 holy locloud_introductionG00 holy locloud_introduction
G00 holy locloud_introduction
 
G00 holy locloud_introduction
G00 holy locloud_introductionG00 holy locloud_introduction
G00 holy locloud_introduction
 
Data modelling at Europeana and DM2E - SMW13
Data modelling at Europeana and DM2E - SMW13Data modelling at Europeana and DM2E - SMW13
Data modelling at Europeana and DM2E - SMW13
 
Links, languages and semantics: linked data approaches in The European Libra...
Links, languages and semantics: linked data approaches in The European Libra...Links, languages and semantics: linked data approaches in The European Libra...
Links, languages and semantics: linked data approaches in The European Libra...
 
Linked Data for EuropeanaCultural Heritage: the Europeana approach
Linked Data for EuropeanaCultural Heritage: the Europeana approachLinked Data for EuropeanaCultural Heritage: the Europeana approach
Linked Data for EuropeanaCultural Heritage: the Europeana approach
 
Digital Cultural Heritage: Experiences from British Library
Digital Cultural Heritage: Experiences from British LibraryDigital Cultural Heritage: Experiences from British Library
Digital Cultural Heritage: Experiences from British Library
 
Digital Cultural Heritage: Experiences from British Library
Digital Cultural Heritage: Experiences from British LibraryDigital Cultural Heritage: Experiences from British Library
Digital Cultural Heritage: Experiences from British Library
 

Dernier

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 

Dernier (20)

Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 

Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using CultuurLink

  • 1. Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using CultuurLink
      Hugo Manguinhas, Valentine Charles, Antoine Isaac | Europeana Foundation
      Tom Miles | The British Library
      Aude Lima | The Centre de Recherche en Ethnomusicologie
      Ariane Néroulidis, Véronique Ginouvès | The Maison Méditerranéenne des Sciences de l'Homme
      Dimitra Atsidis, Maarten Brinkerink | Netherlands Institute for Sound and Vision
      Michiel Hildebrand | Spinque B.V.
      Sergiu Gordea | Austrian Institute of Technology
  • 2. What is Europeana?
      NKOS16 - Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using CultuurLink CC BY-SA
      The Platform for Europe’s Digital Cultural Heritage
      We aggregate metadata:
      • From all EU countries
      • From ~3,500 galleries, libraries, archives and museums
      • For more than 52M objects
      • In about 50 languages
      • With a huge number of references to places, agents, concepts and time
      Figure: Europeana aggregation infrastructure (Europeana, CC BY-SA)
  • 3. The Europeana Sounds project
      Europeana Sounds aims to increase the amount of audio content available via Europeana
      • while also improving geographical and thematic coverage
      Apart from aggregation, it improves the discovery and use of audio content by enriching metadata through innovative methods
  • 4. The scope of the experiment
      • Evaluate the use of a semi-automatic tool like CultuurLink for a concrete vocabulary alignment case, and
      • Assess the coverage of the MIMO vocabulary for enriching Europeana Sounds datasets
  • 5. About the MIMO vocabulary
      • A multilingual controlled vocabulary of musical instruments
      • Developed within the Musical Instruments Museums Online project, which gathered some of Europe's most important musical instrument museums
  • 6. Why MIMO?
      • A significant part of the subjects present in Europeana Sounds collections refers to musical instruments
      • Good coverage
          • gathers a total of 3121 musical instruments, plus classifications used by professionals such as Hornbostel-Sachs (641)
          • contains terms in 8 different languages (English, French, Polish, Catalan, Dutch, Italian, Swedish, German)
      • Technically available on the Web
          • Follows the Linked Data best practices and recipes (RDF, SKOS, content negotiation)
          • Openly available (CC0)
  • 7. Overview of MIMO language coverage
  • 8. What is CultuurLink?
      • Semi-automatic vocabulary alignment tool
      • Successor to EuropeanaConnect's Amalgame
      http://cultuurlink.beeldengeluid.nl
  • 9. Why CultuurLink?
      • Freely available
          • as an open online service that any user can use
      • Users can design and experiment with different alignment strategies
          • this helps with the task of discovering new alignments between two vocabularies
          • users can define and combine strategies that apply different techniques or parameterizations
      • Manual control
          • alignments are identified automatically, but the strategies are designed by users
          • users decide which alignments are correct and can assign a specific meaning (e.g. skos:exactMatch, skos:related, skos:broadMatch)
      • User friendly
          • allows users who are not technically savvy to perform fairly complex tasks easily
  • 10. The participants and their collections
      • The British Library (BL) participated with 3 collections:
          • a selection of Asian instruments (1,099 records) from the “Colin Huehns Asia Collection”
          • a selection from the “Peter Cooke Uganda Collection” (1,312 records)
          • the “Keith Summers English Folk Music Collection” (1,326 records)
      • The Centre de Recherche en Ethnomusicologie (CREM)
          • participated with a test collection of 36 records published on the CD “Musical Instruments of the World”
      • The Maison Méditerranéenne des Sciences de l'Homme (MMSH)
          • participated with a collection of 25 records about folk music
      • The Netherlands Institute for Sound and Vision (NISV)
          • participated with a collection of 6,608 records containing commercial 78 rpm records (Handelsplaten) from different genres such as light music, classical music and opera
  • 11. Alignment of vocabulary terms
      We decided to focus on the vocabulary terms within the subject fields of the metadata, as opposed to aligning the full vocabulary used by the providing institution, because:
      • the full vocabularies are not available for use outside the organization and/or not in a data structure that suits a vocabulary alignment tool
      • we preferred to report on alignments for the subjects used in the source datasets and not on all possible subjects
  • 12. What have we done?
      For each collection we:
      • Extracted a SKOS vocabulary from the subject terms found in the object metadata
      • Set up a permanent session on CultuurLink
      • Asked providers to perform the alignments
      • Collected and assessed the alignments and feedback obtained from the Data Providers
  • 13. Concept definition obtained from the MMSH dataset
      <skos:ConceptScheme rdf:about="http://www.europeanasounds.eu/data/mmsh/concepts#ConceptScheme">
      </skos:ConceptScheme>
      <skos:Concept rdf:ID="grelot">
        <skos:inScheme rdf:resource="#ConceptScheme"/>
        <skos:prefLabel>grelot</skos:prefLabel>
        <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9800"/>
        <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9775"/>
        <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9801"/>
        <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9768"/>
        <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9798"/>
        <skos:note rdf:resource="http://mint-projects.image.ntua.gr/data/sounds/http://phonotheque.mmsh.huma-num.fr/dyn/portal/index.seam?page=alo&amp;aloId=9788"/>
      </skos:Concept>
      Callouts on the slide:
      • skos:prefLabel: text found in dc:subject
      • skos:note: URIs of the records are kept as skos:notes
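The extraction step on slides 12 and 13 can be sketched roughly as follows. This is a minimal illustration and not the script used in the experiment: the input shape (pairs of record URI and dc:subject strings), the function name `build_concepts`, and the example.org URIs are assumptions made for the example. It groups identical subject strings into one concept and keeps the record URIs as skos:note elements, mirroring the structure shown on the slide.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def build_concepts(records):
    """Group dc:subject strings into SKOS concepts.

    `records` is an iterable of (record_uri, [subject strings]) pairs;
    this input shape is hypothetical, chosen for the example.
    """
    by_term = defaultdict(list)
    for record_uri, subjects in records:
        for term in subjects:
            label = term.strip()
            # Skip empty subjects and avoid listing a record twice.
            if label and record_uri not in by_term[label]:
                by_term[label].append(record_uri)

    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)
    root = ET.Element(f"{{{RDF}}}RDF")
    for label in sorted(by_term):
        # The slide uses the label itself as rdf:ID; labels containing
        # spaces would need escaping, which this sketch does not handle.
        concept = ET.SubElement(root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}ID": label})
        ET.SubElement(concept, f"{{{SKOS}}}inScheme", {f"{{{RDF}}}resource": "#ConceptScheme"})
        ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = label
        for uri in by_term[label]:
            ET.SubElement(concept, f"{{{SKOS}}}note", {f"{{{RDF}}}resource": uri})
    return root


if __name__ == "__main__":
    sample = [
        ("http://example.org/record/1", ["grelot", "violon"]),
        ("http://example.org/record/2", ["grelot"]),
    ]
    print(ET.tostring(build_concepts(sample), encoding="unicode"))
```

Keeping the record URIs on the concept makes it possible to trace an accepted alignment back to the records it should enrich.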
  • 14. The alignments obtained from CultuurLink
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:owl="http://www.w3.org/2002/07/owl#"
               xmlns:skos="http://www.w3.org/2004/02/skos/core#">
        <rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#guitare">
          <skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/3237"/>
          <owl:differentFrom rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/5137"/>
        </rdf:Description>
        <rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#flûte">
          <skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/3955"/>
        </rdf:Description>
        <rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#grelot">
          <skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/2873"/>
        </rdf:Description>
        <rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#ban">
          <owl:differentFrom rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/2498"/>
        </rdf:Description>
        <rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#violon">
          <skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/3573"/>
        </rdf:Description>
      </rdf:RDF>
      Callouts on the slide:
      • Subject term (the rdf:about URI)
      • MIMO concept (the rdf:resource URI)
      • Alignments identified by the data provider for this subject (the nested properties)
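An export like the one on slide 14 can be read back with a few lines of Python. The sketch below uses only the standard library and a shortened copy of the slide's sample. The function name `read_alignments` is made up for the example, and treating owl:differentFrom as a candidate the provider rejected is an assumption: the slides suggest this reading but do not state it.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
OWL = "http://www.w3.org/2002/07/owl#"

# Shortened copy of the sample shown on the slide.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#guitare">
    <skos:exactMatch rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/3237"/>
    <owl:differentFrom rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/5137"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.europeanasounds.eu/data/concepts#ban">
    <owl:differentFrom rdf:resource="http://www.mimo-db.eu/InstrumentsKeywords/2498"/>
  </rdf:Description>
</rdf:RDF>"""


def read_alignments(xml_text):
    """Return (accepted, rejected) alignment lists from an RDF/XML export."""
    accepted, rejected = [], []
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF}}}Description"):
        source = desc.get(f"{{{RDF}}}about")
        for child in desc:
            target = child.get(f"{{{RDF}}}resource")
            if child.tag == f"{{{OWL}}}differentFrom":
                # Assumption: differentFrom marks a rejected candidate.
                rejected.append((source, target))
            elif child.tag.startswith(f"{{{SKOS}}}"):
                # Keep the local name, e.g. "exactMatch" or "broadMatch".
                relation = child.tag.split("}", 1)[1]
                accepted.append((source, relation, target))
    return accepted, rejected


if __name__ == "__main__":
    accepted, rejected = read_alignments(SAMPLE)
    print(len(accepted), "accepted,", len(rejected), "rejected")
```

Separating the two lists gives the counts needed for a coverage figure per collection, under the stated assumption about owl:differentFrom.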
  • 15. The quantitative results of the experiment
  • 16. Findings identified when aligning with CultuurLink (1/2)
      • Applying exact string matching on preferred labels is sufficient to align ~50%
          • Incorrect alignments were also identified, caused by polysemy
          • e.g. “ban”, or “zang”, which means singing or song but matched the instrument “zang”, a sort of cymbals or clapper bells
      • Matching against labels in any language turned out to be very successful at finding matches based on vernacular terms
          • But it also increased the number of irrelevant alignments
  • 17. Findings identified when aligning with CultuurLink (2/2)
      More elaborate strategies were found very helpful for discovering more alignments:
      • using a less restrictive string matching function like “contains” or “startsWith”, to surface “broader” or “narrower” relations
      • activating stemming
          • e.g. “Trompet” was aligned with “Trompetten” and “Accordeon” with “Accordeons”, both in Dutch
      • applying fuzzy matching with a maximum distance of both 1 and 2
      • The “NOT A” functionality was found crucial for iteratively refining the strategy
      Using such strategies also revealed some quality issues in the source metadata, such as:
      • misspellings and unrecognizable or doubtful terms
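The matching strategies named in the findings can be illustrated with a small, self-contained sketch. This is not CultuurLink's implementation: the function names, the case-insensitive comparison, the direction of the “startsWith” and “contains” tests (target label against source term), and the use of Levenshtein distance for fuzzy matching are all assumptions made for the example. Stemming is left out because it needs a language-specific stemmer.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution, free if equal
            ))
        prev = cur
    return prev[-1]


def match(term, labels, strategy="exact", max_distance=1):
    """Return the target labels that match `term` under one strategy."""
    t = term.casefold()
    hits = []
    for label in labels:
        candidate = label.casefold()
        if strategy == "exact":
            ok = candidate == t
        elif strategy == "startsWith":
            ok = candidate.startswith(t)
        elif strategy == "contains":
            ok = t in candidate
        elif strategy == "fuzzy":
            ok = edit_distance(t, candidate) <= max_distance
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        if ok:
            hits.append(label)
    return hits


if __name__ == "__main__":
    # Hypothetical target labels, taken from the examples on the slides.
    labels = ["Trompetten", "Accordeons", "zang"]
    print(match("Trompet", labels, "exact"))
    print(match("Trompet", labels, "startsWith"))
    print(match("Accordeon", labels, "fuzzy", max_distance=1))
```

In this sketch “Trompet” finds nothing under exact matching but reaches “Trompetten” through “startsWith”, and “Accordeon” reaches “Accordeons” at edit distance 1. An exact match on a label such as “zang” still says nothing about meaning, which is why the manual accept and reject step remains necessary.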
  • 18. What about MIMO?
      In general, the Data Providers found that MIMO:
      • Offers good coverage of musical instruments
      • Offers good language coverage compared to their local vocabularies
      • Has a simplified hierarchy that makes it understandable and practical for non-musicologists
      • Includes updated families covering both electronic instruments and tools that are present in contemporary music
      • Provides helpful concept definitions
      However, MIMO:
      • Lacks concepts to describe voice (texture, mechanism, etc.)
          • But it may be enriched by the DOREMUS project with vocal terms from the IAML mediums of performance thesaurus
      • Is centred on the structure of occidental classical music
  • 19. Quick demo
      http://cultuurlink.beeldengeluid.nl