Slides for the MTSR2018 presentation for the paper The Benefits of Linking Metadata for Internal and
External users of an Audiovisual Archive by Victor de Boer, Tim de Bruyn, John Brooks and Jesse de Vos
Like other heritage institutions, audiovisual archives adopt structured vocabularies for their metadata management. With Semantic Web and Linked Data becoming increasingly stable and commonplace technologies, organizations are now looking at linking these vocabularies to external sources, for example Wikidata, DBpedia or GeoNames. However, the benefits of such endeavors to the organizations are generally underexplored. In this paper, we present an in-depth case study into the benefits of linking the “Common Thesaurus for Audiovisual Archives” (or GTAA) and the general-purpose dataset Wikidata. We do this by identifying various use cases for user groups that are both internal and external to the organization. We describe the use cases and various proof-of-concept prototypes that address them.
The Benefits of Linking Metadata for Internal and External users of an Audiovisual Archive
1. The Benefits of Linking
Metadata for Internal
and External users of
an Audiovisual Archive
Victor de Boer,
Tim de Bruyn,
John Brooks,
Jesse de Vos
With content from: M. Brinkerink, J. Oomen
5. Research & Development at NISV
R&D initiates, stimulates and facilitates research and
development. At the same time it also collects knowledge and
practical examples from inside and outside the institute and
offers this to knowledge organisations, colleagues in the sector
and other interested parties.
6. 11 petabytes of data
65,000 hrs of yearly ingest
Digitization of old material
Digital-born new content
9. Linked Data
Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net
12. Machine readable format
Standardized
Flexibility to connect heterogeneous data
Link what can be linked
re-use and re-usability
OBJECT EVENT
PLACE
TIME
PERSON
CONCEPT
PROVENANCE
Why Linked Open Data
15. Alignment: VRT thesaurus and GTAA
Example: Linking Dutch and Flemish collections
de Boer et al. Exploring Audiovisual Archives through Aligned Thesauri. Proceedings of MTSR2016
16. • “Collaboratively
edited knowledge base”
• Drives ‘facts’ in Wikipedia
• 50 million items, 500M
statements (= triples)
• Wikidata query service
https://query.wikidata.org/
Wikidata: General-purpose Knowledge Graph
18. • 10,350 GTAA persons
matched at time of
writing. Currently 45,000
• Based on labels and
contextual information
– skos:scopeNote, other
https://www.wikidata.org/wiki/Q37079
Mix ‘n’ Match results
19. Analysis of the data
Interviews
Describing use cases
Partial implementation of cases
Validate with interviewees
Improve
Identifying value: Method
21. Internal and External use cases
Internal: Tim de Bruyn
Interviews with NISV employees
1) intake,
2) information management
3) research and development
External: John Brooks
Interviews with Media Scholars
22. UC-I-1: Receiving an alert when the
copyright on a person's work expires.
• Dutch copyright expiration
laws
• Using GTAA and Wikidata
“Date of death”
• Wikidata occupations
(~800)
• Manual selection of
relevant occupations
• (Fields of television,
movies, radio, theater
and music)
• Google Calendar alert
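The expiry check behind such an alert can be sketched as follows. This is a minimal sketch, not the NISV prototype: it assumes the general Dutch rule that copyright runs for 70 years counted from 1 January of the year following the maker's death, and the function names and sample records are hypothetical. The calendar integration is left out.

```python
from datetime import date

# Assumption: 70-year term, counted from 1 January after the year of death.
COPYRIGHT_TERM_YEARS = 70


def copyright_expiry(date_of_death: date) -> date:
    """Return the first day on which the person's work is out of copyright."""
    return date(date_of_death.year + COPYRIGHT_TERM_YEARS + 1, 1, 1)


def alerts_due(persons, year):
    """Return the names of persons whose copyright expires on 1 January of `year`."""
    return [name for name, died in persons if copyright_expiry(died).year == year]


# Hypothetical records: (GTAA label, Wikidata "date of death").
persons = [
    ("Person A", date(1948, 5, 17)),
    ("Person B", date(1990, 1, 2)),
]
print(alerts_due(persons, 2019))
```

In a fuller version, the list returned by `alerts_due` would be the input for whatever alerting channel is used, such as a calendar entry per person.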
23. UC-I-2: Provide more information on
a person appearing in online story
• NISV story platform
• more information on a
person in story
• Automatically generated
description
24. UC-I-3: Using Wikidata for story
recommendation.
• Stories you might also like
• Manually selected properties
for semantic
recommendation
• Using properties as metadata
• Threshold on matching
properties
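The thresholding idea above can be illustrated with a small sketch. It assumes each story is described by a set of values per selected property (for example, values taken from the Wikidata items of the persons in the story); the property names, story identifiers and the threshold of 2 are hypothetical and not taken from the prototype.

```python
def shared_properties(a: dict, b: dict, properties) -> int:
    """Count the selected properties on which two stories share at least one value."""
    return sum(1 for p in properties if set(a.get(p, ())) & set(b.get(p, ())))


def recommend(target_id, stories, properties, threshold=2):
    """Return ids of other stories matching the target on >= threshold properties."""
    target = stories[target_id]
    scored = [
        (shared_properties(target, story, properties), story_id)
        for story_id, story in stories.items()
        if story_id != target_id
    ]
    # Best matches first; ties are broken by story id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [story_id for score, story_id in scored if score >= threshold]


# Hypothetical metadata: property name -> set of values.
stories = {
    "story-1": {"occupation": {"presenter"}, "country": {"NL"}, "award": {"award-x"}},
    "story-2": {"occupation": {"presenter"}, "country": {"NL"}},
    "story-3": {"occupation": {"athlete"}, "country": {"NL"}},
}
selected = ["occupation", "country", "award"]
print(recommend("story-1", stories, selected, threshold=2))
```

Raising the threshold makes recommendations stricter; lowering it to 1 would also return "story-3", which shares only the country.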
25. UC-E-1: Exploratory extension of the
CLARIAH Media Suite
• http://mediasuite.clariah.nl/
• Media Scholars
• Combines datasets and analysis tools in an integrated workspace
26. UC-E-1: Wikidata retrieval service
• Based on interviews with five
users of the Media Suite,
focusing on 1) Drugs, 2) Sports,
3) Occupations, 4) History and
5) Disruptive media events
• Exploratory search by
properties
• Send SPARQL query to Wikidata
Query Service
• Retrieve list of persons based
on properties
• View additional information
(Wikidata/GTAA)
• Exploratory search
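A retrieval step of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the Media Suite implementation: it assumes that Wikidata property P1741 holds the GTAA identifier and P106 the occupation, the example occupation item Q1930187 (assumed to be "journalist") is only a placeholder, and the function names and User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_person_query(occupation_qid: str, limit: int = 50) -> str:
    """Build a query for persons that have a GTAA ID and the given occupation."""
    return f"""
SELECT ?person ?personLabel ?gtaa WHERE {{
  ?person wdt:P1741 ?gtaa ;                  # assumed: GTAA identifier
          wdt:P106 wd:{occupation_qid} .     # assumed: occupation
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "nl,en". }}
}}
LIMIT {limit}
""".strip()


def run_query(query: str) -> list:
    """Send the query to the endpoint and return the JSON result bindings."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url, headers={"User-Agent": "gtaa-wikidata-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return data["results"]["bindings"]


query = build_person_query("Q1930187", limit=10)
print(query)
# Uncomment to call the live service (requires network access):
# for row in run_query(query):
#     print(row["personLabel"]["value"], row["gtaa"]["value"])
```

Each returned row carries both the Wikidata item and the GTAA identifier, so additional information from either source can be shown next to the person.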
29. ● Interviewees
● Four tasks,
○ Sports, politics, disruptive media events
● Share feedback
○ Discuss limitations
○ Propose improvements
● Added value for exploratory search
● Provides insight into background knowledge
● Participants report a feeling of grasping the context
● Data (in)completeness is a major issue
UC-E-1: Validation
30. Take home
Connecting archives to background information
using Linked Data brings new possibilities for
access and analysis of content
Wikidata is becoming the de facto standard for generic
background knowledge
Shift from tech push to user needs
We show added value for a variety of users
Data completeness and quality are (and will
remain) key