1. LINKED DATA AND THE LOCAH PROJECT
Bethan Ruddock, Library and Archival Services, Mimas, University of Manchester
bethan.ruddock@manchester.ac.uk @bethanar
#ILI2011
2. LINKED OPEN COPAC & ARCHIVES HUB
JISC-funded project (under JISCexpo: exposing digital content for education and research)
September 2010 – August 2011
Staff from Mimas, UKOLN, Eduserv
Additional expertise from Talis, OCLC, Library of Congress
3. PROJECT AIMS
• Put archival and bibliographic data at the heart of the Linked Data Web, making new links between diverse content sources, enabling the free and flexible exploration of data and enabling researchers to make new connections between subjects, people, organisations and places to reveal more about our history and society.
• Make a collection of resources available on the Web as structured data, in particular linked data, where a case can be made that it would benefit teaching, learning, research, administration and/or knowledge transfer in UK higher education.
• Develop a prototype with instructional step-by-step demonstration and documentation to show how the structured content can be used by third-party tools and services.
• Explore and report on the opportunities and barriers in making content structured and exposed on the Web for discovery and use. Such opportunities and barriers may coalesce around licensing implications, trust, provenance, sustainability and usability.
5. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
6. THE DATA: COPAC
• Merged union catalogue of the holdings of over 60 UK libraries
• Over 50 million records
• Consolidated records
• MODS XML (not MARC)
Figure: A Copac consolidated record created from 5 contributed records. Lines show how contributed records match with one another.
7. THE DATA: ARCHIVES HUB
• Descriptions of archive collections from over 200 UK repositories
• Nearly 25,000 descriptions – collection-level and multi-level
• EAD (Encoded Archival Description)
8. CHALLENGES: VARIANCE
• Data from many sources – should adhere to standards (AACR2, ISAD(G))
• BUT there are differences in implementation
9. CHALLENGES: DATA
260 $b: unknown
dct:publisher: unknown
dct:publisher definition: ‘entity responsible for making the resource available’
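One way to keep a placeholder such as "unknown" from becoming a dct:publisher statement is to filter it out during conversion. The following is a minimal Python sketch, not the LOCAH implementation: the helper names `is_placeholder` and `publisher_triples`, the placeholder list, and the `ex:record1` identifier are all assumptions made for illustration.

```python
# Hypothetical placeholder strings; the real data may use other conventions.
PLACEHOLDERS = {"unknown", "s.n.", "[s.n.]"}


def is_placeholder(value):
    """Return True when a publisher string carries no real information."""
    # Drop surrounding whitespace and trailing ISBD-style punctuation.
    cleaned = value.strip().rstrip(",:;").strip().lower()
    return cleaned == "" or cleaned in PLACEHOLDERS


def publisher_triples(record_uri, publisher_values):
    """Yield (subject, predicate, object) tuples for usable publisher names.

    The object is left as a plain name here; a real conversion would
    mint or look up a URI for the publisher.
    """
    for value in publisher_values:
        if is_placeholder(value):
            continue  # do not assert dct:publisher for "unknown"
        yield (record_uri, "dct:publisher", value.strip())


if __name__ == "__main__":
    for triple in publisher_triples("ex:record1", ["unknown", "Penguin"]):
        print(triple)
```

Skipping the statement altogether, rather than emitting a publisher called "unknown", keeps the output consistent with the dct:publisher definition quoted above.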
12. LICENSING
• Data comes from contributors
Not ours to redistribute!
• Concerns
Provenance
Trust
Control
• Consulted
Liaised with contributors and stakeholders
13. THE TECHY STUFF
Specifications required a lot of brainstorming…
Image used under a CC licence from http://www.flickr.com/photos/blankdots/4865831504/
14. ARCHIVES HUB MODEL
[Diagram: Archives Hub data model.
Entities: Finding Aid, Repository (Agent), Place, Postcode Unit, EAD Document, Level, Biographical History, Language, Archival Resource, Creation, Temporal Entity, Extent, Agent, Concept, Concept Scheme, Object, Person, Family, Organisation, Book, Birth, Death, Genre, Function.
Relationships: maintainedBy/maintains, administeredBy/administers, hasPart/partOf, encodedAs/encodes, accessProvidedBy/providesAccessTo, hasBiogHist/isBiogHistFor, topic/page, in, level, language, origination, product of, at time, associatedWith, extent, inScheme, representedBy, is-a, foaf:focus, participates in.]
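The model names several relationships in both directions (for example maintainedBy/maintains). A minimal Python sketch of how such pairs could be used to add the inverse statement for each triple follows. The property pairs are taken from the slide; the function name `with_inverses`, the sample identifiers, and the choice of which entities sit at each end of a relationship are assumptions for illustration only.

```python
# Inverse property pairs as named on the model slide.
INVERSES = {
    "maintainedBy": "maintains",
    "administeredBy": "administers",
    "hasPart": "partOf",
    "encodedAs": "encodes",
    "accessProvidedBy": "providesAccessTo",
    "hasBiogHist": "isBiogHistFor",
}
# Make the lookup work in both directions.
INVERSES.update({inverse: prop for prop, inverse in list(INVERSES.items())})


def with_inverses(triples):
    """Return the input triples plus the inverse of each paired property."""
    result = set(triples)
    for subject, prop, obj in triples:
        inverse = INVERSES.get(prop)
        if inverse is not None:
            result.add((obj, inverse, subject))
    return result


if __name__ == "__main__":
    # Hypothetical identifiers, not real Archives Hub URIs.
    sample = [
        ("findingaid1", "maintainedBy", "repository1"),
        ("resource1", "hasPart", "resource1-item2"),
    ]
    for triple in sorted(with_inverses(sample)):
        print(triple)
```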
16.
Node name: BibliographicResource   MODS field: <modscollection>   Ontology: bibo

Min  Max  Property               Field            URI/literal  Ontology / note
0    1    copac:creator          Creator          URI          dc
0    m    copac:contributor      Contributor      URI          copac
0    1    event:producedIn       Production Date  URI          event
0    1    dct:issued             Production Date  URI          dc
0    m    pode:publicationPlace  Place            URI          pode
0    m    isbd:P1016             Place            URI          isbd
0    m    dct:publisher          Publisher        URI          dc
0    1    dct:isPartOf           Series           URI          dc
1    m    copac:HeldBy           Institution      URI          with Institution as subject
1    1    bibo:type              Type             URI          bibo
0    m    dct:subject            Subject          URI          dc
0    m    skos:subject           subject          URI          skos
0    m    dct:language           Language         URI          dc
1    1    hub:encodedAs          mods             URI          hub
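The cardinality columns can be checked mechanically. The sketch below is a hypothetical Python illustration, not part of the LOCAH tooling: it assumes the two cardinality values are a minimum and a maximum with "m" meaning "many", it covers only a few rows of the table, and the function name `cardinality_errors` and the sample record are invented for the example.

```python
# (min, max) per property, transcribed from a few rows of the table.
# None stands for "m" (no upper limit); this reading is an assumption.
CARDINALITY = {
    "dct:issued": (0, 1),
    "dct:publisher": (0, None),
    "copac:HeldBy": (1, None),
    "bibo:type": (1, 1),
    "hub:encodedAs": (1, 1),
}


def cardinality_errors(properties):
    """Check a record against CARDINALITY.

    `properties` maps a property name to a list of values.
    Returns a list of human-readable problems (empty when valid).
    """
    errors = []
    for prop, (minimum, maximum) in CARDINALITY.items():
        count = len(properties.get(prop, []))
        if count < minimum:
            errors.append(f"{prop}: expected at least {minimum}, found {count}")
        elif maximum is not None and count > maximum:
            errors.append(f"{prop}: expected at most {maximum}, found {count}")
    return errors


if __name__ == "__main__":
    # Hypothetical record: two types and no holding institution.
    record = {
        "bibo:type": ["bibo:Book", "bibo:Map"],
        "hub:encodedAs": ["ex:mods1"],
    }
    for problem in cardinality_errors(record):
        print(problem)
```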
18. Visualisation Prototype
Using Timemap – Google Maps and Simile
http://code.google.com/p/timemap/
Early stages with this
Will give location and ‘extent’ of archive.
Will link through to Archives Hub
24. WHAT NEXT?
Linking Lives
name-based approach into the data
integrating archival resource with other resources
DBpedia, VIAF, Copac...
route into archives for different audiences?
issues around trust and provenance to be explored
26. FINALLY…
The LOCAH data is open for use…
…please play with it!
Image used under a CC licence from http://www.flickr.com/photos/huladancer22/530743543/
27. @bethanar LOCAH blog: http://blogs.ukoln.ac.uk/locah/
bethaninfoprof.wordpress.com
bethan.ruddock@manchester.ac.uk
Image used under a CC licence from http://www.flickr.com/photos/theilluminated/5386099858/