Adventures in Linked Data Land (presentation by Richard Light)

Adventures in Linked Data Land:
bringing RDF to the Wordsworth Trust

Richard Light

CT Linked Data Meeting, 22 February 2010

Discovering Linked Data
Four principles of Linked Data (Tim Berners-Lee):

● Use URIs to identify resources

● Use HTTP URIs so that people can look them up

● Provide useful information about the resource

● Include links to other URIs in your data

Discovering DBpedia
● Extraction of Linked Data from Wikipedia
● Statements in info boxes (mainly) become RDF
triples:
<rdf:Description rdf:about="http://dbpedia.org/resource/Berlin_Marathon">
  <dbpprop:location rdf:resource="http://dbpedia.org/resource/Berlin"/>
</rdf:Description>

Note the URLs
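
To make the "statement becomes a triple" point concrete, here is a minimal Python sketch that reads the fragment above and prints its subject, predicate and object. The rdf:RDF wrapper and the namespace declarations are not on the slide; they are added here as assumptions so that the fragment parses on its own, and the `triples` helper is purely illustrative.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's fragment, wrapped in an rdf:RDF root that declares the prefixes.
# The wrapper and the xmlns declarations are assumptions added for this sketch.
RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dbpprop="http://dbpedia.org/property/">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Berlin_Marathon">
    <dbpprop:location rdf:resource="http://dbpedia.org/resource/Berlin"/>
  </rdf:Description>
</rdf:RDF>
"""

def triples(rdf_xml):
    """Yield (subject, predicate, object) for simple rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells tags as "{namespace}local"; joining the two
            # parts gives the predicate URI.
            namespace, local = prop.tag[1:].split("}")
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            yield subject, namespace + local, obj

for s, p, o in triples(RDF_XML):
    print(s, p, o)
```

All three parts of the triple come out as HTTP URIs, which is the point of "Note the URLs".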

Browsing Linked Data

● View RDF as a web page:
http://dbpedia.org/page/Berlin

● Navigate from one data source to another

● Specialist Linked Data browsers/plugins:
– DISCO
– Marbles
– OpenLink Data Explorer
– Tabulator

Querying Linked Data

● SPARQL query language:
http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/

● And SPARQL XML results format:
http://www.w3.org/TR/rdf-sparql-XMLres/

● “SPARQL end-points”:
http://dbpedia.org/sparql
http://dbtune.org/bbc/peel/sparql
http://data.linkedmdb.org/sparql

Asking interesting questions

● German musicians born in Berlin:
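
The query shown on this slide has not survived in the text. As a stand-in, the following Python sketch builds one plausible query for "German musicians born in Berlin" and the GET URL that would carry it to the DBpedia end-point. The property and category names (`dbo:birthPlace`, `dcterms:subject`, `Category:German_musicians`) are assumptions and may not match the DBpedia vocabulary in use at any given time; no request is actually sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs

ENDPOINT = "http://dbpedia.org/sparql"  # end-point listed on the previous slide

# Assumed query: the property and category names are illustrative guesses,
# not the query from the original slide.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?person ?name WHERE {
  ?person dbo:birthPlace <http://dbpedia.org/resource/Berlin> ;
          dcterms:subject <http://dbpedia.org/resource/Category:German_musicians> ;
          rdfs:label ?name .
  FILTER (lang(?name) = "en")
}
ORDER BY ?name
"""

def sparql_get_url(endpoint, query):
    """Build a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})

url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

Sending this URL with the header `Accept: application/sparql-results+xml` would ask the end-point for the SPARQL XML results format mentioned earlier.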

So what do we have here?

● An initiative to generate lots of Linked Data

● A Linked Data Cloud, containing a growing
number of RDF datasets

● A hard-to-use query language capable of very
precise and powerful querying

Where do museums come into this picture?

The Wordsworth Trust

● Typical museum collection: about 60,000 objects

● Major collection of manuscripts (notebooks,
letters, etc.)

● Objects published to the Web from a ModesXML
database

● Unwise enough to allow me Remote Desktop
access ...

Typical collections object

GRMDC.C104.2

Same object represented as RDF

Same object represented as XTM

One identifier; three “views”
● This object has a single persistent identifier:
http://collections.wordsworth.org.uk/object/GRMDC.C104.2

● This maps to different views depending on the
“Accept” header in the HTTP request:

– application/rdf+xml >> RDF
– application/xtm+xml >> XTM Topic Map
– Otherwise >> HTML (human-readable)

● Achieved through a custom 404 “page not found”
handler
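
From the client's side, the three views differ only in the Accept header sent with the same identifier. The sketch below prepares (but does not send) the three requests using Python's standard library; the `view_request` helper is a hypothetical name used only for illustration.

```python
from urllib.request import Request

OBJECT_URI = "http://collections.wordsworth.org.uk/object/GRMDC.C104.2"

def view_request(uri, media_type=None):
    """Prepare (but do not send) a GET request for one view of the object."""
    headers = {"Accept": media_type} if media_type else {}
    return Request(uri, headers=headers)

rdf_request = view_request(OBJECT_URI, "application/rdf+xml")
xtm_request = view_request(OBJECT_URI, "application/xtm+xml")
html_request = view_request(OBJECT_URI)  # no Accept header: HTML view

print(rdf_request.full_url, rdf_request.get_header("Accept"))
```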

“Page not found” handler (1)

● All URLs are fictitious, so they generate a 404

● Modified a generic smart 404 handler from:
http://evolvedcode.net/content/code_smart404/

● Added support for “303 See other” redirects

● Added wildcard matching to re-format URLs


● Generic URL, plus requested Accept format,
determine initial “303 See other” mapping, e.g.:
http://collections.wordsworth.org.uk/object/GRMDC.C104.2
+
Accept: application/rdf+xml
=
http://collections.wordsworth.org.uk/object/rdf/GRMDC.C104.2

● When this is passed back in, the 404 handler has to
generate the required RDF directly

● Can't just keep redirecting requests!
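
The initial mapping can be sketched as a small Python function. This is not the handler's actual code, only an illustration of the rule "generic URL + Accept type = format-specific URL". Only the `rdf` path segment appears on the slide; the `xtm` and `page` segments are assumed names, and q-values in the Accept header are ignored for brevity.

```python
BASE = "http://collections.wordsworth.org.uk/object/"

# Accept type -> path segment of the format-specific URL. Only "rdf" appears
# on the slide; "xtm" and "page" are assumed names for this sketch.
SEGMENTS = {
    "application/rdf+xml": "rdf",
    "application/xtm+xml": "xtm",
}
DEFAULT_SEGMENT = "page"

def see_other(generic_url, accept):
    """Return the (status, location) pair for a generic object URL."""
    object_id = generic_url[len(BASE):]
    segment = DEFAULT_SEGMENT
    # Take the first supported type named in the Accept header, ignoring
    # q-values; a real handler would honour them.
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in SEGMENTS:
            segment = SEGMENTS[media_type]
            break
    return 303, BASE + segment + "/" + object_id

print(see_other(BASE + "GRMDC.C104.2", "application/rdf+xml"))
```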


● Redirect rules declare mappings:


● Generic URL plus a supported Accept type
generates a “303 See other” redirect

● If it comes back as a page request, it is further
redirected with a “301 Moved permanently” to the
object's web page

● If it comes back as an RDF or XTM request, the
record is fetched as XML and subjected to an
XSLT transform by the handler
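
What happens when a format-specific URL "comes back" to the handler can be sketched as follows. The URL pattern, the `xtm` and `page` segments and the final web-page address are all hypothetical, and the XML fetch and XSLT transform are represented only by a descriptive string.

```python
import re

# Format-specific URLs produced by the initial 303 redirect. The "xtm" and
# "page" segments and the final web-page URL are hypothetical.
RETURNED = re.compile(r"^/object/(?P<view>rdf|xtm|page)/(?P<id>.+)$")

def handle_returned(path):
    """Return (status, detail) for a URL that has already been redirected once."""
    match = RETURNED.match(path)
    if match is None:
        return 404, "not a format-specific object URL"
    view, object_id = match.group("view"), match.group("id")
    if view == "page":
        # Page requests are sent on to the object's ordinary web page.
        return 301, f"/objects/{object_id}.html"
    # RDF and XTM requests are answered directly: fetch the record as XML
    # and apply the stylesheet for this view (not shown here).
    return 200, f"{view} transform of record {object_id}"

print(handle_returned("/object/rdf/GRMDC.C104.2"))
```

Answering the RDF and XTM cases with a 200 rather than another redirect is what stops the handler from redirecting requests indefinitely.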

Implementation details

● HTML needed a “back link” to RDF to keep
OpenLink Explorer happy:
<link rel="alternate" type="application/rdf+xml"
href="http://collections.wordsworth.org.uk/object/data/GRMDC.C104.2"
title="RDF" />
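
A browser or plugin can discover the RDF view by scanning the HTML head for this kind of link. The sketch below shows one way to do that with Python's standard `html.parser`; the surrounding page markup is invented for the example, and this is not a description of how OpenLink's tool works internally.

```python
from html.parser import HTMLParser

class AlternateLinks(HTMLParser):
    """Collect href values of <link rel="alternate"> tags, keyed by media type."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "alternate":
            self.links[attributes.get("type")] = attributes.get("href")

# Minimal page wrapper invented for this sketch; only the <link> is from the slide.
PAGE = """<html><head>
<link rel="alternate" type="application/rdf+xml"
 href="http://collections.wordsworth.org.uk/object/data/GRMDC.C104.2" title="RDF" />
</head><body></body></html>"""

finder = AlternateLinks()
finder.feed(PAGE)
print(finder.links["application/rdf+xml"])
```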

● Result is totally unfindable: need a search or
harvesting mechanism:
– OAI support (possible)
– SPARQL end-point (harder)

What has been learnt? (1)
● The Linked Data paradigm encourages simple
RDF triples: no “blank nodes”

● For an object, this becomes a simple metadata set,
very analogous to the PNDS DCAP format

● The properties involved need to encapsulate the
whole relation between object and data, e.g.
<p:title>Ulswater from Pooley Bridge</p:title>
<p:technique>drawn</p:technique>
<p:maker>Farington, Joseph (1747-1821)</p:maker>
<p:technique>engraved</p:technique>
<p:maker>Middiman, Samuel (1750-1831)</p:maker>
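
As a rough illustration, the properties above can be written out as flat triples in Python. The subject URI is a placeholder, and the `p:` prefix is assumed here to stand for the DBpedia property namespace. Note that a set of flat triples carries no record of which technique goes with which maker.

```python
from collections import defaultdict

SUBJECT = "http://example.org/object/1"  # placeholder subject URI
P = "http://dbpedia.org/property/"       # assumed expansion of the p: prefix

FIELDS = [
    ("title", "Ulswater from Pooley Bridge"),
    ("technique", "drawn"),
    ("maker", "Farington, Joseph (1747-1821)"),
    ("technique", "engraved"),
    ("maker", "Middiman, Samuel (1750-1831)"),
]

# Each field becomes one triple: (object URI, property URI, literal value).
triples = {(SUBJECT, P + name, value) for name, value in FIELDS}

# Grouping by property shows what a consumer of the triples can recover:
# two techniques and two makers, but not how they pair up.
by_property = defaultdict(set)
for _, predicate, value in triples:
    by_property[predicate].add(value)

for predicate in sorted(by_property):
    print(predicate, sorted(by_property[predicate]))
```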

What has been learnt? (2)

● Data in linked resources can “add value” to your
own recording efforts (e.g. place data)

Properties: which framework?

● I have used DBpedia properties (for compatibility
with other Linked Data resources … ?):
http://dbpedia.org/property/title
http://dbpedia.org/property/maker

● A viable alternative would be PNDS DCAP:
http://purl.org/dc/elements/1.1/title
http://purl.org/dc/elements/1.1/creator

● One framework which doesn't fit is the CIDOC
CRM:
E21 Physical Thing – E12 Production – E39 Actor = “creator”

Do we need “museum” properties?

● DBpedia properties are not coherent

● Need something richer than simple metadata

● Could use CIDOC CRM as basis

● Existing interchange formats such as LIDO could
be re-expressed in RDF

● Could broaden scope: “history” property set?

The problem of URIs

● Good Linked Data requires URIs everywhere

● Most of my museum RDF resolves to strings

● One exception is Geonames lookup:
Ullswater
becomes
http://www.geonames.org/2635191/
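
The string-to-URI substitution can be sketched as a simple lookup. In this Python sketch the table holds only the one mapping given above; a real implementation would presumably query Geonames, which is not shown. The `place_object` helper and the second place name are illustrative.

```python
# Lookup table holding the single mapping given on the slide.
GEONAMES = {"Ullswater": "http://www.geonames.org/2635191/"}

def place_object(name, lookup=GEONAMES):
    """Return ('uri', value) when a URI is known, else ('literal', name)."""
    uri = lookup.get(name)
    return ("uri", uri) if uri else ("literal", name)

print(place_object("Ullswater"))
print(place_object("Grasmere"))  # not in the table, so it stays a string
```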

● In the absence of a central “people” registry,
I should be minting URIs myself for people, etc.
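
One simple way to mint such URIs is to derive a stable slug from the existing name heading. The Python sketch below assumes a hypothetical base URL and slug scheme; neither is taken from the Wordsworth Trust site.

```python
import re
import unicodedata

PERSON_BASE = "http://example.org/person/"  # hypothetical base URL

def mint_person_uri(heading):
    """Derive a URI from a name heading such as 'Surname, Forename (dates)'."""
    # Strip accents, lower-case, and collapse every run of other characters
    # into a single hyphen.
    ascii_text = (
        unicodedata.normalize("NFKD", heading)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    return PERSON_BASE + slug

print(mint_person_uri("Farington, Joseph (1747-1821)"))
# prints http://example.org/person/farington-joseph-1747-1821
```

Including the life dates in the slug helps keep apart two people who share a name, at the cost of the URI changing if the dates are later corrected.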

Conclusions
● Implementing an RDF Linked Data front-end to a
museum database is feasible if:
– You can generate multiple outputs from your database
(XML is sufficient)
– You can implement a suitable URL rewriter or 404
handler

● It's easy (and a good idea) to mint and publish
URIs for your collection objects

● It's less clear where all the other URIs we'll need
will come from

Challenges for museum linked data

● Agreeing an ontology to enable cross-collection
[SPARQL] queries

● Shared URLs for in-common concepts: people,
places, events

● Mechanisms for getting URLs into museum data

● Getting existing authorities, e.g. AAT, to be
available as Linked Data

Ask Multimap where Lancaster is

Thank you!

Richard Light
richard@light.demon.co.uk
