1. Publishing and Using
Linked Open Data
Richard J. Urban, Ph.D.
School of Library and Information Studies
Florida State University
rurban@fsu.edu
@musebrarian
#lod4h
2. January 7, 2013
Monday’s Schedule
• 9:30-10:00 am Class Session: Participant Introductions
• 10:00-10:45 am Class Session: A Gentle Introduction to Linked Data
• 10:45-11:00 am Break
• 11:00 am-Noon Class Session: Exploring Linked Data Use Cases
• Noon-1:00 pm Lunch (on your own)
• 1:00-2:30 pm Class Session: A Gentle Introduction to Linked Data (cont'd)
• 2:30-2:45 pm Break
• 2:45-3:45 pm Class Session: Participant Project Kick-off
• 4:00-5:00 pm Lecture: Seb Chan - Location: Ulrich Recital Hall, Tawes Fine Arts Building
• 5:30-7:00 pm Graduate Student Networking Event
Hosted by CUNY and MITH
Location: MITH
0301 Hornbake Library (inside Non-Print Media)
Refreshments Provided
8. Document semantics
• XML (and HTML) provides a descriptive
markup for documents (including metadata
records)
• Even for more complex XML, like TEI, the
meaning of many elements is dependent
on its context within a document instance.
• Interpreting this context requires human
intervention.
9. Organizing the Web
• Human organization
• Crawl and Index
– Uses many of the methods used by digital
humanities scholars to extract information
from web documents.
• Page Rank
– Inferring importance from links
13. Data Semantics
• Often dependent on human interpretation
of documents/standards.
• Local data-provider interpretations not
always documented or available to data
consumers.
15. Linked Data Principles
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up
those names.
3. When someone looks up a URI, provide useful
information, using the standards (RDF,
SPARQL)
4. Include links to other URIs, so that they can
discover more things.
16. “Things” = Resources
A resource can be anything that has identity.
Familiar examples include an electronic document, an image, a service
(e.g., "today's weather report for Los Angeles"), and a collection of
other resources. Not all resources are network "retrievable"; e.g.,
human beings, corporations, and bound books in a library can also be
considered resources.
The resource is the conceptual mapping to an entity or set of entities,
not necessarily the entity which corresponds to that mapping at any
particular instance in time. Thus, a resource can remain constant even
when its content---the entities to which it currently corresponds---
changes over time, provided that the conceptual mapping is not
changed in the process.
http://www.ietf.org/rfc/rfc2396.txt
17. Uniform Resource Identifiers
• More than a Uniform Resource
Locator (URL)
• Provides a mechanism to name resources in
a way that works at Internet scale.
http://en.wikipedia.org/wiki/Uniform_resource_identifier
18. De-referencing URIs
• When someone looks up a URI, provide useful
information, using the standards
(RDF, SPARQL)
• URIs can be used to name non-networked
resources (concepts, people, physical
objects, etc.)
• Useful if information about these objects can be
returned when the name is used.
• Cool URIs for the Semantic Web
http://www.w3.org/TR/cooluris/
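One way to picture de-referencing is as an ordinary HTTP request that asks for RDF instead of HTML. The sketch below only builds such a request and does not send it. The helper name build_lookup and the choice of text/turtle as the requested format are assumptions for illustration; the URI is the example person URI used later in these slides.

```python
from urllib.parse import urldefrag
from urllib.request import Request


def build_lookup(uri, media_type="text/turtle"):
    """Build (but do not send) an HTTP GET request that asks for RDF about a URI."""
    # The fragment after "#" is never sent to the server, so a hash URI
    # is looked up by requesting the document part of the URI.
    document_url, _fragment = urldefrag(uri)
    # The Accept header tells the server which representation we would like.
    return Request(document_url, headers={"Accept": media_type})


request = build_lookup("http://chi.cci.fsu.edu/person/rurban#")
print(request.full_url)              # the document that describes the person
print(request.get_header("Accept"))  # the RDF syntax being requested
```

Whether a given server honors the Accept header depends on how it was set up; the Cool URIs note describes the common patterns.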
19. Resource Description Framework
• A model for representing data
– An artificial language with a formal semantic
model
– Can be expressed using multiple syntaxes
– Simple grammar
• RDF “Triple”
– <subject> <predicate> <object>
– NAME verb Object
– Mona Lisa painted by Leonardo da Vinci
20. It’s a graph!
• That uses URIs
http://ex.org/monaLisa#
http://purl.org/dc/terms/creator
http://ex.org/daVinci#
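As a rough sketch, the same graph can be held in code as a set of three-part statements. The Triple type and the to_ntriples helper are illustrative names, not part of any RDF library; the predicate is assumed to be the Dublin Core Terms creator property (http://purl.org/dc/terms/creator).

```python
from collections import namedtuple

# A triple is one edge in the graph: subject --predicate--> object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

MONA_LISA = "http://ex.org/monaLisa#"
DA_VINCI = "http://ex.org/daVinci#"
CREATOR = "http://purl.org/dc/terms/creator"

# A graph is just a set of triples.
graph = {Triple(MONA_LISA, CREATOR, DA_VINCI)}


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    lines = [f"<{t.subject}> <{t.predicate}> <{t.object}> ." for t in sorted(triples)]
    return "\n".join(lines)


print(to_ntriples(graph))
```

This helper only handles objects that are URIs; literal values such as names would need quoting rules that are left out here.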
21. From a simple language, we can say complex things.
22. RDF Data Modeling
• RDF can be used with multiple tools for
modeling data
• Simple: RDF Schema (RDFS)
• Robust: Web Ontology Language (OWL)
– OWL-Lite
– OWL-Full
23. Limitations
• Best used for simple declarative
statements
– Difficult to express meta-assertions,
e.g. “John believes that Sally is 5’ tall”
– Data provenance/trust
– Negation, e.g. “Sally is not 5’ tall”
– Tenseless (need to explicitly model time)
– Modeling a “record” (named graphs)
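Named graphs are one common workaround for several of these limits: each triple gets a fourth part naming the graph (the "record" or source) it belongs to. The sketch below is only an illustration of that idea. All URIs, the Quad type, and the statements_from helper are hypothetical, and the height is stored in inches (5 feet = 60 inches) purely for the example.

```python
from collections import namedtuple

# A quad is a triple plus the name of the graph (the "record") it belongs to.
Quad = namedtuple("Quad", ["subject", "predicate", "object", "graph"])

# All URIs below are hypothetical examples.
SALLY = "http://example.org/sally"
HEIGHT = "http://example.org/heightInInches"
JOHNS_CLAIMS = "http://example.org/graphs/johns-claims"
MEASURED = "http://example.org/graphs/clinic-measurements"

quads = [
    Quad(SALLY, HEIGHT, "60", JOHNS_CLAIMS),  # what John believes
    Quad(SALLY, HEIGHT, "62", MEASURED),      # what a second source reports
]


def statements_from(graph_name, data):
    """Return the plain triples asserted inside one named graph."""
    return [(q.subject, q.predicate, q.object) for q in data if q.graph == graph_name]


print(statements_from(JOHNS_CLAIMS, quads))
```

Because the graph name is itself a URI, further triples can describe it (who asserted it, and when), which is how provenance and time are usually layered on.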
24. SPARQL
• SPARQL Protocol and RDF Query
Language
– A query language for RDF
– Similar to SQL
– Implemented by RDF publication software
(Triplestore)
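At its core, a SPARQL query is a triple pattern with variables in it. The sketch below imitates that with a few lines of plain Python instead of a real triplestore; the match helper, the ex: prefix, and the ex:lastSupper resource are assumptions made up for the example, and the SPARQL text is shown for comparison only (it is not executed).

```python
# A tiny in-memory "triplestore": a list of (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:monaLisa", "dcterms:creator", "ex:daVinci"),
    ("ex:lastSupper", "dcterms:creator", "ex:daVinci"),
    ("ex:daVinci", "foaf:name", "Leonardo da Vinci"),
]

# The SPARQL query this sketch imitates (for comparison, not executed):
QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX ex: <http://ex.org/>
SELECT ?work WHERE { ?work dcterms:creator ex:daVinci . }
"""


def match(pattern, triples):
    """Return triples matching a pattern; None plays the role of a ?variable."""
    return [
        triple
        for triple in triples
        if all(want is None or want == have for want, have in zip(pattern, triple))
    ]


works = [s for s, _p, _o in match((None, "dcterms:creator", "ex:daVinci"), TRIPLES)]
print(works)
```

A real SPARQL engine joins several such patterns together and adds filters, ordering, and more, but single-pattern matching is the basic building block.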
25. Link to Other Resources
• Include links to other URIs, so that they can
discover more things.
– Link to controlled vocabularies/ontologies
– Use existing RDFS/OWL schemas
– Link different representations of the same
resources together
• Associate annotations with resources
27. Linked Open Data Criteria
★ Available on the web (whatever format), but
with an open license
★★ Available as machine-readable structured data
(e.g. Excel instead of an image scan of a table)
★★★ All the above, plus a non-proprietary format (e.g. CSV
instead of Excel)
★★★★ All the above, plus: use open standards from
W3C (RDF and SPARQL) to identify things, so
that people can point at your stuff
★★★★★ All the above, plus: Link your data to other
people’s data to provide context
31. A Simple Start
• Friend of a Friend (FOAF)
http://www.foaf-project.org/
• A simple RDF vocabulary for describing
people and their relationships.
36. Basic Turtle
• Terse RDF Triple Language
http://www.w3.org/TeamSubmission/turtle/
• Always start with a @prefix to declare a
namespace for each schema you will use
in your graph
– Can mix/match any published RDF schema
@prefix : <http://xmlns.com/foaf/0.1/> .
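A prefix is only shorthand: every prefixed name stands for a full URI. The sketch below shows that expansion in plain Python. The expand helper and the second prefix (dcterms) are assumptions added to illustrate mixing vocabularies; only the default FOAF prefix comes from the slide.

```python
# Prefix table, equivalent in spirit to a block of @prefix lines.
PREFIXES = {
    "": "http://xmlns.com/foaf/0.1/",        # the default prefix from the slide
    "dcterms": "http://purl.org/dc/terms/",  # a second vocabulary, mixed in
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand a Turtle-style prefixed name such as ':name' to a full URI."""
    prefix, _, local_name = prefixed_name.partition(":")
    return prefixes[prefix] + local_name


print(expand(":name"))
print(expand("dcterms:creator"))
```

In a Turtle file the same table would be written as one @prefix line per vocabulary, each ending in a period.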
37. FOAF Properties
http://xmlns.com/foaf/spec/
• FOAF Core
– Agent
– Person
– name
– title
– img
– depiction (depicts)
– familyName
– givenName
– knows
– based_near
– age
– made (maker)
– primaryTopic (primaryTopicOf)
– Project
– Organization
– Group
– member
– Document
– Image
• Social Web
– nick
– mbox
– homepage
– weblog
– openid
– jabberID
– mbox_sha1sum
– interest
– topic_interest
– topic (page)
– workplaceHomepage
– workInfoHomepage
– schoolHomepage
– publications
– currentProject
– pastProject
– account
– OnlineAccount
– accountName
– accountServiceHomepage
– PersonalProfileDocument
– tipjar
– sha1
– thumbnail
– logo
38. Get Yourself a URI
• Can use a Cool URI based on your
homepage
• A mailto: URI (e.g. mailto:email@org.org)
• A “blank node”
_:me
(although these are discouraged for
Linked Data)
39. @prefix : <http://xmlns.com/foaf/0.1/> .
<http://chi.cci.fsu.edu/person/rurban#>
    :name "Richard Urban" ;
    :homepage <http://chi.cci.fsu.edu> .
• Full URIs are always enclosed in angle brackets.
• Properties start with a colon (the default prefix declared on the first line).
• Strings are in straight double quotes.
• Statements end with a semi-colon, except the last
statement, which ends in a period.
40. Hands-on
• Open a text editor.
• Write a FOAF description for yourself
using the Turtle Syntax.
– http://xmlns.com/foaf/spec/
– http://www.w3.org/TeamSubmission/turtle/
• Save the file with .ttl extension
– yourName.ttl
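If you would rather generate the file than type it, the exercise can be sketched in a few lines of Python. The foaf_turtle helper and the example.org values are placeholders, not part of the exercise as given; the "a :Person" line is an addition that states the resource's type. The sketch writes to the system temporary directory so it is safe to run anywhere, and it assumes the name contains no double quotes.

```python
import tempfile
from pathlib import Path


def foaf_turtle(person_uri, name, homepage):
    """Return a minimal FOAF description of one person in Turtle syntax."""
    return "\n".join([
        "@prefix : <http://xmlns.com/foaf/0.1/> .",
        "",
        f"<{person_uri}>",
        "    a :Person ;",
        f'    :name "{name}" ;',
        f"    :homepage <{homepage}> .",
        "",
    ])


# Placeholder values: replace them with your own URI, name, and homepage.
turtle = foaf_turtle(
    "http://example.org/person/yourName#",
    "Your Name",
    "http://example.org/",
)

# Save with a .ttl extension (here in the temp directory; use any folder you like).
output_path = Path(tempfile.gettempdir()) / "yourName.ttl"
output_path.write_text(turtle, encoding="utf-8")
print(turtle)
```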
41. Publishing Your FOAF
• Put the file online and link to it from your website.
• Publish using an RDF Triplestore
• Use FOAF-based plugins for
WordPress/Drupal, etc.
42. Sesame Triple Store
• Let’s use my sandbox for this week:
– http://goo.gl/PgdqN
• Select the DHWI repository
• Select ADD
• Context baseURL: http://chi.cci.fsu/dhwi
• Paste your Turtle into the RDF box.
• All of us together:
43. Linking our FOAF together.
• I know we just met, and this is crazy, but…
:knows <http://chi.cci.fsu.edu/person/rurban#>
• Add the URI of anyone else in the class
you know.
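Why this works: when two descriptions use the same URI, loading them into one store joins them into a single graph. A minimal sketch, assuming two hypothetical participants (the ALICE and BOB URIs and the known_by helper are made up for illustration; the instructor URI is the one above):

```python
KNOWS = "http://xmlns.com/foaf/0.1/knows"
INSTRUCTOR = "http://chi.cci.fsu.edu/person/rurban#"

# Hypothetical participant URIs, standing in for two separate .ttl files.
ALICE = "http://example.org/person/alice#"
BOB = "http://example.org/person/bob#"

alice_graph = {(ALICE, KNOWS, INSTRUCTOR), (ALICE, KNOWS, BOB)}
bob_graph = {(BOB, KNOWS, INSTRUCTOR)}

# For graphs without blank nodes, merging is a set union:
# shared URIs join the pieces together.
class_graph = alice_graph | bob_graph


def known_by(person, triples):
    """Return everyone who states that they know `person`, sorted by URI."""
    return sorted(s for s, p, o in triples if p == KNOWS and o == person)


print(known_by(INSTRUCTOR, class_graph))
```

Note that :knows is a one-way statement here: Alice saying she knows Bob does not add a triple saying Bob knows Alice.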
44. Some FOAF Humanities Use Cases
• Virtual International Authority File
http://www.viaf.org
• Social Networks and Archival Context
http://socialarchive.iath.virginia.edu/
• Linking Lives
http://data.archiveshub.ac.uk/page/person/ncarules/skinnerbeverley1938-1999artist
• DBpedia
http://dbpedia.org/data/Abraham_Lincoln.n3
45. Beyond FOAF
• Organization Ontology
http://www.w3.org/TR/vocab-org/
• Encoded Archival Context -
Corporate Bodies, Persons, and Families (EAC-CPF) Ontology
http://goo.gl/oFIkW
• Other domain ontologies with
representations of people.
47. Participant Projects
• What’s a small linked data project you can
complete in the next few days?
– Explore modeling questions
• Identify existing models
– Create/transform some data
• What data is already out there?
– Publish some examples
– Explore potential applications
48. Tonight’s Events
• 4:00-5:00pm Lecture: Seb Chan
– Location: Ulrich Recital Hall in Tawes Fine
Arts Building
• 5:30pm-7:00pm Graduate Student
Networking Event
– Hosted by CUNY and MITH; Location: MITH,
0301 Hornbake Library inside Non-Print
Media
– Refreshments Provided