1. Linked Data on the Web
Olaf Hartig
http://olafhartig.de/foaf.rdf#olaf
Database and Information Systems Research Group
Humboldt-Universität zu Berlin
2. Outline
From a Web of Documents
to a Web of Data
Technical Foundations of Linked Data
Consuming Linked Data
Current Research Issues
Olaf Hartig - Linked Data on the Web
4. The Traditional Web
Traditional Web = Internet + Docs + Links
● HTML as shared content format
● HTTP to access documents on the Web
● URLs
● Globally unique identifiers for documents
● Retrieval mechanism
● Hyperlinks
● Single global information space
8. The Traditional Web
So what is the problem?
● Web content is only loosely structured
● Difficult for applications to do smart things
Solution:
● Increase the structure of Web content
● Publish data
But wait…
don't we do that already?
11. The Traditional Web
● Content providers offer access via Web APIs
● Mashups combine this data
Shortcomings:
● APIs are proprietary
● Mashups are based on a fixed set of data sources
● You cannot set hyperlinks between data objects
[Figure: a mashup drawing on four separate Web APIs]
12. Linked Data Principles
● Use URIs as names for things
● Use HTTP URIs so that people can look up those names.
● When someone looks up a URI, provide useful information.
● Include links to other URIs so that they can discover more things.
Tim Berners-Lee, July 2006
[Figure: "My Movie DB" names its entities with HTTP URIs:
http://mymovie.db/movie1342
http://mymovie.db/movie0362
http://mymovie.db/movie5112
http://mymovie.db/movie2449
A look-up of http://mymovie.db/movie2449 is shown. The final build adds URIs from a second source:
http://geo.db/country21
http://geo.db/country7
http://geo.db/cityCJ
http://geo.db/cityXA]
19. Linked Data – An Example
[Figure: RDF graph built up over four slides, reconstructed here as triples]
<http://data.linkedmdb.org/.../2014>  rdf:type  <http://data.linkedmdb.org/.../film>
<http://data.linkedmdb.org/.../2014>  dc:title  "The Shining"
<http://data.linkedmdb.org/.../2014>  movie:relatedBook  <http://www4.wi … /0743424425>
<http://data.linkedmdb.org/.../2014>  foaf:based_near  <http://sws.geonames.org/2635167/>
<http://sws.geonames.org/2635167/>  gn:population  60943000
<http://sws.geonames.org/2635167/>  rdfs:label  "United Kingdom"
<http://www4.wi … /0743424425>  dc:title  "The Shining"
<http://www4.wi … /0743424425>  skos:subject  <http://www4.wi … /Fiction>
<http://www4.wi … /1571884029>  skos:subject  <http://www4.wi … /Fiction>
24. Properties of Linked Data
● Anyone can publish data to the Web of data
● Entities are connected by links
● Giant global data graph that spans data sources
● Data is self-describing
● Vocabulary terms are identified by URIs, too
● Look-up yields their RDFS or OWL definition
● The Web of data is open
● Applications can discover new data sources at run-time
Is this real?
25. W3C Linking Open Data Project
● Grassroots community effort
● Publish existing, open license datasets as Linked Data
● Interlink things between different data sources
26. W3C Linking Open Data Project
As of July 2007: > 500M triples, ca. 120,000 links
27. W3C Linking Open Data Project
ca. 6.7B triples, ca. 150M links
[Figure: LOD cloud diagram; dataset categories: Media, User generated content, Publications, Geographic, Cross-domain, Life Sciences]
30. Linked Data Publishers
● UK government
● US government
● Thomson Reuters (Open Calais)
● MetaWeb (Freebase)
● BBC
● NY Times
● Best Buy
● CNET
etc.
Can I become part of it?
31. Linked Data Publishing Tools
● Use HTTP URIs in your FOAF profile
● Legacy data in relational databases
● D2R Server, Triplify, Virtuoso, Ultrawrap, ...
● CMS
● Drupal
● Native RDF stores
● Sesame, AllegroGraph, Virtuoso
● Talis platform (Linked Data in the cloud)
● HTML with RDFa
32. Integrating the Traditional Web
● Annotate Web documents with Linked Data URIs
<http://data.semanticweb.org/ … /eswc/2007/paper-69>  dc:subject  <http://dbpedia.org/resource/Machine_Learning>
● Annotation services using named entity recognition
● Open Calais (Thomson Reuters) for news
● Zemanta for blog posts
● Epiphany
33. Outline
From a Web of Documents
to a Web of Data
Technical Foundations of Linked Data
Consuming Linked Data
Current Research Issues
34. Technical Foundations
There is no magic – Linked Data is based
on well-established
(Semantic) Web technologies.
● HTTP
● URI
● RDF
● RDFS / OWL
35. URIs
● Hash URIs
http://olafhartig.de/foaf.rdf#olaf
● Slash URIs
http://data.linkedmdb.org/resource/film/2014
36. Looking up URIs
Give me data about
http://olafhartig.de/foaf.rdf#olaf
HTTP Request for http://olafhartig.de/foaf.rdf
GET /foaf.rdf HTTP/1.1
User-Agent: curl/7.19.6 (i686-pc-linux-gnu) libcurl/7.19.6 OpenSSL/0.9.8l zlib/1.2.3
Host: olafhartig.de
Accept: */*
37. Looking up URIs
HTTP Response:
HTTP/1.1 200 OK
Date: Thu, 11 Mar 2010 08:47:53 GMT
Server: Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.8g
Last-Modified: Fri, 05 Mar 2010 18:01:07 GMT
ETag: "72a16-1946-7fe53ec0"
Accept-Ranges: bytes
Content-Length: 6470
Content-Type: application/rdf+xml
Content-Language: de
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:foaf="http://xmlns.com/foaf/0.1/">
<foaf:PersonalProfileDocument rdf:about="">
<foaf:maker rdf:resource="http://olafhartig.de/foaf.rdf#olaf"/>
...
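The look-up of a hash URI shown above can be sketched in Python. This is a minimal sketch, not part of the original slides: the function name build_lookup_request is hypothetical, and the default Accept value is an assumption (the request on the slide sends */*). It only builds the request; nothing is sent.

```python
from urllib.parse import urldefrag
from urllib.request import Request


def build_lookup_request(uri: str, accept: str = "application/rdf+xml") -> Request:
    """Build (but do not send) the HTTP GET request for looking up a URI.

    Hypothetical helper for illustration only.
    """
    # The fragment (#olaf) is never sent to the server; the client
    # requests the document URL and finds the entity inside the result.
    document_url, _fragment = urldefrag(uri)
    return Request(document_url, headers={"Accept": accept})


request = build_lookup_request("http://olafhartig.de/foaf.rdf#olaf")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual look-up.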
38. HTTP Content Negotiation
● Request the resource in a specific format (representation)
● Use the HTTP header Accept to specify a media type
Example:
GET /data/dbprofs HTTP/1.1
Host: researchersmap.informatik.hu-berlin.de
Accept: text/rdf+n3
39. HTTP Content Negotiation
HTTP Response:
HTTP/1.1 200 OK
Date: Thu, 11 Mar 2010 09:02:22 GMT
Server: Apache/2.2.13 (Linux/SUSE)
Content-Location: dbprofs.n3
Vary: negotiate,accept
TCN: choice
Last-Modified: Tue, 05 Jan 2010 14:46:17 GMT
ETag: "40e4d-2250-47c6be683f0e1;47c6be69482f5"
Accept-Ranges: bytes
Content-Length: 8784
Content-Type: text/rdf+n3
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .
<> a foaf:Document ;
foaf:maker <http://www.informatik.hu-berlin.de/~hartig/foaf.rdf#olaf> .
...
42. Redirections
HTTP Request for http://data.linkedmdb.org/resource/film/2014
GET /resource/film/2014 HTTP/1.1
User-Agent: curl/7.19.6 (i686-pc-linux-gnu) libcurl/7.19.6
Host: data.linkedmdb.org
Accept: application/rdf+xml
Response:
HTTP/1.1 303 See Other
Date: Thu, 11 Mar 2010 08:15:50 GMT
Server: Jetty(6.1.4)
Location: http://data.linkedmdb.org/data/film/2014
Content-Length: 0
Via: 1.1 data.linkedmdb.org
Content-Type: text/plain
44. Redirections
HTTP Request for http://data.linkedmdb.org/resource/film/2014
GET /resource/film/2014 HTTP/1.1
User-Agent: curl/7.19.6 (i686-pc-linux-gnu) libcurl/7.19.6
Host: data.linkedmdb.org
Accept: text/html
Response:
HTTP/1.1 303 See Other
Date: Thu, 11 Mar 2010 08:15:50 GMT
Server: Jetty(6.1.4)
Location: http://data.linkedmdb.org/page/film/2014
Content-Length: 0
Via: 1.1 data.linkedmdb.org
Content-Type: text/plain
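The two 303 responses above can be summarized as a mapping from the requested media type to a redirect target. The following is a minimal sketch that only reproduces the two observed responses; it is not how the LinkedMDB server is implemented. The function name redirect_location is hypothetical, and falling back to the data document for any media type other than text/html is an assumption.

```python
def redirect_location(path: str, accept: str) -> str:
    """Return the Location header of the 303 response for a /resource/ path.

    Hypothetical helper for illustration only.
    """
    prefix = "/resource/"
    if not path.startswith(prefix):
        raise ValueError("not a resource path: " + path)
    # Simplification: inspect the first media type only and ignore q-values.
    first_type = accept.split(",")[0].split(";")[0].strip().lower()
    # Browsers asking for HTML get the page; everything else gets RDF data
    # (the latter default is an assumption of this sketch).
    target = "page" if first_type == "text/html" else "data"
    return "http://data.linkedmdb.org/" + target + "/" + path[len(prefix):]


print(redirect_location("/resource/film/2014", "application/rdf+xml"))
print(redirect_location("/resource/film/2014", "text/html"))
```

The resource URI names the film itself; the two target URIs name documents about the film, which is why the server redirects instead of answering with 200 OK.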
45. Vocabularies and Ontologies
● Defined using RDFS or OWL
● Plenty of vocabularies exist:
● People
● Social media
● Commerce
● Events
● Radio and TV programmes
● Music
etc.
46. owl:sameAs
http://sws.geonames.org/2635167/
=
http://dbpedia.org/resource/United_Kingdom
=
http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000003e30b
=
http://www4.wiwiss.fu-berlin.de/factbook/resource/United_Kingdom
=
http://www4.wiwiss.fu-berlin.de/eurostat/resource/countries/United_Kingdom
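Because owl:sameAs is symmetric and transitive, a consumer can group URIs into sets that name the same thing. The following is a minimal sketch using a union-find structure; the function name same_as_clusters is hypothetical, and pairing the five URIs above as a chain is an assumption about how the links are stated.

```python
from collections import defaultdict


def same_as_clusters(links):
    """Group URIs connected by owl:sameAs links into equivalence sets."""
    parent = {}

    def find(uri):
        # Follow parent pointers to the representative, halving paths on the way.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for left, right in links:
        parent[find(left)] = find(right)

    clusters = defaultdict(set)
    for uri in list(parent):
        clusters[find(uri)].add(uri)
    return list(clusters.values())


# The sameAs statements, assumed to be given as a chain of pairs.
SAME_AS_LINKS = [
    ("http://sws.geonames.org/2635167/",
     "http://dbpedia.org/resource/United_Kingdom"),
    ("http://dbpedia.org/resource/United_Kingdom",
     "http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000003e30b"),
    ("http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000003e30b",
     "http://www4.wiwiss.fu-berlin.de/factbook/resource/United_Kingdom"),
    ("http://www4.wiwiss.fu-berlin.de/factbook/resource/United_Kingdom",
     "http://www4.wiwiss.fu-berlin.de/eurostat/resource/countries/United_Kingdom"),
]

clusters = same_as_clusters(SAME_AS_LINKS)
print(len(clusters), "cluster with", len(clusters[0]), "URIs")
```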
47. owl:sameAs
[Figure: the example graph extended with an owl:sameAs link, reconstructed here as triples]
<http://data.linkedmdb.org/.../2014>  rdf:type  <http://data.linkedmdb.org/.../film>
<http://data.linkedmdb.org/.../2014>  dc:title  "The Shining"
<http://data.linkedmdb.org/.../2014>  movie:relatedBook  <http://www4.wi … /0743424425>
<http://data.linkedmdb.org/.../2014>  foaf:based_near  <http://sws.geonames.org/2635167/>
<http://sws.geonames.org/2635167/>  gn:population  60943000
<http://sws.geonames.org/2635167/>  rdfs:label  "United Kingdom"
<http://sws.geonames.org/2635167/>  owl:sameAs  <http://dbpedia.org/resource/United_Kingdom>
<http://dbpedia.org/resource/United_Kingdom>  dbp:leader  <http://dbpedia.org/resource/Gordon_Brown>
<http://dbpedia.org/resource/United_Kingdom>  db:callingCode  44
49. Outline
From a Web of Documents
to a Web of Data
Technical Foundations of Linked Data
Consuming Linked Data
Current Research Issues
50. Consuming Linked Data
… by Humans
● Linked Data browsers
● Faceted browsers
● On-the-fly Linked Data Mashups
● Linked Data based applications
51. Linked Data Browsers
● Provide a tabular view on retrieved RDF data
● Some integrate data from multiple sources
● Allow users to follow RDF links
● Multiple options:
● Tabulator
● Disco
● OpenLink Data Explorer
● Zitgist Data Viewer
● Marbles
etc.
54. Linked Data based Applications
[SFSW'09]
55. New Kind of Applications
● Users retain full control over their data
● Users manage and publish data on their own
● All that is needed for the application is a URI
http://researchersmap.informatik.hu-berlin.de/data/dbprofs
…
<http://www.dbis.informatik.hu-berlin.de/ … /freytag.rdf#me>
rdf:type :DBProfessor .
…
56. Users Really Own their Data
http://www.dbis.informatik.hu-berlin.de/ ... /freytag.rdf
…
<http://www.dbis.informatik.hu-berlin.de/ … /freytag.rdf#me>
contact:fullName "Prof. Johann-Christoph Freytag, Ph.D." ;
contact:office [ contact:address
[ contact:street "Rudower Chaussee 25" ;
contact:city "Berlin"^^xsd:string ;
contact:postalCode "12489"^^xsd:string ] ] ;
foaf:topic_interest
<http://dbpedia.org/resource/Query_optimization> ,
<http://dbpedia.org/resource/Privacy> ,
<http://dbpedia.org/resource/Data_quality> ,
<http://dbpedia.org/resource/Data_warehouse> ;
owl:sameAs
<http://dblp.l3s.de/d2r/resource/authors/Johann_Christoph_Freytag> .
…
57. Consuming Linked Data
… in Applications
● Look up URIs and process the retrieved data
● Query with SPARQL
58. Brief Introduction to SPARQL
● Query language for RDF data
● Main idea: pattern matching
● Describe subgraphs of the queried RDF graph
● Subgraphs that match your description yield a result
● Means: graph patterns (i.e. RDF graphs with variables)
Example pattern:
?v  rdf:type  <http://.../Volcano>
59. Brief Introduction to SPARQL
Queried graph:
<http://.../Mount_Baker>  rdf:type  <http://.../Volcano>
<http://.../Mount_Baker>  p:lastEruption  "1880"
<http://.../Mount_Etna>  rdf:type  <http://.../Volcano>
Pattern:
?v  rdf:type  <http://.../Volcano>
Results:
?v
http://.../Mount_Baker
http://.../Mount_Etna
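The pattern matching idea can be sketched for a single triple pattern. This is a minimal sketch in Python, not a SPARQL engine: the function name match is hypothetical, terms are plain strings, and the elided URIs are used exactly as they appear on the slide.

```python
def match(pattern, graph):
    """Return one variable binding (a dict) per triple that matches the pattern."""
    results = []
    for triple in graph:
        binding = {}
        for pattern_term, triple_term in zip(pattern, triple):
            if pattern_term.startswith("?"):
                # A variable matches anything, but must bind consistently.
                if binding.setdefault(pattern_term, triple_term) != triple_term:
                    break
            elif pattern_term != triple_term:
                break
        else:
            # No position failed, so the triple matches.
            results.append(binding)
    return results


GRAPH = [
    ("http://.../Mount_Baker", "rdf:type", "http://.../Volcano"),
    ("http://.../Mount_Baker", "p:lastEruption", "1880"),
    ("http://.../Mount_Etna", "rdf:type", "http://.../Volcano"),
]

solutions = match(("?v", "rdf:type", "http://.../Volcano"), GRAPH)
volcanoes = [solution["?v"] for solution in solutions]
print(volcanoes)
```

The two bindings for ?v correspond to the two result rows on the slide.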
60. Querying Linked Data with SPARQL
● Linked Data sources usually provide a SPARQL service
● Send your query, receive the result
Data Source           Endpoint Address
DBpedia               http://dbpedia.org/sparql
Musicbrainz           http://dbtune.org/musicbrainz/sparql
U.S. Census           http://www.rdfabout.com/sparql
Semantic Crunchbase   http://cb.semsol.org/sparql
More complete list: http://esw.w3.org/topic/SparqlEndpoints
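"Send your query, receive the result" usually means an HTTP request that carries the query text in a parameter named query, as defined by the SPARQL protocol. The following is a minimal sketch that builds such a request without sending it. The function name build_sparql_request and the example query are hypothetical, and the endpoints listed above date from the time of the talk and may no longer respond.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL protocol GET request."""
    # The query string is URL-encoded into the "query" parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"
request = build_sparql_request("http://dbpedia.org/sparql", QUERY)

# Decode the URL again to check that the query survives the round trip.
decoded = parse_qs(urlsplit(request.full_url).query)["query"][0]
print(decoded)
```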
61. Querying Linked Data with SPARQL
Querying a single dataset is quite boring
compared to:
Issuing SPARQL queries over multiple datasets
How can you do this?
● Issue follow-up queries to different endpoints
● Query a central collection of datasets
● Build store with copies of relevant datasets
● (Use query federation system)
● Use a link traversal based query system
62. Querying Linked Data with SPARQL
Traditional approach 1:
data centralization
● Querying a collection of
copies from all relevant
datasets
63. Querying Linked Data with SPARQL
Traditional approach 2:
federated query processing
● Querying a mediator which
distributes subqueries to
relevant sources and
integrates the results
64. Main drawback:
You have to know the relevant
data sources in advance.
You restrict yourself to
the selected sources.
You do not tap the
full potential of
the Web!
65. A novel approach:
Link Traversal Based Query Execution
[ISWC'09]
66. Main Idea
● Intertwine query evaluation with traversal of RDF links
● Alternately:
● Evaluate parts of the query on a
continuously augmented set of data
● Look up URIs in intermediate
solutions and add retrieved data
to the queried data set
Example query (graph pattern from the figure):
<http://.../movie2449>  filmingLocation  ?loc .
?loc  statistics  ?stat .
?stat  unemp_rate  ?ur .
Walk-through shown over the following builds:
Step 1: Look up http://.../movie2449 and add the retrieved data to the queried data.
Step 2: The queried data now contains <http://.../movie2449> filmingLocation <http://geo.../Italy>, which yields the intermediate solution ?loc = http://geo.../Italy.
Step 3: Look up http://geo.../Italy and add the retrieved data to the queried data.
Step 4: The queried data now contains <http://geo.../Italy> statistics <http://stat.db/.../it>, which extends the intermediate solution to ?loc = http://geo.../Italy, ?stat = http://stat.db/.../it.
82. In a Nutshell
● Link traversal based query execution:
● Evaluation on a continuously augmented dataset
● Discovery of potentially relevant data during execution
● Discovery driven by intermediate solutions
● Main advantage:
● No need to know all data sources in advance
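The alternation of evaluation and URI look-up can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the algorithm of the cited paper and not how SQUIN is implemented: it recomputes intermediate solutions naively in rounds, the Web is replaced by an in-memory dictionary, and all example.org URIs, the unemployment value, and the function names are made up for illustration.

```python
# Hypothetical stand-in for the Web: URI -> triples returned by a look-up.
TOY_WEB = {
    "http://example.org/movie2449": {
        ("http://example.org/movie2449", "filmingLocation", "http://example.org/Italy")},
    "http://example.org/Italy": {
        ("http://example.org/Italy", "statistics", "http://example.org/stat/it")},
    "http://example.org/stat/it": {
        ("http://example.org/stat/it", "unemp_rate", "6.7")},  # made-up value
}


def extend(binding, pattern, triple):
    """Return binding extended so that pattern matches triple, else None."""
    new = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            if new.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return new


def evaluate(patterns, data):
    """Evaluate a conjunction of triple patterns over a set of triples."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in data:
                new = extend(binding, pattern, triple)
                if new is not None:
                    extended.append(new)
        solutions = extended
    return solutions


def link_traversal_query(patterns, lookup):
    data, seen = set(), set()
    # Seed with the URIs mentioned in the query itself.
    frontier = {t for p in patterns for t in p if t.startswith("http://")}
    while frontier:
        for uri in frontier:
            seen.add(uri)
            data |= lookup(uri)  # augment the queried data set
        frontier = set()
        # Intermediate solutions: results for each prefix of the query.
        for k in range(1, len(patterns) + 1):
            for solution in evaluate(patterns[:k], data):
                for value in solution.values():
                    if value.startswith("http://") and value not in seen:
                        frontier.add(value)
    return evaluate(patterns, data)


QUERY = [
    ("http://example.org/movie2449", "filmingLocation", "?loc"),
    ("?loc", "statistics", "?stat"),
    ("?stat", "unemp_rate", "?ur"),
]
result = link_traversal_query(QUERY, lambda uri: TOY_WEB.get(uri, set()))
print(result)
```

No data source other than the URI in the query is given up front; the other two documents are discovered through the intermediate solutions.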
83. Real-World Example
Return phone numbers of authors of ontology engineering papers at ESWC'09.
SELECT DISTINCT ?author ?phone WHERE {
  ?pub swc:isPartOf
       <http://data.semanticweb.org/conference/eswc/2009/proceedings> .
  ?pub swc:hasTopic ?topic . ?topic rdfs:label ?topicLabel .
  FILTER regex( str(?topicLabel), "ontology engineering", "i" ) .
  ?pub swrc:author ?author .
  { ?author owl:sameAs ?authorAlt }
  UNION
  { ?authorAlt owl:sameAs ?author }
  ?authorAlt foaf:phone ?phone
}
# of query results: 2
# of retrieved graphs: 297
# of accessed servers: 16
avg. execution time: 1min 30sec
84. Application
● Researchers Map implemented with SQUIN
● Query interface to the whole Web of Data
SELECT DISTINCT ?i ?label
WHERE {
  ?prof rdf:type <http://res ... data/dbprofs#DBProfessor> ;
        foaf:topic_interest ?i .
  OPTIONAL {
    ?i rdfs:label ?label
    FILTER( LANG(?label)="en" || LANG(?label)="")
  }
}
ORDER BY ?label
[Figure labels: SQUIN, SemWeb Client Lib]
86. Application
● Implementation of Researchers Map was very easy due to:
● SQUIN / SemWeb Client Lib
● Approx. 700 LOC JavaScript (incl. 100 for the queries)
● Approx. 50 LOC PHP (Mainly to set up server side proxy
due to same origin policy)
● Convenient access to SQUIN with SQUIN PHP tools
$s = 'http:// …'; // address of the SQUIN service
$q = new SparqlQuerySock( $s, '… SELECT ...' );
$res = $q->getJsonResult(); // or getXmlResult()
● Try it: http://squin.org
87. Consuming Linked Data
… getting started
Issues people have when they want to start:
● Finding URIs
● Finding additional data
● Finding SPARQL endpoints
88. Finding URIs
Problem: What URIs exist that identify
the thing I'm interested in?
Two options:
● Data source specific solutions
● Some Linked Data sources provide a keyword based search
for things in their dataset(s)
● Search Engines for the Web of data
91. Finding URIs
What if there is no search possibility?
You may try a SPARQL query:
SELECT DISTINCT ?s WHERE {
?s rdfs:label ?label .
FILTER regex( str(?label), "Berlin", "i" ) .
}
92. Finding URIs
● Search engines for the Web of data provide keyword-based
search for things in different datasets
● Falcons http://iws.seu.edu.cn/services/falcons/
● Sindice http://sindice.com
● SWSE http://www.swse.org
● Watson http://watson.kmi.open.ac.uk
● They also have APIs
96. Finding Additional Data
Problem: Given a URI, where do I find
more data than what is available
by looking it up?
Three options:
● Follow links (e.g. rdfs:seeAlso, owl:sameAs)
● Use a search engine for the Web of data
● Use a co-reference service
● Co-reference services find different
URIs that refer to the same thing
● They may also provide an API
99. Finding SPARQL Endpoints
Problem: What relevant endpoints exist?
Where is the SPARQL endpoint
for a dataset?
What is the data provided via a
SPARQL endpoint about?
● Look at: http://esw.w3.org/topic/SparqlEndpoints
● Still an open issue
100. Outline
From a Web of Documents
to a Web of Data
Technical Foundations of Linked Data
Consuming Linked Data
Current Research Issues
101. Linked Data Fusion
Applications want an integrated view on
all data that is available about a thing
Requirements:
● Schema mapping: map data into a single schema
● Identity resolution: smush data from all sources
● Conflict resolution: resolve inconsistencies in the data
102. User Interfaces and Interaction
● How do we build interfaces that operate over such
a large amount of data?
● What will be their interaction paradigm?
● How to explain data provenance and data fusion?
103. Provenance, Quality, and Trust
● There are no facts on the Web – everything is a claim
● Increasing amount of research in this area
● W3C provenance incubator group
● Our contributions so far:
● A provenance model for the Web of data [LDOW'09]
● A provenance based Information Quality assessment method
[SWPM'09]
● tSPARQL – a trust aware extension for SPARQL [ESWC'09]
104. Take-away Summary
The traditional Web of documents
evolves into a Web of data.
● Entities are connected by data links
● Data is self-describing
● Anyone can publish data to the Web of data
● Linked Data holds enormous potential: users may
benefit from a virtually unbounded set of data sources
● Learn more about Linked Data:
● “Linked Data – The Story So Far”
by C. Bizer, T. Heath, T. Berners-Lee
● On consuming Linked Data: http://consuminglinkeddata.org
105. These slides have been created by
Olaf Hartig
http://olafhartig.de
Some slides are based on slide sets provided by
● Christian Bizer
● Juan Sequeda
This work is licensed under a
Creative Commons Attribution-Share Alike 3.0 License
(http://creativecommons.org/licenses/by-sa/3.0/)