The Digital Cavemen

of Linked Lascaux
Ruben Verborgh
The Lascaux paintings

are 17,300 years old.
How long will

your records last?
by Banksy
by Moyan Brenn
SUSTAINABILITY
Lack of a long-term plan for SUSTAINABILITY = a threat to the Semantic Web
SUSTAINABILITY = making promises you can keep
SUSTAINABILITY = a dialog becoming a contract
SUSTAINABILITY = remaining constant under change
How can we promise
to remain constant
in a changing world?
Changes
Constants
Promises
The Digital Cavemen

of Linked Lascaux
Changes
Data models
Technology
Interfaces
The oldest data model

is a simple table.
[Figure: a table with a header, rows, and columns]
van Hooland, S. and Verborgh, R.

“Linked Data for Libraries, Archives and Museums” (Facet, 2014)
Tables do not cope well

with changes in data or schema.
Title | Artist | Born | Died
The Thrill is Gone | B. B. King | 1925 | 2015
Riding with the King | John Hiatt | 1952 |
Riding with the King | B. B. King | 1925 |
… | … | … | …
Relational databases provide

a multi-dimensional table model.
[Figure: relational tables (entities) with a header, rows, attributes, a key column, and a relation between tables]
van Hooland, S. and Verborgh, R.

“Linked Data for Libraries, Archives and Museums” (Facet, 2014)
Databases cope with data changes

but schema changes are harder.
Title | Artist
The Thrill is Gone | 1
Riding with the King | 2
Riding with the King | 1
… | …

ID | Name | Born | Died
1 | B. B. King | 1925 | 2015
2 | John Hiatt | 1952 |
… | … | … | …
There is no interoperability

with other databases.
Title | Artist
The Thrill is Gone | 1
Riding with the King | 2
Riding with the King | 1
… | …

Wikipedia: ?
XML allows reuse of schemas

and identifiers.
[Figure: an XML tree with root, parent, child, and sibling elements]
van Hooland, S. and Verborgh, R.

“Linked Data for Libraries, Archives and Museums” (Facet, 2014)
XML schema evolution

remains a tough nut to crack.
Tabular data: Each data item is structured as a line of field values. Fields are the same for all items; a header line can indicate their name.
Relational model: Data are structured as tables, each of which has its own set of attributes. Records in one table can relate to others by referencing their key column.
Meta-markup languages: XML documents have a hierarchical structure, which gives them a tree-like appearance. Each element can …
RDF: Each fact about a data item is expressed as a triple, which connects a subject to an object through a precise relationship.
[Figure labels: root, parent, child, siblings (XML tree); subject, property, object (RDF triple)]
The RDF data model is flexible

for changes in data and schema.
[Figure: an RDF triple connecting a subject to an object through a property]
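The flexibility claimed for RDF can be illustrated with a minimal sketch that stores facts as plain (subject, predicate, object) tuples. The `ex:` identifiers and the `ex:instrument` property are made up for this illustration; the point is only that a new value and a new kind of property are added in exactly the same way, without touching a schema.

```python
# Each fact is a (subject, predicate, object) triple; there is no fixed schema.
# All identifiers below are illustrative, not taken from a real dataset.
triples = {
    ("ex:TheThrillIsGone", "ex:artist", "ex:BBKing"),
    ("ex:RidingWithTheKing", "ex:artist", "ex:JohnHiatt"),
    ("ex:RidingWithTheKing", "ex:artist", "ex:BBKing"),
    ("ex:BBKing", "ex:born", "1925"),
    ("ex:JohnHiatt", "ex:born", "1952"),
}

# A data change and a "schema" change are the same operation: add a triple.
triples.add(("ex:BBKing", "ex:died", "2015"))          # new value
triples.add(("ex:BBKing", "ex:instrument", "guitar"))  # new kind of property


def objects(subject, predicate):
    """Return all objects stated for a subject and predicate, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


print(objects("ex:RidingWithTheKing", "ex:artist"))
print(objects("ex:BBKing", "ex:instrument"))
```

Note that the artist's dates are stated once, about the artist, so the table's problem of one row being updated and another forgotten does not arise in this form.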
van Hooland, S. and Verborgh, R.

“Linked Data for Libraries, Archives and Museums” (Facet, 2014)
RDF involves a trade-off

between flexibility and reuse.
custom

ontology
reuse

ontologies
perfect

match
perfect

interoperability
So much for change within models…

what about change between them?
[Figure: the four data models side by side: tabular data, relational model, meta-markup languages, RDF]
There’s no ultimate model.

They co-exist. Change is inherent.
Changes
Data models
Technology
Interfaces
Even if your data doesn’t change,

technology does.
What happens to your data?
new software versions
new software manufacturers
Is your software

holding your data hostage?
Is your software the owner of your data?
Intentional or unintentional vendor lock-in?
Or are you?
Can you get your data out at any moment you want?
The Cooper-Hewitt Design Museum
had trouble getting their own data.
Data in The Museum System
flexible, but complex relational design
no export button
Website had more flexible demands
complex manual queries to liberate data
parallel CMS to drive website
Changes
Data models
Technology
Interfaces
The Web has been designed

with change in mind.
Individual links are allowed to break

so the entire Web does not.
—Tim Berners-Lee
The Web is in rapid evolution

but keeps on working.
What year is it? Then your users need…
1995 – HTML 2.0
2000 – XML
2008 – JSON
2012 – HTML 5
2015 – RDF ?
2017 – … ?
At least HTML seems constant,

so the human Web is safe.
http://bib.org/books/978-1-85604-964-1/
around 2005: made in HTML 4
around 2015: made in HTML 5
Markup changes, the identifier does not.
Tim Berners-Lee called these “Cool URIs”.
Web APIs for machines suffer

from changes on many levels.
http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
How does this identifier cope with change?
How long does this identifier work unchanged?
dependency on server technology
dependency on API version
dependency on representation
dependency on API key
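The four dependencies listed above can be made visible by taking the example URL apart. The sketch below uses Python's standard `urllib.parse`; the heuristics (a path segment like `v2` means an API version, a file extension means a server technology) are assumptions made for this illustration, not a general rule.

```python
from urllib.parse import urlsplit, parse_qs

API_URL = ("http://api.bib.org/v2/viewBookDetails.php"
           "?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP")


def volatile_parts(url):
    """List the parts of a URL that tie it to a technology choice.

    The detection rules are illustrative heuristics only.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    found = {}
    for segment in (s for s in parts.path.split("/") if s):
        if segment.startswith("v") and segment[1:].isdigit():
            found["API version"] = segment
        if "." in segment:
            found["server technology"] = segment.rsplit(".", 1)[1]
    if "format" in query:
        found["representation"] = query["format"][0]
    if "apikey" in query:
        found["API key"] = query["apikey"][0]
    return found


print(volatile_parts(API_URL))
print(volatile_parts("http://bib.org/books/978-1-85604-964-1/"))
```

The second call returns an empty dictionary: the book URI from the earlier slide carries nothing that has to change when the server, the API version, the format, or the key changes.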
Plenty of excuses exist

to change machine interfaces.
But our new server does it faster!
But our new API has different features!
But XML is obsolete now so we need JSON!
Even funnier are the excuses

for requiring API keys.
But we need to rate limit!
But we need to track automated access!
But we need to protect our data!
Once and for all:

API keys do not help with these.
Your HTML interface is still open!
JSON is a convenience, not a necessity.
Anybody can still do whatever they want

by scraping HTML pages with the same data.
Protect your data, not just one interface.
Yet other possible changes

still appear to be a concern.
Remain constant if your server changes?
Remain constant if your API changes?
Remain constant if data models change?
Changes
Constants
Promises
The Digital Cavemen

of Linked Lascaux
Constants
URIs
Ontologies
Resources
The RDF model is driven

by unique identifiers.
[Figure: a triple with subject (S), predicate (P), and object (O)]
Constants allow clients

to establish a shared meaning.
S: http://bib.org/books/978-1-85604-964-1/
P: http://purl.org/dc/terms/creator
O: http://bib.org/authors/7356/
Human semantics are in concepts

and their meaning to the world.
S: a book
P: written by
O: a person
Machine semantics are in symbols

and their structural interrelations.
S: http://digybe.wpq/dgjyj-dgu7945
P: http://yudgy.jdu/DHH8DHBtkixhj
O: http://aole.wqq/mobd1.tihz
We need to be very careful

about our choice of symbols.
S: http://bib.org/books/978-1-85604-964-1/
P: http://purl.org/dc/terms/creator
O: http://bib.org/authors/7356/
We need to be very careful

about our choice of symbols.
http://bib.org/books/978-1-85604-964-1/
http://bib.org/authors/7356/
Is this a book

or a description of a book?
:printDate "2014-06-11"
:lastModified "2015-11-25"
Is this a person

or a document?
:birthDate "1987-02-28"
:size "17kB"
Although designed for machines,

the example only works for humans.
S: http://bib.org/books/978-1-85604-964-1/
P: http://purl.org/dc/terms/creator
O: http://bib.org/authors/7356/
Because, somehow, Web APIs

make machine access different.
S: http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
P: http://purl.org/dc/terms/creator
O: http://api.bib.org/v2/viewAuthorProfile.php?id=7356&format=json&apikey=WSDGU56VP
That’s why it’s a problem if

machines need different identifiers.
S: http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
P: http://purl.org/dc/terms/creator
O: http://api.bib.org/v2/viewAuthorProfile.php?id=7356&format=json&apikey=WSDGU56VP
Only this triple is a global constant.

The other is volatile and local.
S: http://bib.org/books/978-1-85604-964-1/
P: http://purl.org/dc/terms/creator
O: http://bib.org/authors/7356/
Constants
URIs
Ontologies
Resources
Fortunately, we don’t have to

pick all the constants ourselves.
Ontologies provide identifiers of concepts

that are designed to be reused.
They are necessary to make RDF work.
They are necessary to create queries,

especially over multiple datasources.
Of course, we get the benefits

only if we actually reuse.
Why have our own my:writtenBy property

when dc:creator already exists?
Maybe we have a more specific meaning?
We can still relate both properties with RDF.
But if we all use derivatives of the constants,

what is the value of these constants?
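Relating a custom property to a shared one, as mentioned above, is usually done with `rdfs:subPropertyOf`. The sketch below is a minimal, hand-rolled illustration of that one entailment rule over plain tuples; it is not a full RDFS reasoner, and the `books:` and `authors:` shorthands are assumed abbreviations of the bib.org URIs used earlier.

```python
SUBPROPERTY_OF = "rdfs:subPropertyOf"

triples = {
    # my:writtenBy is the hypothetical, more specific property from the slide.
    ("my:writtenBy", SUBPROPERTY_OF, "dc:creator"),
    ("books:978-1-85604-964-1", "my:writtenBy", "authors:7356"),
}


def with_superproperties(facts):
    """Add the triples entailed by rdfs:subPropertyOf, until nothing changes."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        supers = {(s, o) for s, p, o in inferred if p == SUBPROPERTY_OF}
        for s, p, o in list(inferred):
            for sub, sup in supers:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


closure = with_superproperties(triples)
print(("books:978-1-85604-964-1", "dc:creator", "authors:7356") in closure)
```

A client that only knows `dc:creator` can then still find the author, but only if it applies this inference, which is part of the cost of using derivatives instead of the shared constant.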
Authors are not always in control:

external semantic drift happens.
foaf:knows was bidirectional…
spec: “some level of reciprocity”
An foaf:knows Pete → Pete foaf:knows An
…until somebody modeled Twitter followers
Pete follows Angela Merkel → Pete knows Angela
Yet Angela doesn’t know Pete…
Getting close to Derrida…
but we’re not philosophers.
There are only two hard things

in Computer Science:

cache invalidation and naming things.
—Phil Karlton
Constants
URIs
Ontologies
Resources
The constants you can touch

are the constants you can trust.
No matter how hard technology changes,

the books we describe remain the same.
Any mechanism of identification

should be based on domain resources,

not on inevitably changing technology.
The “success” story

of the Web API community.
[Chart: number of indexed Web APIs in ProgrammableWeb, 2005 to 2015; plotted values: 186; 1,263; 2,418; 5,018; 7,182; 10,302; 12,559. The source caption is truncated in extraction; it refers to more than 12.000 different micro-protocols and notes that each different API currently requires a different client.]
Just imagine we had

15,000 different data models.
Find resources in your domain

and assign them an identifier.
http://bib.org/books/978-1-85604-964-1/
http://bib.org/authors/7356/
It’s just like building a web site.

When a user comes, serve HTML.
http://bib.org/books/978-1-85604-964-1/
U
GET
HTML
It’s just like building a web site.

When a client comes, serve JSON.
http://bib.org/books/978-1-85604-964-1/
C
GET
JSON
It’s just like building a web site.

When a client comes, serve RDF.
http://bib.org/books/978-1-85604-964-1/
C
GET
RDF
Content negotiation has existed

for a long time in HTTP.
http://bib.org/books/978-1-85604-964-1/
C
GET
RDF
Resource
Representation
This allows constant URIs

even with future changes.
http://bib.org/books/978-1-85604-964-1/
C
GET
RDF 2.0
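Content negotiation as described on these slides can be sketched as a server-side function that reads the client's `Accept` header and picks a representation for the same URI. This is a simplified illustration: it handles quality values and the `*/*` wildcard but ignores partial wildcards such as `text/*`, and the list of available media types is an assumption.

```python
def negotiate(accept_header, available):
    """Pick the available media type the client prefers most, or None."""
    preferences = []
    for item in accept_header.split(","):
        media_type, *params = [part.strip() for part in item.split(";")]
        quality = 1.0
        for param in params:
            if param.startswith("q="):
                quality = float(param[2:])
        preferences.append((quality, media_type))
    # Highest quality first; sorted() is stable, so header order breaks ties.
    for quality, media_type in sorted(preferences, key=lambda p: -p[0]):
        if quality <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return available[0]
    return None


# One constant URI, several representations of the same resource.
AVAILABLE = ["text/html", "application/json", "text/turtle"]

print(negotiate("text/html,application/xhtml+xml;q=0.9", AVAILABLE))
print(negotiate("text/turtle;q=1.0, application/json;q=0.5", AVAILABLE))
print(negotiate("application/rdf+xml, */*;q=0.1", AVAILABLE))
```

Supporting a future format then means adding one entry to the list of available representations; the URI that clients already hold does not change.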
It enables different users and

machines to talk about things.
http://bib.org/books/978-1-85604-964-1/
C
U
C
The best API is no API.
Your website is already an API.
Developers like to build complicated APIs.
API keys are especially cool to build.
Every feature and change comes with a high cost.
If you ask for an API, you’ll get one.
Ask for new representations

of your resources instead.
Changes
Constants
Promises
The Digital Cavemen

of Linked Lascaux
Promises
Web Data
Integration
Scalability
The Semantic Web promised

data on the Web.
85,567,007,302 triples from 3,426 datasets
LODStats
38,606,408,765 from 657,896 entries
LOD Laundromat
How much of this data

can we readily access?
data dumps
Linked Data documents
SPARQL endpoints
A data dump means downloading
everything and querying locally.
When was the last time

you downloaded the full Wikipedia

just because you had one question?
Dumps are not Web querying.

It’s kind of like giving up.
Semantic Web Semantic Basement?
What advantage do we have

compared to Big Data?
Still the RDF data model…
But the major difference is Web.
Linked Data documents

allow you to traverse a dataset.
That’s similar to what we also do:

consume information on Wikipedia

by following links.
Much Linked Data is available

using the well-known principles.
Servers publish a light-weight interface.
Clients follow their nose

to retrieve information.
Linked Data documents allow

query evaluation on the Web.
# Other books by the same author
SELECT DISTINCT ?book WHERE {
  books:85604 dc:creator ?author.
  ?book dc:creator ?author.
}
Some queries are hard

or impossible to evaluate.
# Books about Hamburg
SELECT DISTINCT ?book ?author WHERE {
  ?book dc:subject dbpedia:Hamburg.
  ?book dc:creator ?author.
}
SPARQL endpoints allow you

to ask any question you want.
When was the last time

you expected Wikipedia to answer

specific questions automatically for you?
A public SPARQL endpoint

happily answers this query.
# Other books by the same author
SELECT DISTINCT ?book WHERE {
  books:85604 dc:creator ?author.
  ?book dc:creator ?author.
}
A public SPARQL endpoint also

happily answers this query.
# Books about Hamburg
SELECT DISTINCT ?book ?author WHERE {
  ?book dc:subject dbpedia:Hamburg.
  ?book dc:creator ?author.
}
A public SPARQL endpoint also

happily answers this query…
SELECT DISTINCT ?drug ?drug1 ?drug2 ?drug3 ?drug4 ?d1 WHERE {
?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antibiotics> .
?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antiviralAgents> .
?drug3 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antihypertensiveAgents> .
?drug4 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/anti-bacterialAgents> .
?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o1 .
?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g1 .
?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l1 .
?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw1 .
?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/hprdId> ?hp1 .
?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/swissprotName> ?sn1 .
?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/proteinSequence> ?ps1 .
?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/generalReference> ?gr1 .
?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o1 .
?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o2 .
?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g2 .
?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l2 .
?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw2 .
There’s a price to pay for being

the most expressive HTTP interface.
The majority of public SPARQL endpoints

have less than 95% uptime.
This means we cannot query them

for more than 1.5 days each month.
This means we cannot rely on them

to build Linked Data applications.
Buil-Aranda – Hogan – Umbrich – Vandenbussche

SPARQL Web-Querying Infrastructure: Ready for Action?
Promises
Web Data
Integration
Scalability
The main promise of Linked Data

is integration, preserving semantics.
[Figure: an RDF triple connecting a subject to an object through a property]
Integration is the promise.

But does it work on the Web?
data dumps
Linked Data documents
SPARQL endpoints
With data dumps, we just

build a bigger basement.
How far do we go?
How do we keep data up to date?
With Linked Data documents,

we keep on following our nose.
There are no dataset boundaries.
Some queries will remain hard.
With public SPARQL endpoints,

problems become worse.
1 endpoint has 95% availability.
1.5 days down each month
2 endpoints have 90% availability.
3 days down each month
3 endpoints have 85% availability.
4.5 days down each month
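The arithmetic behind these figures can be checked with a short calculation. The sketch below assumes that endpoints fail independently and that a month has 30 days; both are simplifying assumptions. Under independence the exact combined availabilities are 90.25% and about 85.74%, which the slide rounds to 90% and 85%.

```python
def combined_availability(per_endpoint, count):
    """Availability of a query that needs `count` independent endpoints at once."""
    return per_endpoint ** count


def days_down(availability, days_in_month=30):
    """Expected unavailable days in a month of the given length."""
    return (1 - availability) * days_in_month


for n in (1, 2, 3):
    a = combined_availability(0.95, n)
    print(f"{n} endpoint(s): {a:.2%} available, {days_down(a):.2f} days down")
```

Every extra endpoint a federated query depends on multiplies in another factor below one, so availability can only get worse as more sources are integrated.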
Promises
Web Data
Integration
Scalability
Can we think differently

about Linked Data on the Web?
data dump ← Linked Data documents → SPARQL endpoint
low server cost ←→ high server cost
high client cost ←→ low client cost
high availability ←→ low availability
high bandwidth ←→ low bandwidth
out-of-date data ←→ live data
Can we think differently

about Linked Data on the Web?
data dump ← ? → Linked Data documents ← ? → SPARQL endpoint
Let us combine the lessons on

changes, constants, and promises.
An interface that withstands change:
simple enough so it doesn’t break,
complex enough to query.
Let us combine the lessons on

changes, constants, and promises.
Data dumps contain too much.
SPARQL endpoint results are too specific.
Linked Data documents are unidirectional.
Each interface divides a dataset
into Linked Data Fragments.
Data dumps: 1 huge fragment
SPARQL endpoints: ∞ specific fragments
Linked Data: 1 fragment per subject
Can we find a new interface

with a sustainable balance?
Triple Pattern Fragments:

1 fragment per subject / predicate / object
Browse a dataset by triple pattern—

no less, no more.
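What a server has to do for such an interface can be sketched in a few lines: select the triples that match one pattern and return them a page at a time, together with a count. The in-memory list, the extra book and author identifiers, and the paging parameters are assumptions made for illustration; a real server would also describe its form controls so that clients can discover how to ask.

```python
# Illustrative data; only books:85604 and the dc:/dbpedia: terms appear on the slides.
TRIPLES = [
    ("books:85604", "dc:creator", "authors:7356"),
    ("books:12345", "dc:creator", "authors:7356"),
    ("books:12345", "dc:subject", "dbpedia:Hamburg"),
    ("books:67890", "dc:creator", "authors:1111"),
]


def fragment(subject=None, predicate=None, obj=None, page=1, page_size=100):
    """Return one page of triples matching the pattern, plus the total count.

    None acts as a wildcard, like a variable in a triple pattern.
    """
    pattern = (subject, predicate, obj)
    matches = [t for t in TRIPLES
               if all(want is None or want == got
                      for want, got in zip(pattern, t))]
    start = (page - 1) * page_size
    return {"triples": matches[start:start + page_size],
            "total": len(matches)}


print(fragment(predicate="dc:creator", obj="authors:7356"))
print(fragment(subject="books:12345")["total"])
```

Because every request is a simple lookup with a predictable cost, the server's workload stays bounded no matter how complex the client's overall query is.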
Machines can access

the exact same interface as RDF.
Triple Pattern Fragments extend

Linked Data documents with forms.
That’s even more similar to what we do:

consume information on Wikipedia

by following links and using forms.
Machines solve complex queries

by breaking them down.
# Other books by the same author
SELECT DISTINCT ?book WHERE {
  books:85604 dc:creator ?author.
  ?book dc:creator ?author.
}
Machines solve complex queries

by breaking them down.
# Books about Hamburg
SELECT DISTINCT ?book ?author WHERE {
  ?book dc:subject dbpedia:Hamburg.
  ?book dc:creator ?author.
}
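Breaking a query down can be sketched as a client that answers the two-pattern "same author" query using only single-pattern requests. The `match` function stands in for an HTTP request to a triple pattern interface, and the data is invented for the example; real clients also use the counts returned by the server to decide which pattern to start with.

```python
# Illustrative data; only books:85604 and dc:creator appear on the slides.
TRIPLES = [
    ("books:85604", "dc:creator", "authors:7356"),
    ("books:12345", "dc:creator", "authors:7356"),
    ("books:67890", "dc:creator", "authors:1111"),
]


def match(pattern):
    """Server side: answer a single triple pattern (None is a wildcard)."""
    return [t for t in TRIPLES
            if all(want is None or want == got
                   for want, got in zip(pattern, t))]


def books_by_same_author(book):
    """Client side: answer the two-pattern query with single-pattern requests."""
    results = set()
    # Pattern 1: <book> dc:creator ?author
    for _, _, author in match((book, "dc:creator", None)):
        # Pattern 2 with ?author bound: ?book dc:creator <author>
        for other, _, _ in match((None, "dc:creator", author)):
            results.add(other)
    return sorted(results)


print(books_by_same_author("books:85604"))
```

Like the SPARQL query on the slide, this returns the starting book as well, since nothing filters it out. The join happens on the client, which is why the server can stay simple and why more requests are needed per query.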
Promises can be kept, because

the interface is intelligently light.
Publishing Linked Data

that can be queried on the Web

is realistic because the workload is divided.
The server doesn’t even need a triplestore.
Since the client is in charge,

querying multiple sources is easy.
Promises are negotiated contracts
so they always involve trade-offs.
Querying will be slower.
clients send many requests to answer a query
Query times are more consistent.
0.3 seconds with a SPARQL endpoint… 95% of the time
3 seconds with Triple Pattern Fragments… 99.9% of the time
Experiment with more complex interfaces.
Make your Linked Data

queryable on the Web.
Several open-source implementations:

linkeddatafragments.org/software/
Query one or multiple sources online:

client.linkeddatafragments.org
Example: bit.ly/harvard-hamburg
Changes
Constants
Promises
The Digital Cavemen

of Linked Lascaux
Identify the constants,

separate them from changes.
Satisfy Linked Data needs

with promises you can keep.
Simple enough

to be usable,
complex enough

to be useful.
Sustainability means

promising the simplest

useful complexity.
@RubenVerborgh

ruben.verborgh.org

6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
 
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLLucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
 
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
APNIC Policy Roundup, presented by Sunny Chendi at the 5th ICANN APAC-TWNIC E...
 
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Bilaspur Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
 
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrStory Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
Story Board.pptxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr
 
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency""Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
 
VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...
VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...
VIP Model Call Girls NIBM ( Pune ) Call ON 8005736733 Starting From 5K to 25K...
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
 
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
20240509 QFM015 Engineering Leadership Reading List April 2024.pdf
 
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
best call girls in Hyderabad Finest Escorts Service 📞 9352988975 📞 Available ...
 
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls DubaiDubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
Dubai=Desi Dubai Call Girls O525547819 Outdoor Call Girls Dubai
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
 
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
 

The Digital Cavemen of Linked Lascaux

  • 1. The Digital Cavemen
 of Linked Lascaux Ruben Verborgh
  • 4. The Lascaux paintings
 are 17,300 years old. How long will
 your records last?
  • 8. Lack of a long-term plan for SUSTAINABILITY = a threat to the Semantic Web
  • 12. How can we promise to remain constant in a changing world?
  • 17. The oldest data model
 is a simple table.
 [Figure: a table with its header, row, and column labeled.]
 van Hooland, S. and Verborgh, R.
 “Linked Data for Libraries, Archives and Museums” (Facet, 2014)
  • 18. Tables do not cope well
 with changes in data or schema.
 Title                 Artist      Born  Died
 The Thrill is Gone    B. B. King  1925  2015
 Riding with the King  John Hiatt  1952
 Riding with the King  B. B. King  1925
 …                     …           …     …
  • 19. Relational databases provide
 a multi-dimensional table model.
 [Figure: relational tables, with labels for table/entity, header, row, key column, attributes, and relation.]
 van Hooland, S. and Verborgh, R.
 “Linked Data for Libraries, Archives and Museums” (Facet, 2014)
  • 20. Databases cope with data changes
 but schema changes are harder.
 Title                 Artist
 The Thrill is Gone    1
 Riding with the King  2
 Riding with the King  1
 …                     …

 ID  Name        Born  Died
 1   B. B. King  1925  2015
 2   John Hiatt  1952
 …   …           …     …
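As a rough illustration of this slide, the sketch below loads the two tables into an in-memory SQLite database. The table names `recordings` and `artists` and the extra `instrument` column are assumptions made for the example; they do not appear in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT, born INTEGER, died INTEGER);
    CREATE TABLE recordings (title TEXT, artist INTEGER REFERENCES artists(id));
    INSERT INTO artists VALUES (1, 'B. B. King', 1925, NULL), (2, 'John Hiatt', 1952, NULL);
    INSERT INTO recordings VALUES
        ('The Thrill is Gone', 1), ('Riding with the King', 2), ('Riding with the King', 1);
""")

# A data change touches one row, no matter how many recordings refer to the artist.
conn.execute("UPDATE artists SET died = 2015 WHERE id = 1")

rows = conn.execute("""
    SELECT r.title, a.name, a.died
    FROM recordings r JOIN artists a ON a.id = r.artist
    ORDER BY r.title, a.name
""").fetchall()

# A schema change alters the structure itself: every query and application
# written against the old structure has to be checked afterwards.
conn.execute("ALTER TABLE artists ADD COLUMN instrument TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(artists)")]

for row in rows:
    print(row)
print(columns)
```

The single `UPDATE` is what the flat table of slide 18 cannot offer: there, the year of death would have to be repeated on every row that mentions the artist.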
  • 21. There is no interoperability
 with other databases.
 Title                 Artist
 The Thrill is Gone    1
 Riding with the King  2
 Riding with the King  1
 …                     …
 [Figure: a question mark between the table and Wikipedia.]
  • 22. XML allows reuse of schemas
 and identifiers.
 [Figure: an XML tree with root, parent, child, and siblings labeled.]
 van Hooland, S. and Verborgh, R.
 “Linked Data for Libraries, Archives and Museums” (Facet, 2014)
  • 23. XML schema evolution
 remains a tough nut to crack.
 [Figure: four data models side by side, ending in a question mark.]
 Tabular data: Each data item is structured as a line of field values. Fields are the same for all items; a header line can indicate their name.
 Relational model: Data are structured as tables, each of which has its own set of attributes. Records in one table can relate to others by referencing their key column.
 Meta-markup languages: XML documents have a hierarchical structure, which gives them a tree-like appearance. Each element can …
 RDF: Each fact about a data item is expressed as a triple, which connects a subject to an object through a precise relationship.
  • 24. The RDF data model is flexible
 for changes in data and schema.
 [Figure: an RDF triple, in which a property connects a subject to an object.]
 van Hooland, S. and Verborgh, R.
 “Linked Data for Libraries, Archives and Museums” (Facet, 2014)
  • 25. RDF involves a trade-off
 between flexibility and reuse.
 custom ontology: perfect match
 reuse ontologies: perfect interoperability
  • 26. So much for change within models…
 what about change between them?
 [Figure: the four data models side by side: tabular data, relational model, meta-markup languages, and RDF.]
  • 27. There’s no ultimate model.
 They co-exist. Change is inherent.
 [Figure: the same four data models side by side.]
  • 29. Even if your data doesn’t change,
 technology does. What happens to your data?
   new software versions
   new software manufacturers
  • 30. Is your software
 holding your data hostage?
 Is your software the owner of your data?
   Intentional or unintentional vendor lock-in?
 Or are you?
   Can you get your data out at any moment you want?
  • 31. The Cooper-Hewitt Design Museum had trouble getting their own data.
 Data in The Museum System:
   flexible, but complex relational design
   no export button
 Website had more flexible demands:
   complex manual queries to liberate data
   parallel CMS to drive website
  • 33. The Web has been designed
 with change in mind. Individual links are allowed to break
 so the entire Web does not. —Tim Berners-Lee
  • 34. The Web is in rapid evolution
 but keeps on working.
 What year is it?  Then your users need…
 1995              HTML 2.0
 2000              XML
 2008              JSON
 2012              HTML 5
 2015              RDF ?
 2017              … ?
  • 35. At least HTML seems constant,
 so the human Web is safe.
 http://bib.org/books/978-1-85604-964-1/
   around 2005: made in HTML 4
   around 2015: made in HTML 5
 Markup changes, the identifier does not.
 Tim Berners-Lee called these “Cool URIs”.
  • 36. Web APIs for machines suffer
 from changes on many levels.
 http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
 How does this identifier cope with change?
 How long does this identifier work unchanged?
  • 37. Web APIs for machines suffer
 from changes on many levels.
 http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
   dependency on server technology
   dependency on API version
   dependency on representation
   dependency on API key
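The four dependencies can be made visible by taking the API URL apart. The sketch below is only an illustration: the function name `volatile_parts` and its heuristics (a path segment such as `v2` counts as a version, a file extension as server technology) are assumptions for this example, not a general-purpose detector.

```python
from urllib.parse import urlsplit, parse_qs

api_url = ("http://api.bib.org/v2/viewBookDetails.php"
           "?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP")
cool_uri = "http://bib.org/books/978-1-85604-964-1/"

def volatile_parts(url):
    """Return the parts of a URL that depend on technology rather than on the resource."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    found = {}
    for segment in parts.path.split("/"):
        # Hypothetical heuristic: "v" followed by digits is an API version.
        if segment.startswith("v") and segment[1:].isdigit():
            found["API version"] = segment
        # Hypothetical heuristic: a file extension reveals the server technology.
        if "." in segment:
            found["server technology"] = segment.rsplit(".", 1)[1]
    if "format" in query:
        found["representation"] = query["format"][0]
    if "apikey" in query:
        found["API key"] = query["apikey"][0]
    return found

print(volatile_parts(api_url))
print(volatile_parts(cool_uri))
```

Under these heuristics the API URL yields all four dependencies, while the book URI from slide 35 yields none: nothing in it has to change when the server, the API, the format, or the key changes.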
  • 38. Plenty of excuses exist
 to change machine interfaces.
   But our new server does it faster!
   But our new API has different features!
   But XML is obsolete now so we need JSON!
  • 39. Even funnier are the excuses
 for requiring API keys.
   But we need to rate limit!
   But we need to track automated access!
   But we need to protect our data!
  • 40. Once and for all:
 API keys do not help with these.
   But we need to rate limit!
   But we need to track automated access!
   But we need to protect our data!
  • 41. Once and for all:
 API keys do not help with these.
 Your HTML interface is still open!
 JSON is a convenience, not a necessity.
 Anybody can still do whatever they want
 by scraping HTML pages with the same data.
 Protect your data, not just one interface.
  • 42. Yet other possible changes
 still appear to be a concern.
   Remain constant if your server changes?
   Remain constant if your API changes?
   Remain constant if data models change?
  • 46. The RDF model is driven
 by unique identifiers.
 [Figure: a triple in which a predicate (P) links a subject (S) to an object (O).]
  • 47. Constants allow clients
 to establish a shared meaning.
 S: http://bib.org/books/978-1-85604-964-1/
 P: http://purl.org/dc/terms/creator
 O: http://bib.org/authors/7356/
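A minimal sketch of this triple in code, assuming nothing beyond the three identifiers on the slide. The `Triple` type and the `to_ntriples` helper are names invented for this example; the helper only handles the case where all three terms are IRIs.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str

book_by_author = Triple(
    subject="http://bib.org/books/978-1-85604-964-1/",
    predicate="http://purl.org/dc/terms/creator",
    object="http://bib.org/authors/7356/",
)

def to_ntriples(triple):
    """Serialize a triple whose three terms are all IRIs as one N-Triples line."""
    return "<{}> <{}> <{}> .".format(*triple)

print(to_ntriples(book_by_author))
```

Two clients that never exchanged a schema can still agree on what this line says, as long as all three identifiers stay constant.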
  • 48. Human semantics are in concepts
 and their meaning to the world.
 S: a book
 P: written by
 O: a person
  • 49. Machine semantics are in symbols
 and their structural interrelations.
 S: http://digybe.wpq/dgjyj-dgu7945
 P: http://yudgy.jdu/DHH8DHBtkixhj
 O: http://aole.wqq/mobd1.tihz
  • 50. We need to be very careful
 about our choice of symbols.
 S: http://bib.org/books/978-1-85604-964-1/
 P: http://purl.org/dc/terms/creator
 O: http://bib.org/authors/7356/
  • 51. We need to be very careful
 about our choice of symbols.
 http://bib.org/books/978-1-85604-964-1/
   Is this a book or a description of a book?
   :printDate "2014-06-11"
   :lastModified "2015-11-25"
 http://bib.org/authors/7356/
   Is this a person or a document?
   :birthDate "1987-02-28"
   :size "17kB"
  • 52. Although designed for machines,
 the example only works for humans.
 S: http://bib.org/books/978-1-85604-964-1/
 P: http://purl.org/dc/terms/creator
 O: http://bib.org/authors/7356/
  • 53. Because, somehow, Web APIs
 make machine access different.
 S: http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
 P: http://purl.org/dc/terms/creator
 O: http://api.bib.org/v2/viewAuthorProfile.php?id=7356&format=json&apikey=WSDGU56VP
  • 54. That’s why it’s a problem if
 machines need different identifiers.
 S: http://api.bib.org/v2/viewBookDetails.php?id=978-1-85604-964-1&format=json&apikey=WSDGU56VP
 P: http://purl.org/dc/terms/creator
 O: http://api.bib.org/v2/viewAuthorProfile.php?id=7356&format=json&apikey=WSDGU56VP
  • 55. Only this triple is a global constant.
 The other is volatile and local.
 S: http://bib.org/books/978-1-85604-964-1/
 P: http://purl.org/dc/terms/creator
 O: http://bib.org/authors/7356/
  • 57. Fortunately, we don’t have to
 pick all the constants ourselves.
 Ontologies provide identifiers of concepts
 that are designed to be reused.
 They are necessary to make RDF work.
 They are necessary to create queries,
 especially over multiple data sources.
  • 58. Of course, we get the benefits
 only if we actually reuse.
 Why have our own my:writtenBy property
 when dc:creator already exists?
 Maybe we have a more specific meaning?
 We can still relate both properties with RDF.
 But if we all use derivatives of the constants,
 what is the value of these constants?
  • 59. Authors are not always in control:
 external semantic drift happens.
 foaf:knows was bidirectional…
   spec: “some level of reciprocity”
   An foaf:knows Pete
   Peter foaf:knows An
 …until somebody modeled Twitter followers
   Pete follows Angela Merkel
   Pete knows Angela
   Yet Angela doesn’t know Pete…
  • 60. Getting close to Derrida… but we’re not philosophers. There are only two hard things
 in Computer Science:
 cache invalidation and naming things. —Phil Karlton
  • 62. The constants you can touch
 are the constants you can trust.
 No matter how much technology changes,
 the books we describe remain the same.
 Any mechanism of identification
 should be based on domain resources,
 not on inevitably changing technology.
  • 63. The “success” story
 of the Web API community.
 [Chart: number of indexed Web APIs in ProgrammableWeb, 2005 to 2015; values shown: 186, 1,263, 2,418, 5,018, 7,182, 10,302, 12,559.]
  • 64. Just imagine we had
 15,000 different data models.
 [Chart: number of indexed Web APIs in ProgrammableWeb, 2005 to 2015, repeated from the previous slide.]
  • 65. Find resources in your domain
 and assign them an identifier.
 http://bib.org/books/978-1-85604-964-1/
 http://bib.org/authors/7356/
  • 66. It’s just like building a web site.
 When a user comes, serve HTML.
 user (U): GET http://bib.org/books/978-1-85604-964-1/ → HTML
  • 67. It’s just like building a web site.
 When a client comes, serve JSON.
 client (C): GET http://bib.org/books/978-1-85604-964-1/ → JSON
  • 68. It’s just like building a web site.
 When a client comes, serve RDF.
 client (C): GET http://bib.org/books/978-1-85604-964-1/ → RDF
  • 69. Content negotiation has existed
 for a long time in HTTP.
 client (C): GET http://bib.org/books/978-1-85604-964-1/ → RDF
 [Figure labels: the URI identifies the resource; the RDF response is a representation.]
  • 70. This allows constant URIs
 even with future changes.
 client (C): GET http://bib.org/books/978-1-85604-964-1/ → RDF 2.0
  • 71. It enables different users and
 machines to talk about things.
 http://bib.org/books/978-1-85604-964-1/
 [Figure: a user (U) and two clients (C) all refer to the same URI.]
  • 72. The best API is no API.
 Your website is already an API.
 Developers like to build complicated APIs.
 API keys are especially cool to build.
 Every feature and change comes with a high cost.
 If you ask for an API, you’ll get one.
 Ask for new representations
 of your resources instead.
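Slides 66 to 70 describe HTTP content negotiation: one URI, several representations, chosen from the client's `Accept` header. The sketch below shows the selection step only, under simplifying assumptions: the helper names `parse_accept` and `negotiate`, the placeholder response bodies, and the choice of `text/turtle` as the RDF format are all made up for this example, and wildcards other than `*/*` are ignored.

```python
def parse_accept(header):
    """Parse an Accept header into (media type, q) pairs, highest preference first."""
    prefs = []
    for item in header.split(","):
        fields = [field.strip() for field in item.split(";")]
        media_type, q = fields[0].lower(), 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if media_type:
            prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the order of the header.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)

# Placeholder representations of http://bib.org/books/978-1-85604-964-1/
REPRESENTATIONS = {
    "text/html": "HTML page about the book",
    "application/json": "JSON document about the book",
    "text/turtle": "RDF (Turtle) document about the book",
}

def negotiate(accept_header, available=REPRESENTATIONS):
    """Return the media type to serve, or None if nothing acceptable is available."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return next(iter(available))
        if media_type in available:
            return media_type
    return None

print(negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))  # a browser-like header
print(negotiate("text/turtle, text/html;q=0.8"))                     # an RDF client
print(negotiate("application/rdf+xml"))                              # not offered here
```

Adding a future format ("RDF 2.0" on slide 70) then means adding one entry to `REPRESENTATIONS`; the URI that clients already hold does not change.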
  • 76. The Semantic Web promised
 data on the Web.
 LODStats: 85,567,007,302 triples from 3,426 datasets
 LOD Laundromat: 38,606,408,765 from 657,896 entries
  • 77. How much of this data
 can we readily access?
   data dumps
   Linked Data documents
   SPARQL endpoints
  • 78. A data dump means downloading everything and querying locally.
  • 79. A data dump means downloading everything and querying locally. When was the last time
 you downloaded the full Wikipedia
 just because you had one question?
  • 80. Dumps are not Web querying.
 It’s kind of like giving up.
 Semantic Web → Semantic Basement?
 What advantage do we have
 compared to Big Data?
 Still the RDF data model…
 But the major difference is the Web.
  • 81. Linked Data documents
 allow you to traverse a dataset.
  • 82. Linked Data documents
 allow you to traverse a dataset. That’s similar to what we also do:
 consume information on Wikipedia
 by following links.
  • 83. Much Linked Data is available
 using the well-known principles. Servers publish a light-weight interface. Clients follow their nose
 to retrieve information.
  • 84. Linked Data documents allow
 query evaluation on the Web.
 # Other books by the same author
 SELECT DISTINCT ?book WHERE {
   books:85604 dc:creator ?author.
   ?book dc:creator ?author.
 }
  • 85. Some queries are hard
 or impossible to evaluate.
 # Books about Hamburg
 SELECT DISTINCT ?book ?author WHERE {
   ?book dc:subject dbpedia:Hamburg.
   ?book dc:creator ?author.
 }
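A toy model may help show why the first query works by following links and the second does not. In the sketch below, the Web is a dictionary from URI to the triples its server publishes there. All data is invented for illustration, including the expansion of `books:85604` to `http://bib.org/books/85604` and the assumption that the author's document lists that author's books.

```python
CREATOR = "http://purl.org/dc/terms/creator"
SUBJECT = "http://purl.org/dc/terms/subject"
LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

BOOK_A = "http://bib.org/books/85604"   # hypothetical expansion of books:85604
BOOK_B = "http://bib.org/books/91002"   # hypothetical second book
AUTHOR = "http://bib.org/authors/7356/"
HAMBURG = "http://dbpedia.org/resource/Hamburg"

# Each URI maps to the triples that its own server chooses to publish there.
WEB = {
    BOOK_A: [(BOOK_A, CREATOR, AUTHOR), (BOOK_A, SUBJECT, HAMBURG)],
    BOOK_B: [(BOOK_B, CREATOR, AUTHOR)],
    AUTHOR: [(BOOK_A, CREATOR, AUTHOR), (BOOK_B, CREATOR, AUTHOR)],
    # DBpedia describes Hamburg, but knows nothing about bib.org's books.
    HAMBURG: [(HAMBURG, LABEL, "Hamburg")],
}

def dereference(uri):
    """Stand-in for an HTTP GET that returns the triples in a Linked Data document."""
    return WEB.get(uri, [])

def books_by_same_author(book):
    """Follow book -> author -> books, as a link-traversing client would."""
    authors = {o for s, p, o in dereference(book) if s == book and p == CREATOR}
    return sorted({s for author in authors
                   for s, p, o in dereference(author)
                   if p == CREATOR and o == author})

def books_about(topic):
    """Try to find books by looking in the topic's own document."""
    return sorted({s for s, p, o in dereference(topic) if p == SUBJECT and o == topic})

print(books_by_same_author(BOOK_A))
print(books_about(HAMBURG))
```

The second lookup comes back empty: the link from the book to Hamburg lives in bib.org's document, and nothing in the Hamburg document leads back to it.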
  • 86. SPARQL endpoints allow you
 to ask any question you want.
  • 87. SPARQL endpoints allow you
 to ask any question you want. When was the last time
 you expected Wikipedia to answer
 specific questions automatically for you?
  • 88. A public SPARQL endpoint
 happily answers this query.
 # Other books by the same author
 SELECT DISTINCT ?book WHERE {
   books:85604 dc:creator ?author.
   ?book dc:creator ?author.
 }
  • 89. A public SPARQL endpoint also
 happily answers this query.
 # Books about Hamburg
 SELECT DISTINCT ?book ?author WHERE {
   ?book dc:subject dbpedia:Hamburg.
   ?book dc:creator ?author.
 }
  • 90. A public SPARQL endpoint also
 happily answers this query…
 SELECT DISTINCT ?drug ?drug1 ?drug2 ?drug3 ?drug4 ?d1 WHERE {
   ?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antibiotics> .
   ?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antiviralAgents> .
   ?drug3 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/antihypertensiveAgents> .
   ?drug4 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/drugCategory> <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugcategory/anti-bacterialAgents> .
   ?drug1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o1 .
   ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g1 .
   ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l1 .
   ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw1 .
   ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/hprdId> ?hp1 .
   ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/swissprotName> ?sn1 .
   ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/proteinSequence> ?ps1 .
   ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/generalReference> ?gr1 .
   ?drug <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o1 .
   ?drug2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/target> ?o2 .
   ?o1 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/genbankIdGene> ?g2 .
   ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/locus> ?l2 .
   ?o2 <http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugbank/molecularWeight> ?mw2 .
  • 91. There’s a price to pay for being
 the most expressive HTTP interface.
 The majority of public SPARQL endpoints
 have less than 95% uptime.
 This means we cannot query them
 for more than 1.5 days each month.
 This means we cannot rely on them
 to build Linked Data applications.
 Buil-Aranda, Hogan, Umbrich, Vandenbussche:
 “SPARQL Web-Querying Infrastructure: Ready for Action?”
  • 93. The main promise of Linked Data
 is integration, preserving semantics.
 [Figure: an RDF triple, in which a property connects a subject to an object.]
  • 94. Integration is the promise.
 But does it work on the Web?
   data dumps
   Linked Data documents
   SPARQL endpoints
  • 95. With data dumps, we just
 build a bigger basement. How far do we go? How do we keep data up to date?
  • 96. With Linked Data documents,
 we keep on following our nose. There are no dataset boundaries. Some queries will remain hard.
  • 97. With public SPARQL endpoints,
 problems become worse.
 1 endpoint has 95% availability: 1.5 days down each month
 2 endpoints have 90% availability: 3 days down each month
 3 endpoints have 85% availability: 4.5 days down each month
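The arithmetic behind these figures can be checked with a few lines. The sketch assumes a 30-day month, endpoints that fail independently of each other, and a query that needs all of them at once; the function names are made up for this example. Under those assumptions the combined availability is the product of the individual ones, which comes out close to the slide's rounded 90% and 85%.

```python
def combined_availability(per_endpoint, endpoints):
    """Probability that all endpoints are up at once, assuming independent failures."""
    return per_endpoint ** endpoints

def days_down_per_month(availability, days_in_month=30):
    """Expected unavailable time in days for a given availability."""
    return (1 - availability) * days_in_month

for n in (1, 2, 3):
    availability = combined_availability(0.95, n)
    print(f"{n} endpoint(s): {availability:.1%} available, "
          f"{days_down_per_month(availability):.1f} days down each month")
```

The exact products are 0.9025 and 0.857375, so the downtime for two and three endpoints is slightly below the 3 and 4.5 days on the slide, but the trend is the same: every extra endpoint in a query lowers the odds that the query can be answered at all.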
  • 99. Can we think differently
 about Linked Data on the Web?
 [Figure: a spectrum of interfaces, from data dump through Linked Data documents to SPARQL endpoint.]
 data dump end: low server cost, high client cost, high availability, high bandwidth, out-of-date data
 SPARQL endpoint end: high server cost, low client cost, low availability, low bandwidth, live data
  • 100. Can we think differently
 about Linked Data on the Web?
 [Figure: the same spectrum, with question marks in the gaps between data dump, Linked Data documents, and SPARQL endpoint.]
  • 101. Let us combine the lessons on
 changes, constants, and promises.
 An interface that withstands change:
 simple enough so it doesn’t break,
 complex enough to query.
  • 102. Let us combine the lessons on
 changes, constants, and promises.
 Data dumps contain too much.
 SPARQL endpoint results are too specific.
 Linked Data documents are unidirectional.
  • 103. Each interface divides a dataset into Linked Data Fragments.
 Data dumps: 1 huge fragment
 SPARQL endpoints: ∞ specific fragments
 Linked Data: 1 fragment per subject
  • 104. Can we find a new interface
 with a sustainable balance? Triple Pattern Fragments:
 1 fragment per subject / predicate / object
  • 105. Browse a dataset by triple pattern—
 no less, no more.
  • 106. Machines can access
 the exact same interface as RDF.
  • 107. Triple Pattern Fragments extend
 Linked Data documents with forms.
 That’s even more similar to what we do:
 consume information on Wikipedia
 by following links and using forms.
  • 108. Machines solve complex queries
 by breaking them down.
 # Other books by the same author
 SELECT DISTINCT ?book WHERE {
   books:85604 dc:creator ?author.
   ?book dc:creator ?author.
 }
  • 109. Machines solve complex queries
 by breaking them down.
 # Books about Hamburg
 SELECT DISTINCT ?book ?author WHERE {
   ?book dc:subject dbpedia:Hamburg.
   ?book dc:creator ?author.
 }
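A simplified sketch of what "breaking down" can look like: the server only answers single triple patterns, and the client combines the answers itself. Everything here is an assumption made for illustration: the in-memory `DATASET`, the invented book and author identifiers other than `books:85604`, and the naive left-to-right evaluation order. Real Triple Pattern Fragments clients also use paging and count metadata, which this sketch leaves out.

```python
DATASET = [
    ("books:85604", "dc:creator", "authors:7356"),
    ("books:91002", "dc:creator", "authors:7356"),
    ("books:91002", "dc:subject", "dbpedia:Hamburg"),
    ("books:77310", "dc:creator", "authors:1201"),
    ("books:77310", "dc:subject", "dbpedia:Hamburg"),
]

requests_sent = 0

def fragment(s=None, p=None, o=None):
    """Server side: return every triple matching one pattern (None is a wildcard)."""
    global requests_sent
    requests_sent += 1
    return [t for t in DATASET
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def is_var(term):
    return term.startswith("?")

def solve(patterns, binding=None):
    """Client side: evaluate triple patterns one by one, carrying bindings along."""
    binding = binding or {}
    if not patterns:
        return [binding]
    first, rest = patterns[0], patterns[1:]
    bound = [binding.get(term, term) for term in first]   # fill in known variables
    lookup = [None if is_var(term) else term for term in bound]
    solutions = []
    for triple in fragment(*lookup):
        extended = dict(binding)
        for term, value in zip(bound, triple):
            if is_var(term) and extended.setdefault(term, value) != value:
                break  # the same variable would need two different values
        else:
            solutions.extend(solve(rest, extended))
    return solutions

same_author = solve([("books:85604", "dc:creator", "?author"),
                     ("?book", "dc:creator", "?author")])
about_hamburg = solve([("?book", "dc:subject", "dbpedia:Hamburg"),
                       ("?book", "dc:creator", "?author")])

print(sorted({b["?book"] for b in same_author}))
print(sorted({(b["?book"], b["?author"]) for b in about_hamburg}))
print(requests_sent, "requests sent")
```

Both queries from slides 108 and 109 are answered with nothing but single-pattern requests. The cost is the number of requests, which grows with the number of intermediate results.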
  • 110. Promises can be kept, because
 the interface is intelligently light.
 Publishing Linked Data
 that can be queried on the Web
 is realistic because the workload is divided.
 The server doesn’t even need a triplestore.
 Since the client is in charge,
 querying multiple sources is easy.
  • 111. Promises are negotiated contracts so they always involve trade-offs.
 Querying will be slower:
   clients send many requests to answer a query
 Query times are more consistent:
   0.3 secs with a SPARQL endpoint… 95% of time
   3 secs with Triple Pattern Fragments… 99.9% of time
 Experiment with more complex interfaces.
  • 112. Make your Linked Data
 queryable on the Web.
 Several open-source implementations:
 linkeddatafragments.org/software/
 Query one or multiple sources online:
 client.linkeddatafragments.org
 Example: bit.ly/harvard-hamburg
  • 114. Identify the constants,
 separate them from changes. Satisfy Linked Data needs
 with promises you can keep.
  • 115. Simple enough
 to be usable, complex enough
 to be useful.
  • 116. Sustainability means
 promising the simplest
 useful complexity.