3. what about IRIs and RDF
a new way to publish data on the web
ids are ambiguous and suck!
Use URIs
as names for things
Use HTTP URIs
so that people can look up those names
Use the standards (RDF, SPARQL)
providing useful information
Include links to other URIs
so that they can discover more things
linked data principles
Tim Berners-Lee
July 27, 2006
4. The Children and Families Act 2014
http://www.legislation.gov.uk/id/uksi/2014/2270
what about IRIs and RDF
turning documents into data
ids are ambiguous and suck!
5. A new way to design
databases
RDF
(aka ’define knowledge’)
6. Go Triples, go!
the standard (old) approach
ID_P COGNOME NOME REF_ID_SOCIETA GENERE
1 Camarda Diego 1 maschio
2 … … … …
ID_SOCIETA DENOMINAZIONE SITO
1 Regesta.exe srl www.regesta.com
7. Go Triples, go!
the new (cool) approach
<http://www.regesta.com/diego>
Subject
8. Go Triples, go!
the new (cool) approach
<http://www.regesta.com/diego>
<http://xmlns.com/foaf/0.1/familyName>
Subject
Predicate
9. Go Triples, go!
the new (cool) approach
<http://www.regesta.com/diego>
<http://xmlns.com/foaf/0.1/familyName>
"Camarda" .
Subject
Predicate
Object
10. Go Triples, go!
the new (cool) approach
<http://www.regesta.com/diego>
<http://xmlns.com/foaf/0.1/familyName> "Camarda" .
<http://www.regesta.com/diego>
<http://xmlns.com/foaf/0.1/firstName> "Diego" .
<http://www.regesta.com/diego>
<http://xmlns.com/foaf/0.1/gender> "male" .
11. Go Triples, go!
the new (cool) approach
<http://www.regesta.com/diego>
<http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
<http://xmlns.com/foaf/0.1/firstName> "Diego" ;
<http://xmlns.com/foaf/0.1/gender> "male" .
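The step from the table row on slide 6 to the triples above can be sketched in a few lines of Python. This is only an illustration: the column-to-predicate mapping and the `row_to_ntriples` helper are made up for the example (only the subject URI and the three FOAF properties come from the slides), and the output is the one-triple-per-line form of slide 10.

```python
# Hypothetical mapping from the legacy column names to FOAF predicates.
FOAF = "http://xmlns.com/foaf/0.1/"
COLUMN_TO_PREDICATE = {
    "COGNOME": FOAF + "familyName",
    "NOME": FOAF + "firstName",
    "GENERE": FOAF + "gender",
}


def escape_literal(value):
    """Escape backslashes and double quotes for a quoted string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def row_to_ntriples(subject_uri, row):
    """Return one triple line per mapped column; unmapped columns are skipped."""
    lines = []
    for column, predicate in COLUMN_TO_PREDICATE.items():
        if column in row:
            literal = escape_literal(row[column])
            lines.append('<%s> <%s> "%s" .' % (subject_uri, predicate, literal))
    return lines


# The row from the "old approach" table, with the gender value translated.
row = {"ID_P": "1", "COGNOME": "Camarda", "NOME": "Diego", "GENERE": "male"}
triples = row_to_ntriples("http://www.regesta.com/diego", row)
print("\n".join(triples))
```

The numeric key (ID_P) disappears on purpose: the URI takes over the job of identifying the record.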
31. The Resource Description Framework
is a general-purpose language for representing
information in the Web.
It's time for a new standard
RDF
32. The SPARQL Protocol and RDF Query Language
is a query language and protocol for RDF.
It's time for a new standard
SPARQL
33. On the Semantic Web, vocabularies define
the concepts and relationships
(also referred to as “terms”)
used to describe and represent
an area of concern.
It's time for a new standard
Ontologies
34. PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
foaf:firstName
dc:title
rdfs:label
Pre:fixes (ontologies)
just a few words
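A prefix is just shorthand: the prefixed name is replaced by the namespace IRI plus the local part. A minimal sketch of that expansion, using the three prefixes declared above (the `expand` helper is a hypothetical name, not part of any library):

```python
# The three namespaces declared on the slide.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(prefixed_name):
    """Turn a prefixed name such as 'dc:title' into a full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError("unknown prefix: " + prefix)
    return PREFIXES[prefix] + local


for name in ("foaf:firstName", "dc:title", "rdfs:label"):
    print(name, "->", expand(name))
```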
37. Resource Description
Framework
› SPARQL endpoint
› dereferenceable URIs
› content negotiation
› standard port, such as 80 (HTTP)
› JSONP support
› up-to-date
› the endpoint URL is easy to deduce from resources
› the resources are described by dc:title or rdfs:label
› the endpoint hosts a page for humans
› the resources and the endpoint are on the same domain
SHOULD!
(please do it, for me)
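When the resources and the endpoint share a domain, a client can guess the endpoint from any resource URI. A rough sketch of that guess, assuming the endpoint lives at the `/sparql` path (a convention, not a rule, and `guess_endpoint` is a hypothetical helper):

```python
from urllib.parse import urlsplit, urlunsplit


def guess_endpoint(resource_uri, path="/sparql"):
    """Keep the scheme and host of the resource, swap the path for the endpoint path."""
    parts = urlsplit(resource_uri)
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(guess_endpoint("http://www.regesta.com/diego"))
```

If the guess fails, the human-readable endpoint page is the fallback, which is why that item is on the list too.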
46. DISTINCT, COUNT
GRAPH, PREFIX
isBlank, isIRI, isLiteral, isNumeric
FILTER, REGEX, STR
FILTER NOT EXISTS, MINUS
ORDER BY, OFFSET, LIMIT
for other stuff
http://www.w3.org/TR/sparql11-query/
SPARQL
minimum requirements
47. Please start negotiating content
right now!
Hi dude, I accept:
text/html,application/xhtml+xml
HTML page
Great! I’ll serve you a web page
Hi dude, I accept:
application/rdf+xml
RDF data
Great… 302, redirect!
Hi dude, I accept:
pizza/margherita
406 error
mmm… sorry
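The three conversations above boil down to one decision on the Accept header. A simplified sketch of the server side, with some loud assumptions: the `negotiate` function is hypothetical, the 200 status for the HTML case is my guess (the slide only says a page is served), and quality values and wildcards such as `*/*` are ignored.

```python
def negotiate(accept_header):
    """Return (status, action) for a request, mirroring the three cases on the slide."""
    # Drop parameters such as ";q=0.9" and normalise case and spacing.
    accepted = [item.split(";")[0].strip().lower() for item in accept_header.split(",")]
    for media_type in accepted:
        if media_type in ("text/html", "application/xhtml+xml"):
            return 200, "serve the web page"
        if media_type == "application/rdf+xml":
            return 302, "redirect to the RDF data"
    # Nothing the server can produce was requested.
    return 406, "not acceptable"


for header in ("text/html,application/xhtml+xml", "application/rdf+xml", "pizza/margherita"):
    print(header, "->", negotiate(header))
```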
52. It’s slow
so keep calm
RDF will probably transform data into big data
usually: 1 record, 15 triples
e.g. Chamber of Deputies: 2,949,771 votes, 64,948,856 triples
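A quick back-of-the-envelope check of those numbers, as a sketch: the figures are the ones on the slide, while the `estimate_triples` helper and the one-million-record example are made up for illustration.

```python
# Figures from the slide (Chamber of Deputies votes).
votes = 2949771
triples = 64948856

# Triples actually produced per vote record.
ratio = triples / votes
print(round(ratio, 1))


def estimate_triples(records, triples_per_record=15):
    """Rough size estimate using the 'usually 15 triples per record' rule of thumb."""
    return records * triples_per_record


print(estimate_triples(1000000))
```

The votes dataset comes out at roughly 22 triples per record, a bit above the usual 15.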
53. Virtuoso
Sesame
Fuseki (Jena)
Owlim / Bigdata (Sesame)
AllegroGraph
D2R server
ARC2
…
Triplestores
I just need a SPARQL endpoint
I just really need http://yourdomain/sparql
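Why a plain URL is enough: the SPARQL protocol lets a client send the query as the `query` parameter of an HTTP GET. A minimal sketch using only the Python standard library; `http://yourdomain/sparql` is the placeholder from the slide, so the request is built but not sent, and `build_sparql_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Build a GET request that carries the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialisation of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(
    "http://yourdomain/sparql",
    "select distinct ?o where {?s a ?o} limit 10",
)
print(request.full_url)
```

Against a real endpoint, passing the request to `urllib.request.urlopen` would return the result document.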
55. select distinct ?o where {?s a ?o}
select ?o (count(distinct ?s) as ?n) where {?s a ?o} group by ?o
select (count(?s) as ?n) where {?s ?p ?o}
select (count(?s) as ?n) ?class where {?s ?p ?o; a ?class} group by ?class
select distinct ?p where {?s a <http://classe>; ?p ?o}
select ?p (count(?p) as ?n) where {?s a <http://classe>; ?p ?o} group by ?p
select ?s where {?s a <http://classe>}
select ?p ?o where {<http://URI> ?p ?o}
select distinct ?s ?title where {?s a <http://classe>;
dc:title ?title. FILTER(REGEX(?title,'parola','i'))} LIMIT 100
SPARQL magic
a query for all seasons
58. All Bills filtered by year
SELECT DISTINCT * {?bill a ocd:atto; dc:title ?title; dc:date ?date .
FILTER(regex(?date,'^2014'))} ORDER BY ?date
Last voted Bills
SELECT distinct * WHERE {
?bill a ocd:atto; dc:title ?title.
?votazione a ocd:votazione; ocd:rif_attoCamera ?bill; dc:date ?data; dc:title ?denominazione;
dc:description ?descrizione; ocd:votanti ?votanti; ocd:votazioneFinale 1; ocd:favorevoli
?favorevoli; ocd:contrari ?contrari; ocd:astenuti ?astenuti;
ocd:rif_leg <http://dati.camera.it/ocd/legislatura.rdf/repubblica_17>}
ORDER BY DESC(?data)
Example queries
Chamber of Deputies
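Queries like these come back as a table of bindings. A sketch of reading the standard SPARQL JSON results format in Python: the sample document below is invented for the example (the `example.org` URI and the title are not real data) and `rows_from_results` is a hypothetical helper.

```python
import json

# Invented sample in the SPARQL 1.1 JSON results layout.
SAMPLE = """
{
  "head": {"vars": ["bill", "title"]},
  "results": {"bindings": [
    {"bill": {"type": "uri", "value": "http://example.org/bill/1"},
     "title": {"type": "literal", "value": "Example bill"}}
  ]}
}
"""


def rows_from_results(document):
    """Flatten the bindings into plain dicts, with None for unbound variables."""
    data = json.loads(document)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        rows.append({
            var: (binding[var]["value"] if var in binding else None)
            for var in variables
        })
    return rows


rows = rows_from_results(SAMPLE)
print(rows)
```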
59. All Bills filtered by year
PREFIX osr: <http://dati.senato.it/osr/>
SELECT DISTINCT * {?bill a osr:Ddl; osr:titolo ?title; osr:dataPresentazione ?date .
FILTER(regex(STR(?date),'^2014'))} ORDER BY ASC(?date)
Last approved Bills
PREFIX osr: <http://dati.senato.it/osr/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT DISTINCT ?ddl ?titolo ?titoloBreve ?natura ?stato ?dataApprovato
WHERE { ?ddl a osr:Ddl. ?ddl osr:statoDdl ?stato.
?ddl osr:ramo "S"^^<http://www.w3.org/2001/XMLSchema#string>.
?ddl osr:dataPresentazione ?dataPresentazione. ?ddl osr:titolo ?titolo.
OPTIONAL { ?ddl osr:titoloBreve ?titoloBreve }. ?ddl osr:natura ?natura.
?ddl osr:dataStatoDdl ?dataApprovato. ?ddl osr:testoApprovato ?testoApprovato
FILTER(xsd:date(str(?dataApprovato)) <= xsd:date(str("2014-12-31")))
FILTER(xsd:date(str(?dataApprovato)) >= xsd:date(str("2014-01-01")))
} ORDER BY ?dataApprovato
Example queries
Senate of the Republic
62. All ‘Works’ filtered by year
SELECT ?work ?date ?title {?work a frbr:Work . ?work dct:title ?title . ?work dct:created ?date .
FILTER (REGEX(STR(?date),'^2014')) } ORDER BY desc(?date)
Top subjects by year
SELECT (count(?sub) as ?tot) ?sub { ?work a frbr:Work . ?work dct:subject ?sub . ?work
dct:created ?date . FILTER (REGEX(STR(?date),'^2014')) } GROUP BY ?sub ORDER BY desc(?tot)
LIMIT 100
Example queries
64. W3C standards
http://www.w3.org/standards/semanticweb/
OKFN endpoints status (and list)
http://sparqles.okfn.org
LodLive (a SPARQL navigator)
http://en.lodlive.it
a very good intro to RDF
https://github.com/JoshData/rdfabout/blob/gh-pages/intro-to-rdf.md
Tim Berners-Lee’s “Linked Data – 5 stars ranking”
http://www.w3.org/DesignIssues/LinkedData.html
My GitHub page
http://github.com/dvcama
My email
mailto:diego.camarda@regesta.com