1. Semantic Web Languages
Rinke Hoekstra
r.j.hoekstra@vu.nl
Universiteit van Amsterdam
Vrije Universiteit Amsterdam
2. Overview
• The Basics
• Resource Description Framework (RDF)
• RDF Vocabulary Description Language (RDF Schema)
• Simple Knowledge Organization System (SKOS)
• SPARQL Query Language for RDF
3. Linked Data
[Figure: the Linking Open Data cloud diagram, showing datasets and the links between them, colour-coded by domain: Media, Geographic, Publications, User-generated content, Government, Cross-domain, Life sciences]
As of September 2011
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
4. The Basics
• Layer Knowledge Representation technology
• on top of standard Web technology
• Globally unique identifiers
• Identifiers follow the HTTP URI syntax (RFC 3986)
• They identify web resources
• Identifiers may be used as locators (URL) to retrieve a
representation of the resource via HTTP
• Identifiers can be abbreviated using namespace prefixes
6. (Namespaces)
• A namespace is a set of “names” for resources that
• have a “meaningful” overlap in their URIs, e.g.:
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
• They can be abbreviated using a prefix, e.g.:
rdf:type
rdf:Property
• Hash (#) namespaces and slash (/) namespaces
• The default namespace has no prefix
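The prefix mechanism above can be sketched in a few lines of Python. This is only an illustration: the prefix table is an assumption (the rdf namespace is the one shown on the slide; the foaf and default namespaces are borrowed from the later SPARQL examples), and the function names are hypothetical.

```python
# Hypothetical prefix table. The empty string stands for the default namespace.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",  # hash (#) namespace
    "foaf": "http://xmlns.com/foaf/0.1/",                  # slash (/) namespace
    "": "http://www.siks.nl/swcourse/example/",            # default namespace
}


def expand(qname, prefixes=PREFIXES):
    """Expand an abbreviated name such as 'rdf:type' into a full URI."""
    prefix, _, local = qname.partition(":")
    return prefixes[prefix] + local


def abbreviate(uri, prefixes=PREFIXES):
    """Abbreviate a full URI with the longest matching namespace, if any."""
    matches = [p for p in prefixes if uri.startswith(prefixes[p])]
    if not matches:
        return uri  # no known namespace: leave the URI as it is
    best = max(matches, key=lambda p: len(prefixes[p]))
    return best + ":" + uri[len(prefixes[best]):]
```

Here `expand("rdf:type")` yields the first URI shown on the slide, and `abbreviate` reverses the step.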
7. HTML vs RDF
• HTML: HTML pages link to HTML pages
http://foo.bar/page1 links to http://foo.bar/page2:
<a href="http://foo.bar/page2">foo</a>
• RDF: RDF resources link to RDF resources
http://foo.bar/resource1, http://foo.bar/resource2, http://foo.bar/resource3
(NB: clients retrieve an RDF document that describes the resource)
8. RDF
• The Resource Description Framework (1999)
• Data model is a directed labeled graph
• Formal semantics for reliable rules of inference
... but open world assumption (OWA)
[Figure: a subject node connected to an object node by an arc labelled predicate]
• Every arc is a statement in the language, where
• an edge is a predicate, and has the type rdf:Property
• the connected nodes are subject and object
• RDF graphs are serialised as collections of triples
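As a rough illustration of the data model, a graph can be held as a plain set of (subject, predicate, object) tuples. The sketch below assumes abbreviated names as strings and uses example data modelled on the later SPARQL slides; the helper names are hypothetical, and the output format only resembles a real RDF serialisation.

```python
# A graph is a set of triples; each triple is one labelled arc.
GRAPH = {
    (":laura", "foaf:knows", ":rinke"),
    (":rinke", "rdf:type", "foaf:Person"),
    (":rinke", "foaf:name", '"Rinke Hoekstra"'),
}


def outgoing(graph, node):
    """Return the (predicate, object) arcs that leave a node, sorted."""
    return sorted((p, o) for s, p, o in graph if s == node)


def serialise(graph):
    """Write the graph as one 'subject predicate object .' line per triple."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(graph))
```

Because the graph is a set, asserting the same triple twice changes nothing, which matches the idea that a graph is a collection of statements rather than an ordered list.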
10. Language Elements
• Most resources have a URI as identifier, but
• blank nodes only have a name local to the graph
• literal values (e.g. strings) are their own identifiers
• Collections are lists for which all members are known
• Containers are lists for which not all members are known
• Statements are reifications of triples
[Figure: a statement node linked to the subject, predicate and object of a triple via rdf:subject, rdf:predicate and rdf:object]
• Reserved words (see RDF Schema):
rdf:type, rdf:Property, rdf:List, rdf:Bag, rdf:Seq, rdf:Alt,
rdf:Statement, rdf:subject, rdf:predicate, rdf:object, rdf:value
16. RDF Schema
• RDF Vocabulary Description Language
• More inference (whooh!)
• Represent classes and subclasses
• Represent subproperties, domain and range
• More reserved words for RDF
rdfs:Resource, rdfs:Class, rdfs:Literal,
rdfs:subClassOf, rdfs:subPropertyOf, rdfs:domain,
rdfs:range, rdfs:label, rdfs:comment
17. RDF Schema
[Figure: the RDF Schema vocabulary drawn as a graph, relating rdfs:Resource, rdfs:Class, rdfs:Literal, rdfs:Datatype, rdf:Property, rdfs:label and rdfs:comment through rdf:type, rdfs:subClassOf, rdfs:subPropertyOf, rdfs:domain and rdfs:range]
18. Classes
• Subsumption hierarchies in RDF Schema
• Every resource of type rdfs:Class is a set of
resources
• Every rdfs:subClassOf of such a class is a subset of
those resources
• Every resource that has the class as its rdf:type is
also an instance of its superclasses.
19. Properties
• Every resource of type rdf:Property can be used to link pairs of
resources
• Every rdfs:subPropertyOf of such a property links a subset of those
pairs
• The subject-object pair of every asserted triple that uses a property as
predicate is also a member of the set of pairs of its superproperties.
• If a property has a specified class as rdfs:range, all objects in triples
that have the property as predicate are members of that class.
• If a property has a specified class as rdfs:domain, all subjects in
triples that have the property as predicate are members of that class.
20. Semantics
(premises ⇒ conclusion)
• x p y ⇒ p rdf:type rdf:Property
• x p l, where l is a plain literal ⇒ l rdf:type rdfs:Literal
• p rdfs:domain x, y p z ⇒ y rdf:type x
• p rdfs:range x, y p z ⇒ z rdf:type x
• p1 rdfs:subPropertyOf p2, p2 rdfs:subPropertyOf p3 ⇒ p1 rdfs:subPropertyOf p3
• p rdf:type rdf:Property ⇒ p rdfs:subPropertyOf p
• p1 rdfs:subPropertyOf p2, x p1 y ⇒ x p2 y
• x rdf:type rdfs:Class ⇒ x rdfs:subClassOf rdfs:Resource
• x rdf:type rdfs:Class ⇒ x rdfs:subClassOf x
• x rdfs:subClassOf y, z rdf:type x ⇒ z rdf:type y
• x rdfs:subClassOf y, y rdfs:subClassOf z ⇒ x rdfs:subClassOf z
• x rdf:type rdfs:Datatype ⇒ x rdfs:subClassOf rdfs:Literal
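To make the rules concrete, here is a minimal forward-chaining sketch in Python. It covers only the domain, range, subproperty and subclass rules, treats literals like any other object, and repeats until nothing new is derived. The property :knowsWell and the example data are hypothetical; this is an illustration, not a complete RDFS reasoner.

```python
def rdfs_closure(triples):
    """Apply a subset of the RDFS rules until no new triples appear."""
    graph = set(triples)
    while True:
        new = set()
        for s, p, o in graph:
            if p == "rdfs:subClassOf":
                # s subClassOf o, o subClassOf c  =>  s subClassOf c
                new |= {(s, p, c) for b, q, c in graph if q == p and b == o}
                # s subClassOf o, z type s  =>  z type o
                new |= {(z, "rdf:type", o) for z, q, x in graph
                        if q == "rdf:type" and x == s}
            elif p == "rdfs:subPropertyOf":
                # s subPropertyOf o, o subPropertyOf c  =>  s subPropertyOf c
                new |= {(s, p, c) for b, q, c in graph if q == p and b == o}
                # s subPropertyOf o, x s y  =>  x o y
                new |= {(x, o, y) for x, q, y in graph if q == s}
            elif p == "rdfs:domain":
                # s domain o, x s y  =>  x type o
                new |= {(x, "rdf:type", o) for x, q, y in graph if q == s}
            elif p == "rdfs:range":
                # s range o, x s y  =>  y type o
                new |= {(y, "rdf:type", o) for x, q, y in graph if q == s}
        if new <= graph:
            return graph
        graph |= new


DATA = {
    (":knowsWell", "rdfs:subPropertyOf", "foaf:knows"),  # hypothetical property
    ("foaf:knows", "rdfs:domain", "foaf:Person"),
    ("foaf:knows", "rdfs:range", "foaf:Person"),
    ("foaf:Person", "rdfs:subClassOf", "foaf:Agent"),
    (":laura", ":knowsWell", ":rinke"),
}
INFERRED = rdfs_closure(DATA)
```

From a single asserted triple about :laura and :rinke, the closure derives that both are of type foaf:Person and foaf:Agent. Because of the open world assumption, the absence of a triple from the closure does not make it false.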
23. SKOS
• Simple Knowledge Organization System
• Origins in library science (“KOS”)
• Classification and taxonomy
• Not always useful to think of classes and instances
• Concept description language
• No formal semantics, other than RDF & RDFS
24. SKOS
• Main notion is skos:Concept, the class of all concepts
• Concepts are grouped in skos:ConceptSchemes
• Intransitive narrower and broader relations:
skos:broader and skos:narrower (+ transitive super properties)
• Property for relating concepts:
skos:related
• Properties for labeling concepts:
skos:prefLabel, skos:altLabel
• Properties for matches between concepts in different schemes:
skos:closeMatch, skos:exactMatch, skos:relatedMatch,
skos:broadMatch, skos:narrowMatch
25. SPARQL
• RDF is often stored in a database (“Triple Store”)
• Standard RDF Query Language
• SPARQL 1.1 is on its way
• Standard RDF Query Protocol (“SPARQL Endpoint”)
• How to send a query over HTTP?
• How to respond over HTTP?
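A sketch of the request side of the protocol, using only the Python standard library: the query travels as the `query` parameter of an HTTP GET, and the Accept header asks for the XML results format. The endpoint URL is a placeholder assumption, and the request is built but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL


def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Content negotiation: ask for the SPARQL XML results format.
    return Request(url, headers={"Accept": "application/sparql-results+xml"})


QUERY = "SELECT ?person WHERE { ?person ?p ?o } LIMIT 5"
REQUEST = build_request(QUERY)
```

Passing `REQUEST` to `urllib.request.urlopen` would send it; the response body is then a result document in the requested format.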
26. SPARQL Syntax
• A select-from-where inspired syntax (like SQL)
• Select the resources (variables) you want to return:
SELECT ?person
• From the named RDF graph:
FROM <http://www.siks.nl/swcourse/example>
• Where the pattern matches the RDF graph:
WHERE {?person :age "34" .}
• Including additional constraints on objects, using operators:
WHERE {?person :age ?age . FILTER(?age > 30) }
27. SPARQL Syntax
• The complete query:
PREFIX : <http://www.siks.nl/swcourse/example>
SELECT ?person
FROM <http://www.siks.nl/swcourse/example>
WHERE {
  ?person :age ?age .
  FILTER(?age > 30)
}
28. Graph Patterns
• WHERE clause specifies graph pattern
• pattern should match
• pattern can match more than once
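• Graph pattern:
• an RDF graph
• with some nodes or edges as variables
[Figure: example graph pattern with variable nodes ?x and ?y, a variable edge ?p, the edges rdf:type and foaf:knows, and the constants foaf:Person and "Rinke Hoekstra"]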
29. Triple Patterns
• Triples with one or more variables
• Multiple triple patterns per graph pattern
• Turtle syntax, e.g.:
?x rdf:type foaf:Person
?x foaf:knows ?y
?x foaf:name "Rinke Hoekstra"
?x ?p ?y
[Figure: the example graph pattern with foaf:Person and "Rinke Hoekstra", and the data triple :laura foaf:knows :rinke]
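How a set of triple patterns is matched against a graph can be sketched as follows. This is a naive illustration, assuming terms are strings and that variables are exactly the terms starting with "?"; the example graph is hypothetical data modelled on the slides, and real triple stores use indexes rather than nested loops.

```python
def match_triple(pattern, triple, binding):
    """Extend a binding so the pattern matches the triple, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable matches anything, but must stay consistent.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None  # a constant must match exactly
    return extended


def match_graph_pattern(patterns, graph):
    """Return every binding under which all triple patterns match the graph."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in sorted(graph):
                extended = match_triple(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


GRAPH = {
    (":rinke", "rdf:type", "foaf:Person"),
    (":rinke", "foaf:name", '"Rinke Hoekstra"'),
    (":laura", "rdf:type", "foaf:Person"),
    (":laura", "foaf:name", '"Laura Hollink"'),
    (":laura", "foaf:knows", ":rinke"),
}
PATTERNS = [("?x", "rdf:type", "foaf:Person"), ("?x", "foaf:knows", "?y")]
```

The pattern can match more than once: the single pattern `("?x", "?p", "?y")` returns one solution per triple in the graph.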
31. Alternative Graphs
• Use UNION to define a pattern with multiple graphs,
• at least one should match.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX : <http://www.siks.nl/swcourse/example/>
SELECT ?person
FROM <http://www.siks.nl/swcourse/example/>
WHERE {
  { ?person foaf:knows :rinke . }
  UNION
  { ?person foaf:knows :laura . }
}
[Figure: example RDF graph with the nodes :willem ("Willem van Hage"), :laura ("Laura Hollink") and :rinke ("Rinke Hoekstra"), the edges rdf:type, foaf:name, foaf:knows and :friends, and a blank node _:bn01 of rdf:type rdf:Bag with members rdf:_1 and rdf:_2]
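In terms of results, UNION simply combines the solutions of its alternatives. A minimal sketch, assuming each alternative has already been evaluated to a list of variable bindings; the foaf:knows edges used here are hypothetical example data.

```python
# Hypothetical foaf:knows edges as (subject, object) pairs.
KNOWS = {(":laura", ":rinke"), (":willem", ":laura"), (":willem", ":rinke")}


def who_knows(target):
    """Solutions of the pattern { ?person foaf:knows <target> }."""
    return [{"?person": s} for s, o in sorted(KNOWS) if o == target]


def union(left, right):
    """UNION keeps the solutions of both alternatives, duplicates included."""
    return left + right


RESULT = union(who_knows(":rinke"), who_knows(":laura"))
```

Here :willem appears twice in `RESULT` because he matches both alternatives; a DISTINCT modifier would be needed to collapse the duplicates.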
32. Optional Graphs
• RDF is semi-structured
• Even if the schema says some object can have a
particular property, it may not always be present in
the data.
• Use OPTIONAL for parts of the graph that need not
match
[Figure: example graph with :rinke foaf:knows :laura, :laura foaf:name "Laura Hollink" and :rinke foaf:name "Rinke Hoekstra"]
33. Optional Graphs
• Example using OPTIONAL:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX : <http://www.siks.nl/swcourse/example/>
SELECT ?person ?name ?friend
FROM <http://www.siks.nl/swcourse/example/>
WHERE {
  ?person a foaf:Person ;
          foaf:name ?name .
  OPTIONAL { ?person foaf:knows ?friend . }
}
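OPTIONAL behaves like a left join over solutions: every solution of the required part is kept, and it is extended only where the optional part also matches. The sketch below assumes both parts have already been evaluated to lists of bindings; the data follows the example graph, in which only :rinke has a foaf:knows edge.

```python
def compatible(a, b):
    """Two bindings are compatible if they agree on every shared variable."""
    return all(b[var] == value for var, value in a.items() if var in b)


def left_join(required, optional):
    """Keep every required solution, extended by optional ones where possible."""
    results = []
    for r in required:
        extended = [{**r, **o} for o in optional if compatible(r, o)]
        results.extend(extended or [r])  # no match: keep r unchanged
    return results


PEOPLE = [
    {"?person": ":rinke", "?name": "Rinke Hoekstra"},
    {"?person": ":laura", "?name": "Laura Hollink"},
]
FRIENDS = [{"?person": ":rinke", "?friend": ":laura"}]
ROWS = left_join(PEOPLE, FRIENDS)
```

In `ROWS`, the solution for :laura has no `?friend` entry at all: the variable is left unbound rather than set to some null value.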
34. Testing Values
• The FILTER expression is evaluated for every solution that
matches the graph pattern; only solutions for which it holds are kept.
• RDF model related operators
isLiteral(?node), isURI(?node), str(?resource)
• Comparison operators
?x <= ?y, ?z < 20, ?z = ?y, etc.
• Arithmetic operators
?x + ?y, etc.
• String matching using regular expressions
REGEX(?x, "hoekstra", "i") matches "Rinke Hoekstra"
35. Testing Values
• Checking whether a variable is bound
bound(?x)
• Checking whether a pattern exists (SPARQL 1.1)
NOT EXISTS and EXISTS
• Boolean combinations of these test expressions
&& (and), || (or), ! (not)
36. Testing Values
• Example using REGEX and bound:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX : <http://www.siks.nl/swcourse/example/>
SELECT ?person ?name ?friend
FROM <http://www.siks.nl/swcourse/example/>
WHERE {
  ?person a foaf:Person ;
          foaf:name ?name .
  OPTIONAL { ?person foaf:knows ?friend . }
  FILTER ( REGEX(?name, "hoekstra", "i") && !bound(?friend) )
}
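The effect of a FILTER that combines REGEX with !bound can be sketched over a list of solutions. The solutions are assumed to come from the pattern with its OPTIONAL part already evaluated; the person :anna is hypothetical and added only so that one row passes the test, and the helper names mirror the SPARQL functions but are plain Python.

```python
import re

ROWS = [
    {"?person": ":rinke", "?name": "Rinke Hoekstra", "?friend": ":laura"},
    {"?person": ":laura", "?name": "Laura Hollink"},
    {"?person": ":anna", "?name": "Anna Hoekstra"},  # hypothetical person
]


def regex(value, pattern, flags=""):
    """Rough analogue of REGEX: search, case-insensitive if flags has 'i'."""
    return re.search(pattern, value, re.I if "i" in flags else 0) is not None


def bound(row, variable):
    """Analogue of bound(?x): the variable has a value in this solution."""
    return variable in row


KEPT = [
    row for row in ROWS
    if regex(row["?name"], "hoekstra", "i") and not bound(row, "?friend")
]
```

Only the row for :anna survives: :rinke matches the name test but has a bound `?friend`, and :laura fails the name test.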
37. Testing Values
• Example using NOT EXISTS (SPARQL 1.1):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX : <http://www.siks.nl/swcourse/example/>
SELECT ?person ?name ?friend
FROM <http://www.siks.nl/swcourse/example/>
WHERE {
  ?person a foaf:Person ;
          foaf:name ?name .
  FILTER NOT EXISTS { ?person foaf:knows ?friend }
  FILTER ( REGEX(?name, "hoekstra", "i") )
}
38. Paths and Assignment
• Property paths
?x foaf:knows/foaf:knows/foaf:name ?y
?x foaf:knows{2}/foaf:name ?y (draft syntax, dropped from the final SPARQL 1.1 Recommendation)
?x foaf:knows*/foaf:name ?y
?x foaf:knows/^foaf:knows ?y
(NB: check that ?x != ?y, using a FILTER)
• Assign values to a variable using BIND:
BIND (?today - ?birthdate AS ?age)
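What the path operators compute can be sketched with simple set operations over a hypothetical set of foaf:knows edges: a sequence (/) chains steps, an inverse (^) follows an edge backwards, and * takes zero or more steps. The function names are illustrative only.

```python
# Hypothetical foaf:knows edges as (subject, object) pairs.
KNOWS = {(":willem", ":laura"), (":laura", ":rinke")}


def forward(nodes):
    """One foaf:knows step from any of the given nodes."""
    return {o for s, o in KNOWS if s in nodes}


def backward(nodes):
    """One ^foaf:knows step: follow the edge in reverse."""
    return {s for s, o in KNOWS if o in nodes}


def knows_star(start):
    """foaf:knows*: every node reachable in zero or more steps."""
    reached = {start}
    frontier = {start}
    while frontier:
        frontier = forward(frontier) - reached
        reached |= frontier
    return reached
```

Note that `backward(forward({":willem"}))` contains :willem himself, which is why the foaf:knows/^foaf:knows pattern needs the ?x != ?y check mentioned on the slide.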
39. Aggregate Functions
• Compute and assign values to variables in SELECT
clause
COUNT, SUM, MIN, MAX, AVG, GROUP_CONCAT,
and SAMPLE
SELECT (SUM(?price) AS ?totalPrice) WHERE {...
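A sketch of how an aggregate such as SUM interacts with grouping, assuming the graph pattern has already produced a list of solutions. The ?shop variable and the prices are hypothetical example data, not taken from the slides.

```python
from collections import defaultdict

# Hypothetical solutions of some graph pattern.
ROWS = [
    {"?shop": ":a", "?price": 10},
    {"?shop": ":a", "?price": 5},
    {"?shop": ":b", "?price": 7},
]


def sum_by(rows, group_var, value_var, out_var):
    """Analogue of SELECT ?g (SUM(?v) AS ?out) ... GROUP BY ?g."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_var]] += row[value_var]
    return [{group_var: g, out_var: t} for g, t in sorted(totals.items())]


TOTALS = sum_by(ROWS, "?shop", "?price", "?totalPrice")
```

Without a GROUP BY, all solutions form a single group, so the same aggregate would return one row holding the overall total.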
40. Solution Modifiers
• Ordering of the result set using ORDER BY
• Grouping of the result set using GROUP BY
(aggregate functions are scoped by groups)
• Works for both literal values and resources
SELECT ?person ?name ?friend
FROM <http://www.siks.nl/swcourse/example/>
WHERE {
  ?person a foaf:Person ;
          foaf:name ?name .
  OPTIONAL { ?person foaf:knows ?friend . }
} ORDER BY ASC(?name)
41. Solution Modifiers
• Remove duplicate results
DISTINCT and REDUCED
• Limit or offset number of results
LIMIT, OFFSET (NB: results must be ordered)
SELECT DISTINCT ?person ?name ?friend
WHERE {
  ?person a foaf:Person ;
          foaf:name ?name .
  OPTIONAL { ?person foaf:knows ?friend . }
} ORDER BY ASC(?name)
LIMIT 10
OFFSET 10
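The solution modifiers can be pictured as list operations applied after pattern matching. The sketch below is a simplification that removes duplicates, sorts on one variable and then slices; the names in the example data come from the slides, while the function itself is hypothetical.

```python
def modify(rows, order_var, limit=None, offset=0):
    """Apply DISTINCT, ORDER BY ASC(order_var), OFFSET and LIMIT to solutions."""
    distinct = []
    for row in rows:
        if row not in distinct:  # DISTINCT: drop exact duplicate solutions
            distinct.append(row)
    ordered = sorted(distinct, key=lambda row: row[order_var])
    end = None if limit is None else offset + limit
    return ordered[offset:end]


ROWS = [
    {"?name": "Willem van Hage"},
    {"?name": "Laura Hollink"},
    {"?name": "Rinke Hoekstra"},
    {"?name": "Laura Hollink"},
]
PAGE = modify(ROWS, "?name", limit=2, offset=1)
```

Paging with LIMIT and OFFSET is only repeatable when the order is fixed, which is the point of the NB on the slide.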
45. Query Types
• SELECT queries return variable bindings
• CONSTRUCT queries return an RDF graph
• ASK queries return yes (or no) if the graph pattern
does (or does not) exist in the store.
ASK <uri>
46. Query Types
• SELECT queries return variable bindings
• CONSTRUCT queries return an RDF graph
• ASK queries return yes (or no) if the graph pattern
does (or does not) exist in the store.
• DESCRIBE queries return a description of a resource,
typically a Concise Bounded Description:
• An RDF graph consisting of all triples in which the
specified resource is the subject (followed recursively through blank nodes)
DESCRIBE <uri>
48. Why CONSTRUCT
• Sometimes we need
• Statements from original RDF graph:
data extraction
• New statements derived from original data:
data conversion, views over data
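A CONSTRUCT query fills a triple template once per solution and returns the resulting graph. A minimal sketch, assuming solutions are dictionaries of bindings and variables start with "?"; the property :knownBy is hypothetical and stands for a new statement derived from the original data.

```python
def construct(template, solutions):
    """Instantiate every template triple for every solution."""
    graph = set()
    for binding in solutions:
        for pattern in template:
            triple = tuple(binding.get(term, term) for term in pattern)
            # Triples that still contain an unbound variable are left out.
            if not any(term.startswith("?") for term in triple):
                graph.add(triple)
    return graph


SOLUTIONS = [{"?x": ":laura", "?y": ":rinke"}, {"?x": ":willem"}]
TEMPLATE = [("?y", ":knownBy", "?x")]  # hypothetical derived property
NEW_GRAPH = construct(TEMPLATE, SOLUTIONS)
```

Using the original predicates in the template gives data extraction; using new ones, as here, gives conversion or a view over the data.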
49. Result Formats
• DESCRIBE and CONSTRUCT queries:
RDF/XML, Turtle, N3
• ASK and SELECT queries:
XML, RDF/XML, JSON, TEXT, sometimes HTML
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
<head>
<variable name="x"/>
<variable name="hpage"/>
</head>
<results>
<result>
<binding name="x">
<literal datatype="…/XMLSchema#integer">30</literal>
</binding>
<binding name="hpage">
<uri>http://work.example.org/bob/</uri>
</binding>
</result>
</results>
</sparql>
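A result document in this XML format can be read with Python's standard xml.etree.ElementTree module. The sketch below assumes a small sample document modelled on the one above (without the datatype attribute) and returns each result as a dictionary from variable name to value text; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="x"/><variable name="hpage"/></head>
  <results>
    <result>
      <binding name="x"><literal>30</literal></binding>
      <binding name="hpage"><uri>http://work.example.org/bob/</uri></binding>
    </result>
  </results>
</sparql>"""


def parse_results(xml_text):
    """Return one {variable: value} dictionary per <result> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text
        rows.append(row)
    return rows


ROWS = parse_results(SAMPLE)
```

This sketch discards the distinction between URIs and literals; a fuller reader would also keep the child element's tag and any datatype or language attribute.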
50. Discussion
• URIs enable global data integration and enrichment
• RDF is the data model (it’s all graphs!)
• RDF Schema is the vocabulary language
• SKOS is the concept description language
• SPARQL is the query language