2. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
3. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
4. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
5. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
6. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
7. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
8. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
9. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
10. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
11. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
12. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address>
</person>
How many pizzas were delivered to
Nuno’s home during his PhD?
2012
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
1 / 24
13. Using Query Languages for Data Integration
SQL
to Nuno’s home during his PhD?
select address from clients
How many pizzas were delivered
Mediator / Data Warehouse
where person = "Nuno"
relation
SPARQL
select $lat $long
from <geonames.org/..>
where { $point geo:lat $lat;
geo:long $long }
solution sequence
XQuery
for $person in doc("eat.ie/..")
return $person//name
sequence of items
2 / 24
14. Using Query Languages for Data Integration
SQL
to Nuno’s home during his PhD?
select address from clients
How many pizzas were delivered
Mediator / Data Warehouse
where person = "Nuno"
relation
SPARQL
select $lat $long
from <geonames.org/..>
where { $point geo:lat $lat;
geo:long $long }
solution sequence
XQuery
for $person in doc("eat.ie/..")
return $person//name
sequence of items
2 / 24
15. Using Query Languages for Data Integration
SQL
to Nuno’s home during his PhD?
select address from clients
How many pizzas were delivered
Mediator / Data Warehouse
where person = "Nuno"
relation
SPARQL
select $lat $long
from <geonames.org/..>
where { $point geo:lat $lat;
geo:long $long }
solution sequence
XQuery
for $person in doc("eat.ie/..")
return $person//name
sequence of items
2 / 24
16. Using Query Languages for Data Integration
SQL
to Nuno’s home during his PhD?
select address from clients
How many pizzas were delivered
Mediator / Data Warehouse
where person = "Nuno"
relation
SPARQL
select $lat $long
from <geonames.org/..>
where { $point geo:lat $lat;
geo:long $long }
solution sequence
XQuery
for $person in doc("eat.ie/..")
return $person//name
sequence of items
2 / 24
17. Using Query Languages for Data Integration
SQL
to Nuno’s home during his PhD?
select address from clients
How many pizzas were delivered
Mediator / Data Warehouse
where person = "Nuno"
relation
SPARQL
select $lat $long
from <geonames.org/..>
where { $point geo:lat $lat;
geo:long $long }
solution sequence
XQuery
for $person in doc("eat.ie/..")
return $person//name
sequence of items
2 / 24
18. Using Query Languages for Data Integration
SQL
to Nuno’s home during his PhD?
select address from clients
How many pizzas were delivered
Mediator / Data Warehouse
where person = "Nuno"
relation
SPARQL
select $lat $long
from <geonames.org/..>
where { $point geo:lat $lat;
geo:long $long }
solution sequence
XQuery
for $person in doc("eat.ie/..")
return $person//name
sequence of items
2 / 24
19. How to represent context information in RDF?
RDF triples
:nuno :address :Galway .
:nuno :address :Dublin .
3 / 24
20. How to represent context information in RDF?
RDF triples
:nuno :address :Galway .
:nuno :address :Dublin . Not enough!
3 / 24
21. How to represent context information in RDF?
RDF triples
:nuno :address :Galway .
:nuno :address :Dublin . Not enough!
Domain vocabulary/ontology
:address1 a :AddressRecord;
:person :nuno;
:address :Galway;
:start "2008" ;
:end "2012" .
3 / 24
22. How to represent context information in RDF?
RDF triples
:nuno :address :Galway .
:nuno :address :Dublin . Not enough!
Domain vocabulary/ontology
:address1 a :AddressRecord;
:person :nuno;
:address :Galway;
:start "2008" ;
:end "2012" .
Reification
:address1 rdf:type rdf:Statement
rdf:subject :nuno;
rdf:predicate :address ;
rdf:object :Dublin ;
:loc <http://eat.ie/> .
3 / 24
23. How to represent context information in RDF?
RDF triples
:nuno :address :Galway .
:nuno :address :Dublin . Not enough!
Domain vocabulary/ontology
:address1 a :AddressRecord;
:person :nuno;
:address :Galway;
:start "2008" ;
:end "2012" .
Reification
:address1 rdf:type rdf:Statement
rdf:subject :nuno;
rdf:predicate :address ;
rdf:object :Dublin ;
:loc <http://eat.ie/> .
Named Graphs
3 / 24
24. How to represent context information in RDF?
RDF triples
:nuno :address :Galway .
:nuno :address :Dublin . Not enough!
Domain vocabulary/ontology
:address1 a :AddressRecord;
:person :nuno;
:address :Galway;
No defined
:start "2008" ; semantics!
:end "2012" .
Reification
:address1 rdf:type rdf:Statement
rdf:subject :nuno;
rdf:predicate :address ;
No defined
rdf:object :Dublin ; semantics!
:loc <http://eat.ie/> .
Named Graphs
3 / 24
25. Hypothesis and Research Questions
Hypothesis
Efficient data integration over heterogenous data sources can be
achieved by
4 / 24
26. Hypothesis and Research Questions
Hypothesis
Efficient data integration over heterogenous data sources can be
achieved by
a combined query language that accesses heterogenous data in
its original sources
4 / 24
27. Hypothesis and Research Questions
Hypothesis
Efficient data integration over heterogenous data sources can be
achieved by
a combined query language that accesses heterogenous data in
its original sources
optimisations for efficient query evaluation for this language
4 / 24
28. Hypothesis and Research Questions
Hypothesis
Efficient data integration over heterogenous data sources can be
achieved by
a combined query language that accesses heterogenous data in
its original sources
optimisations for efficient query evaluation for this language
an RDF-based format with support for context information
4 / 24
29. Hypothesis and Research Questions
Hypothesis
Efficient data integration over heterogenous data sources can be
achieved by
a combined query language that accesses heterogenous data in
its original sources
optimisations for efficient query evaluation for this language
an RDF-based format with support for context information
Research Questions
How to design a query language that bridges the different
formats?
4 / 24
30. Hypothesis and Research Questions
Hypothesis
Efficient data integration over heterogenous data sources can be
achieved by
a combined query language that accesses heterogenous data in
its original sources
optimisations for efficient query evaluation for this language
an RDF-based format with support for context information
Research Questions
How to design a query language that bridges the different
formats?
Can we reuse existing optimisations from other query languages?
4 / 24
31. Hypothesis and Research Questions
Hypothesis
Efficient data integration over heterogenous data sources can be
achieved by
a combined query language that accesses heterogenous data in
its original sources
optimisations for efficient query evaluation for this language
an RDF-based format with support for context information
Research Questions
How to design a query language that bridges the different
formats?
Can we reuse existing optimisations from other query languages?
How to adapt RDF and SPARQL for context information?
4 / 24
35. XSPARQL
Transformation language between RDB,
X SPA R QL XML, and RDF
M D
L F
5 / 24
36. XSPARQL
Transformation language between RDB,
X SPA R QL XML, and RDF
M D
L F Syntactic extension of XQuery
5 / 24
37. XSPARQL
Transformation language between RDB,
X SPA R QL XML, and RDF
M D
L F Syntactic extension of XQuery
Semantics based on XQuery’s semantics
5 / 24
38. XSPARQL
Transformation language between RDB,
X SPA R QL XML, and RDF
M D
L F Syntactic extension of XQuery
Semantics based on XQuery’s semantics
Why based on XQuery?
Expressive language
Use as scripting language
5 / 24
39. XSPARQL
Transformation language between RDB,
X SPA R QL XML, and RDF
M D
L F Syntactic extension of XQuery
Semantics based on XQuery’s semantics
Why based on XQuery?
Expressive language
Use as scripting language
Arbitrary Nesting of expressions
5 / 24
40. Same Language for each Format
for var in Expr
let var := Expr
for SelectSpec
varlist
from RelationList
DatasetClause
where Expr
order by Expr }
{ pattern
where WhereSpecList
return Expr
return Expr
XSPARQL
XSPARQL
for $lat $long doc("eat.ie/..")
for $person in from <geonames.org/..>
for address as $address from clients
where {$person//name $lat;
$point geo:lat
returnperson = "Nuno"
where
geo:long $long }
return $address
return $lat
6 / 24
41. Same Language for each Format
for var in Expr
let var := Expr
for SelectSpec
varlist
from RelationList
DatasetClause
where Expr
order by Expr }
{ pattern
where WhereSpecList
return Expr
return Expr
XSPARQL
XSPARQL
XQuery $long
for $lat from <geonames.org/..>
for address as $address from clients
for $person in geo:lat $lat;
where { $point doc("eat.ie/..")
where person = "Nuno"
return $person//name $long }
geo:long
return $address
return $lat
6 / 24
42. Same Language for each Format
for var in Expr
let var := Expr
for SelectSpec
varlist
from RelationList
DatasetClause
where Expr
order by Expr }
{ pattern
where WhereSpecList
return Expr
return Expr
XSPARQL
XSPARQL
XSPARQL
for $lat $long from <geonames.org/..>
for address as $address from clients
for $person in doc("eat.ie/..")
where { $point geo:lat $lat;
where person = "Nuno"
return $person//name $long }
geo:long
return $address
return $lat
6 / 24
43. Same Language for each Format
for var in Expr
let var := Expr
for SelectSpec
varlist
from RelationList
DatasetClause
where Expr
order by Expr }
{ pattern
where WhereSpecList
return Expr
return Expr
XSPARQL
XSPARQL
for $lat $long doc("eat.ie/..")
for $person in from <geonames.org/..>
for address as $address from clients
where {$person//name $lat;
$point geo:lat
returnperson = "Nuno"
where
geo:long $long }
return $address
return $lat
6 / 24
44. Same Language for each Format
for var in Expr
let var := Expr
for SelectSpec
varlist
from RelationList
DatasetClause
where Expr
order by Expr }
{ pattern
where WhereSpecList
return Expr
return Expr
XSPARQL
XSPARQL
for $lat $long doc("eat.ie/..")
for $person in from <geonames.org/..>
for address as $address from clients
where {$person//name $lat;
$point geo:lat
returnperson = "Nuno"
where
geo:long $long }
return $address
return $lat
6 / 24
45. Creating RDF with XSPARQL
Convert online orders into RDF Arbitrary XSPARQL
prefix : <http://pizza-vocab.ie/>
expressions in subject,
predicate, and object
for $person in doc("eat.ie/..")//name
construct { :meal a :FoodOrder; :author {$person} } construct
clause generates
RDF
7 / 24
46. Creating RDF with XSPARQL
Convert online orders into RDF Arbitrary XSPARQL
prefix : <http://pizza-vocab.ie/>
expressions in subject,
predicate, and object
for $person in doc("eat.ie/..")//name
construct { :meal a :FoodOrder; :author {$person} } construct
clause generates
RDF
7 / 24
47. Creating RDF with XSPARQL
Convert online orders into RDF Arbitrary XSPARQL
prefix : <http://pizza-vocab.ie/>
expressions in subject,
predicate, and object
for $person in doc("eat.ie/..")//name
construct { :meal a :FoodOrder; :author {$person} } construct
clause generates
RDF
7 / 24
48. Creating RDF with XSPARQL
Convert online orders into RDF Arbitrary XSPARQL
prefix : <http://pizza-vocab.ie/>
expressions in subject,
predicate, and object
for $person in doc("eat.ie/..")//name
construct { :meal a :FoodOrder; :author {$person} } construct
clause generates
RDF
Query result
@prefix : <http://pizza-vocab.ie/> .
:meal a :FoodOrder .
:meal :author "Nuno" .
7 / 24
49. Integration Query Example
“Display pizza deliveries in Google Maps using KML”
for $person in doc("eat.ie/...")//name
for address as $address from clients
where person = $person
let $uri := fn:concat(
"api.geonames.org/search?q=", $address)
for $lat $long from $uri
where { $point geo:lat $lat; geo:long $long }
return <kml>
<lat>{$lat}<lat>
<long>{$long}</long>
</kml>
8 / 24
50. Integration Query Example
“Display pizza deliveries in Google Maps using KML”
for $person in doc("eat.ie/...")//name
for address as $address from clients
where person = $person
let $uri := fn:concat(
"api.geonames.org/search?q=", $address)
for $lat $long from $uri
where { $point geo:lat $lat; geo:long $long }
return <kml>
<lat>{$lat}<lat>
<long>{$long}</long>
</kml>
8 / 24
51. Integration Query Example
“Display pizza deliveries in Google Maps using KML”
for $person in doc("eat.ie/...")//name
for address as $address from clients
where person = $person
let $uri := fn:concat(
"api.geonames.org/search?q=", $address)
for $lat $long from $uri
where { $point geo:lat $lat; geo:long $long }
return <kml>
<lat>{$lat}<lat>
<long>{$long}</long>
</kml>
8 / 24
52. Integration Query Example
“Display pizza deliveries in Google Maps using KML”
for $person in doc("eat.ie/...")//name
for address as $address from clients
where person = $person
let $uri := fn:concat(
"api.geonames.org/search?q=", $address)
for $lat $long from $uri
where { $point geo:lat $lat; geo:long $long }
return <kml>
<lat>{$lat}<lat>
<long>{$long}</long>
</kml>
8 / 24
53. Integration Query Example
“Display pizza deliveries in Google Maps using KML”
for $person in doc("eat.ie/...")//name
for address as $address from clients
where person = $person
let $uri := fn:concat(
"api.geonames.org/search?q=", $address)
for $lat $long from $uri
where { $point geo:lat $lat; geo:long $long }
return <kml>
<lat>{$lat}<lat>
<long>{$long}</long>
</kml>
8 / 24
54. Integration Query Example
“Display pizza deliveries in Google Maps using KML”
for $person in doc("eat.ie/...")//name
for address as $address from clients
where person = $person
let $uri := fn:concat(
"api.geonames.org/search?q=", $address)
for $lat $long from $uri
where { $point geo:lat $lat; geo:long $long }
return <kml>
<lat>{$lat}<lat>
<long>{$long}</long>
</kml>
More involved XSPARQL queries: RDB2RDF
Direct Mapping: ∼130 LOC
R2RML: ∼290 LOC
8 / 24
55. Language Semantics
Extension of the XQuery Evaluation Semantics
dynEnv.globalPosition = Pos 1 , · · · , Pos j
dynEnv fs:dataset(DatasetClause) ⇒ Dataset
for $person dynEnv fs:sparql Dataset ,,WhereClause,
in doc("eat.ie/...")//name
dynEnv
Dataset WhereClause,
fs:sparql SolutionModifier eval(SQL SELECT)
⇒ µ1 ,,......,,µm
⇒ µ1 µm
SolutionModifier
for address as $address from clients dynEnv.varValue ⇒
where person globalPosition Pos 1 , · · · , Pos j , 1
dynEnv + = $person relation dynEnv.varValue
+activeDataset(Dataset)
let $uri + varValue Var 1; ⇒ fs:value µ1 , Var 1 ;
:= fn:concat(. .. ExprSingle ⇒ Value 1
"api.geonames.org/search?q=",, Var n
Var n ⇒ fs:value µ1 $address) person
for $lat $long from $uri .
where { $point geo:lat $lat; geo:long $long }”nuno”
.
.
return <kml>
dynEnv + globalPosition Pos 1 , · · · , Pos j , m + activeDataset(Dataset)
dynEnv.varValue ⇒
Var 1 ⇒
<lat>{$lat}<lat> fs:value µm , Var 1 ;
eval(SPARQL SELECT)
+ varValue ... ;
<long>{$long}</long>
Var n ⇒ fs:value µm , Var n
SPARQL compatible
solutions sol. seq.
ExprSingle ⇒ Value m
</kml> for $Var · · · $Var DatasetClause {{ person ⇒ ”nuno” }}
1 n dynEnv.varValue
dynEnv WhereClause SolutionModifier ⇒ Value 1 , . . . , Value m
return ExprSingle
9 / 24
56. Language Semantics
Extension of the XQuery Evaluation Semantics
for add variables and values to the dynamic environment
dynEnv.globalPosition = Pos 1 , · · · , Pos j
dynEnv fs:dataset(DatasetClause) ⇒ Dataset
for $person dynEnv fs:sparql Dataset ,,WhereClause,
in doc("eat.ie/...")//name
dynEnv
Dataset WhereClause,
fs:sparql SolutionModifier eval(SQL SELECT)
⇒ µ1 ,,......,,µm
⇒ µ1 µm
SolutionModifier
for address as $address from clients dynEnv.varValue ⇒
where person globalPosition Pos 1 , · · · , Pos j , 1
dynEnv + = $person relation dynEnv.varValue
+activeDataset(Dataset)
let $uri + varValue Var 1; ⇒ fs:value µ1 , Var 1 ;
:= fn:concat(. .. ExprSingle ⇒ Value 1
"api.geonames.org/search?q=",, Var n
Var n ⇒ fs:value µ1 $address) person
for $lat $long from $uri .
where { $point geo:lat $lat; geo:long $long }”nuno”
.
.
return <kml>
dynEnv + globalPosition Pos 1 , · · · , Pos j , m + activeDataset(Dataset)
dynEnv.varValue ⇒
Var 1 ⇒
<lat>{$lat}<lat> fs:value µm , Var 1 ;
eval(SPARQL SELECT)
+ varValue ... ;
<long>{$long}</long>
Var n ⇒ fs:value µm , Var n
SPARQL compatible
solutions sol. seq.
ExprSingle ⇒ Value m
</kml> for $Var · · · $Var DatasetClause {{ person ⇒ ”nuno” }}
1 n dynEnv.varValue
dynEnv WhereClause SolutionModifier ⇒ Value 1 , . . . , Value m
return ExprSingle
9 / 24
57. Language Semantics
Extension of the XQuery Evaluation Semantics
for add variables and values to the dynamic environment
Reuse semantics of original languages for the new expressions
dynEnv.globalPosition = Pos 1 , · · · , Pos j
dynEnv fs:dataset(DatasetClause) ⇒ Dataset
for $person dynEnv fs:sparql Dataset ,,WhereClause,
in doc("eat.ie/...")//name
dynEnv
Dataset WhereClause,
fs:sparql SolutionModifier eval(SQL SELECT)
⇒ µ1 ,,......,,µm
⇒ µ1 µm
SolutionModifier
for address as $address from clients dynEnv.varValue ⇒
where person globalPosition Pos 1 , · · · , Pos j , 1
dynEnv + = $person relation dynEnv.varValue
+activeDataset(Dataset)
let $uri + varValue Var 1; ⇒ fs:value µ1 , Var 1 ;
:= fn:concat(. .. ExprSingle ⇒ Value 1
"api.geonames.org/search?q=",, Var n
Var n ⇒ fs:value µ1 $address) person
for $lat $long from $uri .
where { $point geo:lat $lat; geo:long $long }”nuno”
.
.
return <kml>
dynEnv + globalPosition Pos 1 , · · · , Pos j , m + activeDataset(Dataset)
dynEnv.varValue ⇒
Var 1 ⇒
<lat>{$lat}<lat> fs:value µm , Var 1 ;
eval(SPARQL SELECT)
+ varValue ... ;
<long>{$long}</long>
Var n ⇒ fs:value µm , Var n
SPARQL compatible
solutions sol. seq.
ExprSingle ⇒ Value m
</kml> for $Var · · · $Var DatasetClause {{ person ⇒ ”nuno” }}
1 n dynEnv.varValue
dynEnv WhereClause SolutionModifier ⇒ Value 1 , . . . , Value m
return ExprSingle
9 / 24
58. Language Semantics
Extension of the XQuery Evaluation Semantics
for add variables and values to the dynamic environment
Reuse semantics of original languages for the new expressions
dynEnv.globalPosition = Pos 1 , · · · , Pos j
dynEnv fs:dataset(DatasetClause) ⇒ Dataset
for $person dynEnv fs:sparql Dataset ,,WhereClause,
in doc("eat.ie/...")//name
dynEnv
Dataset WhereClause,
fs:sparql SolutionModifier eval(SQL SELECT)
⇒ µ1 ,,......,,µm
⇒ µ1 µm
SolutionModifier
for address as $address from clients dynEnv.varValue ⇒
where person globalPosition Pos 1 , · · · , Pos j , 1
dynEnv + = $person relation dynEnv.varValue
+activeDataset(Dataset)
let $uri + varValue Var 1; ⇒ fs:value µ1 , Var 1 ;
:= fn:concat(. .. ExprSingle ⇒ Value 1
"api.geonames.org/search?q=",, Var n
Var n ⇒ fs:value µ1 $address) person
for $lat $long from $uri .
where { $point geo:lat $lat; geo:long $long }”nuno”
.
.
return <kml>
dynEnv + globalPosition Pos 1 , · · · , Pos j , m + activeDataset(Dataset)
dynEnv.varValue ⇒
Var 1 ⇒
<lat>{$lat}<lat> fs:value µm , Var 1 ;
eval(SPARQL SELECT)
+ varValue ... ;
<long>{$long}</long>
Var n ⇒ fs:value µm , Var n
SPARQL compatible
solutions sol. seq.
ExprSingle ⇒ Value m
</kml> for $Var · · · $Var DatasetClause {{ person ⇒ ”nuno” }}
1 n dynEnv.varValue
dynEnv WhereClause SolutionModifier ⇒ Value 1 , . . . , Value m
return ExprSingle
9 / 24
59. Language Semantics
Extension of the XQuery Evaluation Semantics
for add variables and values to the dynamic environment
Reuse semantics of original languages for the new expressions
dynEnv.globalPosition = Pos 1 , · · · , Pos j
dynEnv fs:dataset(DatasetClause) ⇒ Dataset
for $person dynEnv fs:sparql Dataset ,,WhereClause,
in doc("eat.ie/...")//name
dynEnv
Dataset WhereClause,
fs:sparql SolutionModifier eval(SQL SELECT)
⇒ µ1 ,,......,,µm
⇒ µ1 µm
SolutionModifier
for address as $address from clients dynEnv.varValue ⇒
where person globalPosition Pos 1 , · · · , Pos j , 1
dynEnv + = $person relation dynEnv.varValue
+activeDataset(Dataset)
let $uri + varValue Var 1; ⇒ fs:value µ1 , Var 1 ;
:= fn:concat(. .. ExprSingle ⇒ Value 1
"api.geonames.org/search?q=",, Var n
Var n ⇒ fs:value µ1 $address) person
for $lat $long from $uri .
where { $point geo:lat $lat; geo:long $long }”nuno”
.
.
return <kml>
dynEnv + globalPosition Pos 1 , · · · , Pos j , m + activeDataset(Dataset)
dynEnv.varValue ⇒
Var 1 ⇒
<lat>{$lat}<lat> fs:value µm , Var 1 ;
eval(SPARQL SELECT)
+ varValue ... ;
<long>{$long}</long>
Var n ⇒ fs:value µm , Var n
SPARQL compatible
solutions sol. seq.
ExprSingle ⇒ Value m
</kml> for $Var · · · $Var DatasetClause {{ person ⇒ ”nuno” }}
1 n dynEnv.varValue
dynEnv WhereClause SolutionModifier ⇒ Value 1 , . . . , Value m
return ExprSingle
9 / 24
60. Language Semantics
Extension of the XQuery Evaluation Semantics
for add variables and values to the dynamic environment
Reuse semantics of original languages for the new expressions
dynEnv.globalPosition = Pos 1 , · · · , Pos j
dynEnv fs:dataset(DatasetClause) ⇒ Dataset
for $person dynEnv fs:sparql Dataset ,,WhereClause,
in doc("eat.ie/...")//name
dynEnv
Dataset WhereClause,
fs:sparql SolutionModifier eval(SQL SELECT)
⇒ µ1 ,,......,,µm
⇒ µ1 µm
SolutionModifier
for address as $address from clients dynEnv.varValue ⇒
where person globalPosition Pos 1 , · · · , Pos j , 1
dynEnv + = $person relation dynEnv.varValue
+activeDataset(Dataset)
let $uri + varValue Var 1; ⇒ fs:value µ1 , Var 1 ;
:= fn:concat(. .. ExprSingle ⇒ Value 1
"api.geonames.org/search?q=",, Var n
Var n ⇒ fs:value µ1 $address) person
for $lat $long from $uri .
where { $point geo:lat $lat; geo:long $long }”nuno”
.
.
return <kml>
dynEnv + globalPosition Pos 1 , · · · , Pos j , m + activeDataset(Dataset)
dynEnv.varValue ⇒
Var 1 ⇒
<lat>{$lat}<lat> fs:value µm , Var 1 ;
eval(SPARQL SELECT)
+ varValue ... ;
<long>{$long}</long>
Var n ⇒ fs:value µm , Var n
SPARQL compatible
solutions sol. seq.
ExprSingle ⇒ Value m
</kml> for $Var · · · $Var DatasetClause {{ person ⇒ ”nuno” }}
1 n dynEnv.varValue
dynEnv WhereClause SolutionModifier ⇒ Value 1 , . . . , Value m
return ExprSingle
9 / 24
61. Language Semantics
Extension of the XQuery Evaluation Semantics
for add variables and values to the dynamic environment
Reuse semantics of original languages for the new expressions
dynEnv.globalPosition = Pos 1 , · · · , Pos j
dynEnv fs:dataset(DatasetClause) ⇒ Dataset
for $person dynEnv fs:sparql Dataset ,,WhereClause,
in doc("eat.ie/...")//name
dynEnv
Dataset WhereClause,
fs:sparql SolutionModifier eval(SQL SELECT)
⇒ µ1 ,,......,,µm
⇒ µ1 µm
SolutionModifier
for address as $address from clients dynEnv.varValue ⇒
where person globalPosition Pos 1 , · · · , Pos j , 1
dynEnv + = $person relation dynEnv.varValue
+activeDataset(Dataset)
let $uri + varValue Var 1; ⇒ fs:value µ1 , Var 1 ;
:= fn:concat(. .. ExprSingle ⇒ Value 1
"api.geonames.org/search?q=",, Var n
Var n ⇒ fs:value µ1 $address) person
for $lat $long from $uri .
where { $point geo:lat $lat; geo:long $long }”nuno”
.
.
return <kml>
dynEnv + globalPosition Pos 1 , · · · , Pos j , m + activeDataset(Dataset)
dynEnv.varValue ⇒
Var 1 ⇒
<lat>{$lat}<lat> fs:value µm , Var 1 ;
eval(SPARQL SELECT)
+ varValue ... ;
<long>{$long}</long>
Var n ⇒ fs:value µm , Var n
SPARQL compatible
solutions sol. seq.
ExprSingle ⇒ Value m
</kml> for $Var · · · $Var DatasetClause {{ person ⇒ ”nuno” }}
1 n dynEnv.varValue
dynEnv WhereClause SolutionModifier ⇒ Value 1 , . . . , Value m
return ExprSingle
9 / 24
62. Language Semantics
Extension of the XQuery Evaluation Semantics
for add variables and values to the dynamic environment
Reuse semantics of original languages for the new expressions
dynEnv.globalPosition = Pos 1 , · · · , Pos j
dynEnv fs:dataset(DatasetClause) ⇒ Dataset
for $person dynEnv fs:sparql Dataset ,,WhereClause,
in doc("eat.ie/...")//name
dynEnv
Dataset WhereClause,
fs:sparql SolutionModifier eval(SQL SELECT)
⇒ µ1 ,,......,,µm
⇒ µ1 µm
SolutionModifier
for address as $address from clients dynEnv.varValue ⇒
where person globalPosition Pos 1 , · · · , Pos j , 1
dynEnv + = $person relation dynEnv.varValue
+activeDataset(Dataset)
let $uri + varValue Var 1; ⇒ fs:value µ1 , Var 1 ;
:= fn:concat(. .. ExprSingle ⇒ Value 1
"api.geonames.org/search?q=",, Var n
Var n ⇒ fs:value µ1 $address) person
for $lat $long from $uri .
where { $point geo:lat $lat; geo:long $long }”nuno”
.
.
return <kml>
dynEnv + globalPosition Pos 1 , · · · , Pos j , m + activeDataset(Dataset)
dynEnv.varValue ⇒
Var 1 ⇒
<lat>{$lat}<lat> fs:value µm , Var 1 ;
eval(SPARQL SELECT)
+ varValue ... ;
<long>{$long}</long>
Var n ⇒ fs:value µm , Var n
SPARQL compatible
solutions sol. seq.
ExprSingle ⇒ Value m
</kml> for $Var · · · $Var DatasetClause {{ person ⇒ ”nuno” }}
1 n dynEnv.varValue
dynEnv WhereClause SolutionModifier ⇒ Value 1 , . . . , Value m
return ExprSingle
9 / 24
63. Language Semantics
Extension of the XQuery Evaluation Semantics
for add variables and values to the dynamic environment
Reuse semantics of original languages for the new expressions
dynEnv.globalPosition = Pos 1 , · · · , Pos j
dynEnv fs:dataset(DatasetClause) ⇒ Dataset
for $person dynEnv fs:sparql Dataset ,,WhereClause,
in doc("eat.ie/...")//name
dynEnv
Dataset WhereClause,
fs:sparql SolutionModifier eval(SQL SELECT)
⇒ µ1 ,,......,,µm
⇒ µ1 µm
SolutionModifier
for address as $address from clients dynEnv.varValue ⇒
where person globalPosition Pos 1 , · · · , Pos j , 1
dynEnv + = $person relation dynEnv.varValue
+activeDataset(Dataset)
let $uri + varValue Var 1; ⇒ fs:value µ1 , Var 1 ;
:= fn:concat(. .. ExprSingle ⇒ Value 1
"api.geonames.org/search?q=",, Var n
Var n ⇒ fs:value µ1 $address) person
for $lat $long from $uri .
where { $point geo:lat $lat; geo:long $long }”nuno”
.
.
return <kml>
dynEnv + globalPosition Pos 1 , · · · , Pos j , m + activeDataset(Dataset)
dynEnv.varValue ⇒
Var 1 ⇒
<lat>{$lat}<lat> fs:value µm , Var 1 ;
eval(SPARQL SELECT)
+ varValue ... ;
<long>{$long}</long>
Var n ⇒ fs:value µm , Var n
SPARQL compatible
solutions sol. seq.
ExprSingle ⇒ Value m
</kml> for $Var · · · $Var DatasetClause {{ person ⇒ ”nuno” }}
1 n dynEnv.varValue
dynEnv WhereClause SolutionModifier ⇒ Value 1 , . . . , Value m
return ExprSingle
9 / 24
65. Language Implementation
Reuse components (SPARQL engine, Relational Database)
Implemented XSPARQL by rewriting to XQuery
10 / 24
66. Language Implementation
Reuse components (SPARQL engine, Relational Database)
Implemented XSPARQL by rewriting to XQuery
Semantics implemented by substitution of bound variables
10 / 24
67. Language Implementation
Reuse components (SPARQL engine, Relational Database)
Implemented XSPARQL by rewriting to XQuery
Semantics implemented by substitution of bound variables
Bound Variable Substitution
Bound variables are replaced by their value at runtime
for $person in doc("eat.ie/...")//name
for address as $address from clients fn:concat(
where person = $person "SELECT address from clients
where person = ", $person)
10 / 24
68. Language Implementation
Reuse components (SPARQL engine, Relational Database)
Implemented XSPARQL by rewriting to XQuery
Semantics implemented by substitution of bound variables
Bound Variable Substitution
Bound variables are replaced by their value at runtime
Implemented in the generated XQuery
for $person in doc("eat.ie/...")//name
for address as $address from clients fn:concat(
where person = $person "SELECT address from clients
where person = ", $person)
10 / 24
69. Language Implementation
Reuse components (SPARQL engine, Relational Database)
Implemented XSPARQL by rewriting to XQuery
Semantics implemented by substitution of bound variables
Bound Variable Substitution
Bound variables are replaced by their value at runtime
Implemented in the generated XQuery
Pushing the variable bindings into the respective query engine
for $person in doc("eat.ie/...")//name
for address as $address from clients fn:concat(
where person = $person "SELECT address from clients
where person = ", $person)
10 / 24
70. Implementation
XSPARQL XML /
query RDF
XSPARQL
Enhanced
Query XQuery
XQuery
Rewriter query
engine
SQL XQuery SPARQL
Custom XQuery func-
tions implemented as
Java methods RDB XML
XML
XML RDF
RDF
RDF
11 / 24
71. Implementation
XSPARQL XML /
query RDF
XSPARQL
Enhanced
Query XQuery
XQuery
Rewriter query
engine
SQL XQuery SPARQL
Custom XQuery func-
tions implemented as
Java methods RDB XML
XML
XML RDF
RDF
RDF
11 / 24
72. Implementation
XSPARQL XML /
query RDF
XSPARQL
Enhanced
Query XQuery
XQuery
Rewriter query
engine
SQL XQuery SPARQL
Custom XQuery func-
tions implemented as
Java methods RDB XML
XML
XML RDF
RDF
RDF
11 / 24
73. Implementation
XSPARQL XML /
query RDF
XSPARQL
Enhanced
Query XQuery
XQuery
Rewriter query
engine
SQL XQuery SPARQL
Custom XQuery func-
tions implemented as
Java methods RDB XML
XML
XML RDF
RDF
RDF
11 / 24
77. XSPARQL Evaluation
XMark XMarkRDF XMarkRDB
Experimental Results
Compared to native XQuery (XMark)
RDB and RDF: same order of magnitude for most queries
12 / 24
78. XSPARQL Evaluation
XMark XMarkRDF XMarkRDB
Experimental Results
Compared to native XQuery (XMark)
RDB and RDF: same order of magnitude for most queries
Except on RDF nested queries (self joining data)
several orders of magnitude slower
due to the number of calls to the SPARQL engine
12 / 24
79. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
80. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
81. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
82. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
83. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
84. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
85. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
86. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
87. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
88. Rewriting techniques for Nested Queries
Q8 : “List the names of persons and the number of items they bought.”
let $aux := sparqlQuery(
fn:concat(
"SELECT * from <input.rdf>
nuno ,
nuno
where { $ca :buyer $person .}" )
) Returns axel ,
axel
.
.
for $person $name from <input.rdf> .
Nested Loop rewriting
where { $person foaf:name $name }
return <item person="{$name}">
{count( sparqlQuery(
for * from <input.rdf> fn:concat(
Join in XQuery
where { $ca :buyer $person .} Other optimisations:
Applied to*related language:
"SELECT from <input.rdf>
return $ca where { $ca :buyer ", $person,
nuno
)} Join in SPARQL
SPARQL2XQuery [SAC2008]
" .}")
axel
</item> )
13 / 24
89. Evaluation of different rewritings - XMarkRDF Q8
Time in seconds (log scale)
XSPARQLRDF
102
Nested Loop
101
SPARQL rewrite
1 2 5 10 20 50 100
Dataset size in MB (log scale)
Optimised rewritings show promising results
Best Results: SPARQL-based
Results included in [JoDS2012]
14 / 24
90. Evaluation of different rewritings - XMarkRDB Q8
Time in seconds (log scale)
Nested Loop
102
101
XSPARQLRDB
100
1 2 5 10 20 50 100
Dataset size in MB (log scale)
Not so effective for RDB, requires different optimisations
Due to schema representation?
Additional Results for this thesis
15 / 24
91. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address> X SPA R QL
</person> M D
2012 L F
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
16 / 24
92. Data Integration: Pizza Delivery Company
RDF
XML
Clients
<person>
/ Orders
<name>Nuno</name>
<address>Dublin</address> X SPA R QL
</person> M D
2012 L F
XML RDB
Clients
From 2008 to 2012
/ Orders
Deliveries person address
Nuno Galway
16 / 24