Supporting virtual integration of
Linked Data
with just-in-time query recompilation
Amsterdam, The Netherlands, September 12 2017
Alessandro Adamou1, Mathieu d’Aquin2, Carlo Allocca1,3, Enrico Motta1
1 Knowledge Media Institute, The Open University, UK
2 Insight Centre for Data Analytics, NUI Galway, Ireland
3 now Samsung Inc.
Outline
•  Motivation
•  Just-in-time query recompilation
•  Implementation
•  Experiments
•  Perspectives
Virtual data integration
•  No ETL process
•  Naturally keeps data up-to-date
•  Unlike data federation, there is still a designated node
•  Favours project economies relying on networking rather
than storage space
•  Serious performance issues!
•  Generally considered less robust
•  Acquiring momentum in industry, per 2016 Gartner
report
•  Maintenance??
Pay-as-you-go integration
1.  Establish mappings between source
schemas and global schema (go)
2.  Obtain feedback on mapping results, e.g.
in terms of precision and recall (pay)
3.  Refine the mappings from step 1
Really “pays off” in virtual data integration
N.W. Paton, K. Christodoulou, A. A. A. Fernandes, B. Parsia, C. Hedeler: Pay-as-you-go data
integration for linked data: opportunities, challenges and architectures. SWIM 2012: 3
•  Target query language other than SPARQL
– Conjunctive, star-shaped queries (good for Web APIs)
{hostname}{/attribute/value}+?{attribute}{&attribute}+
e.g.
http://example.org/api/type/actor/name/clint_eastwood?filmwork
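The URL pattern above can be read as a set of constrained attribute-value pairs followed by the requested attributes. A minimal sketch of that reading in JavaScript; the function name parseTargetQuery and the shape of the returned object are assumptions made for illustration, not part of the reference implementation:

```javascript
// Hypothetical helper (not from the reference implementation): splits a
// target query such as "type/actor/name/clint_eastwood?filmwork" into
// constrained attribute-value pairs and requested attributes.
function parseTargetQuery(query) {
  var parts = query.split('?');
  // Drop empty segments so that leading or trailing slashes are tolerated.
  var path = parts[0].split('/').filter(function (s) { return s.length > 0; });
  if (path.length === 0 || path.length % 2 !== 0) {
    throw new Error('Expected attribute/value pairs, got: ' + parts[0]);
  }
  var constraints = {};
  for (var i = 0; i < path.length; i += 2) {
    constraints[decodeURIComponent(path[i])] = decodeURIComponent(path[i + 1]);
  }
  // Requested attributes are separated by "&" after the "?".
  var requested = parts.length > 1 && parts[1].length > 0
    ? parts[1].split('&').map(function (a) { return decodeURIComponent(a); })
    : [];
  return { constraints: constraints, requested: requested };
}

var parsed = parseTargetQuery('type/actor/name/clint_eastwood?filmwork');
// parsed.constraints holds type=actor and name=clint_eastwood;
// parsed.requested holds the single attribute "filmwork".
```

The hostname part is left out on purpose: only the path and the attribute list matter for recompilation in this sketch.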
type/actor/name/clint_eastwood
SELECT DISTINCT ?filmwork WHERE {
  { dbr:Clint_Eastwood a dbo:Actor
    ; ^(dbo:director|dbo:starring) ?filmwork
  } UNION {
    { ?x owl:sameAs dbr:Clint_Eastwood
    } UNION {
      ?x (movie:actor_name|movie:director_name) ?s
      FILTER (str(?s) = "Clint Eastwood")
    }
    . ?x foaf:made|^(movie:director|movie:actor) ?filmwork
  }
}
Equivalent SPARQL query for federated engine
that supports DBpedia and LinkedMDB
type/actor/name/clint_eastwood
SELECT DISTINCT ?filmwork ?eq WHERE {
  VALUES (?x) {
    ( dbr:Clint_Eastwood )
    ( dbr:Clint_Eastwood_\(actor\) )
  }
  ?x a dbo:Actor
    ; ^(dbo:director|dbo:starring) ?filmwork
  . ?filmwork owl:sameAs|^owl:sameAs ?eq
}
Alternative approach: DBpedia
type/actor/name/clint_eastwood
SELECT DISTINCT ?filmwork ?eq WHERE {
  { VALUES (?x0) {
      ( dbr:Clint_Eastwood )
      ( dbr:Clint_Eastwood_\(actor\) )
    } ?x0 ^owl:sameAs ?x
  } UNION { ?x movie:actor_name "Clint Eastwood"
  } UNION { ?x movie:director_name "Clint Eastwood"
  } . {
    { ?x foaf:made ?filmwork
    } UNION { ?filmwork movie:director ?x
    } UNION { ?filmwork movie:actor ?x }
  } . ?filmwork owl:sameAs|^owl:sameAs ?eq
}
Alternative approach: LinkedMDB
•  Encode integrator’s knowledge of a
dataset schema into a set of primitives,
which will serve as “compilation units”.
•  Manage the compilation units of a query
using two types of structure:
– Microcompilers
– Query skeletons (or templates)
Microcompiler
Let W be the set of all the attribute-value pairs
and Σ the alphabet of a language; a
microcompiler is a function φ : ℘W → Σ∗ that
transforms sets of attribute-value pairs into a
sequence of symbols in that language.
Microcompiler (JS ex.)
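var mc_x_dbp = function (type, name) {
  var pref = 'http://dbpedia.org/resource/';
  // Capitalise the first letter of each underscore-separated word,
  // e.g. clint_eastwood -> Clint_Eastwood.
  var idd = name.replace(/(^|_)[a-z]/g,
    function (f) { return f.toUpperCase(); });
  // Try both the plain resource and the type-disambiguated one.
  return 'VALUES(?x_dbp){ '
    + '( <' + pref + idd + '> )'
    + '( <' + pref + idd + '_(' + type + ')> )'
    + '} ?x_dbp';
};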
type/actor/name/clint_eastwood
VALUES(?x_dbp) {
( <http://dbpedia.org/resource/Clint_Eastwood> )
( <http://dbpedia.org/resource/Clint_Eastwood_(actor)> )
} ?x_dbp
Microcompiler (JS ex. II)
var mc_x_lmdb = function (type, name) {
  // Only actors and directors are handled in this excerpt.
  if (['actor', 'director'].indexOf(type) >= 0) {
    // Reuse the DBpedia microcompiler and follow owl:sameAs backwards.
    var sa = mc_x_dbp(type, name) + ' ^owl:sameAs ?x_lmdb';
    var nam = makename(name); // omitted for simplicity
    return '{ ' + sa + ' } UNION '
      + '{ ?x_lmdb movie:actor_name "' + nam + '" } UNION '
      + '{ ?x_lmdb movie:director_name "' + nam + '" }'; // …
  }
};
type/actor/name/clint_eastwood
{ VALUES(?x_dbp) {
( <http://dbpedia.org/resource/Clint_Eastwood> )
( <http://dbpedia.org/resource/Clint_Eastwood_(actor)> )
} ?x_dbp ^owl:sameAs ?x_lmdb }
UNION { ?x_lmdb movie:actor_name "Clint Eastwood" }
UNION { ?x_lmdb movie:director_name "Clint Eastwood" }
Query skeleton
A query skeleton, or query template, t is a
member of (Σ∪C)∗, where C is an alphabet
called the set of control symbols.
Example I (DBpedia, {filmwork, name}):
<[name]> ^(dbo:director|dbo:starring) ?[filmwork]?
Example II (LinkedMDB, {filmwork, name}):
{
  { <[name]> foaf:made ?[filmwork]? }
  UNION { ?[filmwork]? movie:director ?x_lmdb }
  UNION { ?[filmwork]? movie:actor ?x_lmdb }
} . ?[filmwork]? owl:sameAs|^owl:sameAs ?eq
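One possible way to instantiate a skeleton is plain string substitution over its control symbols. The sketch below assumes that <[attr]> stands for the text a microcompiler produced for a constrained attribute and that ?[attr]? stands for the SPARQL variable of a requested attribute; these semantics, and the name fillSkeleton, are assumptions for illustration rather than the framework's actual behaviour:

```javascript
// Sketch only: the control-symbol semantics below are assumptions.
//   <[attr]>  -> text compiled for a constrained attribute
//   ?[attr]?  -> SPARQL variable named after a requested attribute
function fillSkeleton(skeleton, compiled, requested) {
  return skeleton
    .replace(/<\[([A-Za-z_]+)\]>/g, function (match, attr) {
      if (!Object.prototype.hasOwnProperty.call(compiled, attr)) {
        throw new Error('No compiled fragment for attribute: ' + attr);
      }
      return compiled[attr];
    })
    .replace(/\?\[([A-Za-z_]+)\]\?/g, function (match, attr) {
      if (requested.indexOf(attr) < 0) {
        throw new Error('Attribute not requested: ' + attr);
      }
      return '?' + attr;
    });
}

// Skeleton from Example I; the fragment is a shortened microcompiler output.
var skeleton = '<[name]> ^(dbo:director|dbo:starring) ?[filmwork]?';
var fragment = 'VALUES(?x_dbp){ ( <http://dbpedia.org/resource/Clint_Eastwood> ) } ?x_dbp';
var filled = fillSkeleton(skeleton, { name: fragment }, ['filmwork']);
```

A skeleton whose control symbols cannot all be resolved raises an error here, which is one simple way to detect that it does not apply to a given query.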
JIT framework
[Diagram: a source query enters the compiler, which draws on a data source selection strategy and on per-source sets of microcompilers and query skeletons, and outputs target queries]
compiler: function Φ × ℘(Σ∪C)∗ × ℘W → L
Compilation strategies
•  A manifest is a pair of sets of
microcompilers and query skeletons
•  Grouping into manifests for:
– (a) data sources;
– (b) entity types
•  Data source selection algorithm produces
a set of datasource-query pairs by finding
satisfiable query skeletons (on paper)
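A rough sketch of how manifests and data source selection could fit together, assuming manifests grouped by data source. The object layout (microcompilers, skeletons, bound, free), the function name selectSources and the satisfiability test are all hypothetical; the actual algorithm is not spelled out on these slides:

```javascript
// Sketch under assumptions: a skeleton counts as "satisfiable" when every
// bound attribute is constrained in the query and has a microcompiler,
// and every free attribute is among the requested ones.
function selectSources(manifests, constraints, requested) {
  var pairs = [];
  Object.keys(manifests).forEach(function (source) {
    var manifest = manifests[source];
    manifest.skeletons.forEach(function (sk) {
      var boundOk = sk.bound.every(function (a) {
        return Object.prototype.hasOwnProperty.call(constraints, a)
          && typeof manifest.microcompilers[a] === 'function';
      });
      var freeOk = sk.free.every(function (a) {
        return requested.indexOf(a) >= 0;
      });
      if (boundOk && freeOk) {
        pairs.push({ source: source, skeleton: sk.text });
      }
    });
  });
  return pairs;
}

// Toy manifests: a pair of microcompilers and skeletons per data source.
var manifests = {
  dbpedia: {
    microcompilers: {
      name: function (v) { return '<http://dbpedia.org/resource/' + v + '>'; }
    },
    skeletons: [
      { bound: ['name'], free: ['filmwork'],
        text: '<[name]> ^(dbo:director|dbo:starring) ?[filmwork]?' }
    ]
  },
  linkedmdb: {
    microcompilers: {}, // deliberately empty in this toy setup
    skeletons: [
      { bound: ['name'], free: ['filmwork'],
        text: '<[name]> foaf:made ?[filmwork]?' }
    ]
  }
};

var pairs = selectSources(manifests, { name: 'Clint_Eastwood' }, ['filmwork']);
// Only the dbpedia skeleton is selected, because the toy linkedmdb
// manifest has no microcompiler for "name".
```

Each selected pair would then be turned into a target query by running the microcompilers and filling the skeleton.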
Big Data for Milton Keynes
as a Smart City
M. d'Aquin, A. Adamou, E. Daga, S. Liu, K. Thomas, E. Motta: Dealing
with Diversity in a Smart-City Datahub. Semantics for Smarter Cities
@ISWC 2014: 68-82
Entity-centric data API based on a simplified version of the language
in this presentation
•  https://datahub.mksmart.org
•  https://github.com/mk-smart/entity-centric-api
Implementation
•  Reference open source implementation
written in Java
–  With support for SPARQL and HTTP
dereferencing of RDF
–  Includes JIT logic, custom experimental VDIS and
HTTP API
•  Accepts microcompilers in JavaScript
•  Apache CouchDB map-reduce for atomically
retrieving candidate compilation units
Experiments
What is the price paid to turn a federated
query engine into a virtual data integration
system using JIT recompilation?
Experiments
1.  Benchmark of FedBench [1] queries translated into our target
language
2.  Take a federated query engine (FedX) [2]
3.  Measure the time taken by FedX to execute the original
FedBench SPARQL query
–  On the live endpoints whenever possible
4.  Take each translated query and recompile it into one or
more SPARQL queries (at most one per data source)
–  Execute each query with FedX
5.  Measure for each:
–  Increase in size of “correct” result set
–  Recompilation overhead
–  Overall turnaround time of queries
[1] http://fedbench.fluidops.net
[2] https://www.fluidops.com/en/company/knowledge/open_source
Experiments
Example: FedBench Cross-domain CD3
CD3 (original):
SELECT ?pres ?party ?page WHERE {
  ?pres rdf:type dbpedia-owl:President .
  ?pres dbpedia-owl:nationality dbpedia:United_States .
  ?pres dbpedia-owl:party ?party .
  ?x nytimes:topicPage ?page .
  ?x owl:sameAs ?pres
}
CD3C:
type/president/country/united_states?party&webpage
Results I
Query   Result set VDI boost   Notes
FedBench Cross-Domain
CD1C    m * 1.387
CD2C    52 new results         Plain FedX yielded no results
CD3C    67 new results         Plain FedX yielded no results, has SERVICE clause
CD4C    m * 4480.0             Some microcompilers perform queries
CD5C    m * 1.0                No increment from recompilation
FedBench Life Sciences
LS1C    m * 1.0                Query could not be expanded
LS2C    m * 1.0                No increment from recompilation
LS3C    70981 results          Plain FedX crashed
FedBench Linked Data
LD5C    m * 3.677
LD9C    4 new results          Plain FedX yielded no results
LD10C   m * 17.0
LD11C   m * 1.65
Results II
Query   Time (ms) - FedX   Time (ms) - FedX+JIT   JIT overhead   Query TAT
FedBench Cross-Domain
CD1C    300 ± 050          420 ± 109              400 ± 020      800 ± 120
CD2C    175 ± 005          475 ± 055              432 ± 009      1500 ± 123
CD3C    158 ± 004          446 ± 076              408 ± 106      1067 ± 048
CD4C    8835 ± 954         420 ± 100              787 ± 165      7480 ± 569
CD5C    851 ± 319          519 ± 145              448 ± 031      548 ± 061
FedBench Life Sciences
LS1C    795 ± 371          892 ± 043              query could not be expanded
LS2C    484 ± 166          420 ± 100              444 ± 061      370 ± 061
LS3C    !ERROR             6653 ± 861             query could not be expanded
FedBench Linked Data
LD5C    795 ± 371          801 ± 078              486 ± 017      1028 ± 099
LD9C    484 ± 166          407 ± 023              390 ± 039      318 ± 061
LD10C   189 ± 036          440 ± 018              416 ± 017      658 ± 101
LD11C   387 ± 067          861 ± 057              406 ± 020      762 ± 095
Discussion
•  Can compile star-shaped input queries into more
complex target queries
•  Overhead is mostly a constant cost (JIT overhead is roughly
400 to 500 ms for most queries in Results II)
•  Proves to be mostly efficient when it is also effective (i.e.
when there is query expansion)
•  Still cannot substitute query federation optimisation
strategies
•  Manageability? We knew exactly how to proceed…
–  However, we worked with ~ |A| · |MS| + |MT| microcompilers
and query skeletons, where it could have been up to
|MT + A| · |MS| + |MT|
Future work
•  Optimisations to abate JIT overhead
•  Application to chain-shaped queries and other
query types
•  Investigate other target languages
•  Investigate templating languages for query
skeletons
•  Cascaded mappings applied at query time (no
knowledge of dataset content or structure)
Thank You

Contenu connexe

Tendances

Functional programming basics
Functional programming basicsFunctional programming basics
Functional programming basicsopenbala
 
手把手教你 R 語言分析實務
手把手教你 R 語言分析實務手把手教你 R 語言分析實務
手把手教你 R 語言分析實務Helen Chang
 
Presentation R basic teaching module
Presentation R basic teaching modulePresentation R basic teaching module
Presentation R basic teaching moduleSander Timmer
 
PCA11: Python for Product Managers
PCA11: Python for Product ManagersPCA11: Python for Product Managers
PCA11: Python for Product ManagersDavid Heller
 
Introduction to Gremlin
Introduction to GremlinIntroduction to Gremlin
Introduction to GremlinMax De Marzi
 
EuroPython 2015 - Big Data with Python and Hadoop
EuroPython 2015 - Big Data with Python and HadoopEuroPython 2015 - Big Data with Python and Hadoop
EuroPython 2015 - Big Data with Python and HadoopMax Tepkeev
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsPlanetData Network of Excellence
 
Data Modeling, Normalization and Denormalization | Nordic PGDay 2018 | Dimitr...
Data Modeling, Normalization and Denormalization | Nordic PGDay 2018 | Dimitr...Data Modeling, Normalization and Denormalization | Nordic PGDay 2018 | Dimitr...
Data Modeling, Normalization and Denormalization | Nordic PGDay 2018 | Dimitr...Citus Data
 
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...Citus Data
 
Introduction to pandas
Introduction to pandasIntroduction to pandas
Introduction to pandasPiyush rai
 
Visualization of Supervised Learning with {arules} + {arulesViz}
Visualization of Supervised Learning with {arules} + {arulesViz}Visualization of Supervised Learning with {arules} + {arulesViz}
Visualization of Supervised Learning with {arules} + {arulesViz}Takashi J OZAKI
 
Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2Charles Givre
 
Session 15 - Collections - Array List
Session 15 - Collections - Array ListSession 15 - Collections - Array List
Session 15 - Collections - Array ListPawanMM
 
What's new in Hivemall v0.5.0
What's new in Hivemall v0.5.0What's new in Hivemall v0.5.0
What's new in Hivemall v0.5.0Makoto Yui
 
RDataMining slides-network-analysis-with-r
RDataMining slides-network-analysis-with-rRDataMining slides-network-analysis-with-r
RDataMining slides-network-analysis-with-rYanchang Zhao
 
Advanced WhizzML Workflows
Advanced WhizzML WorkflowsAdvanced WhizzML Workflows
Advanced WhizzML WorkflowsBigML, Inc
 
Intermediate WhizzML Workflows
Intermediate WhizzML WorkflowsIntermediate WhizzML Workflows
Intermediate WhizzML WorkflowsBigML, Inc
 

Tendances (20)

Functional programming basics
Functional programming basicsFunctional programming basics
Functional programming basics
 
手把手教你 R 語言分析實務
手把手教你 R 語言分析實務手把手教你 R 語言分析實務
手把手教你 R 語言分析實務
 
Linked lists
Linked listsLinked lists
Linked lists
 
Functional Scala 2020
Functional Scala 2020Functional Scala 2020
Functional Scala 2020
 
Presentation R basic teaching module
Presentation R basic teaching modulePresentation R basic teaching module
Presentation R basic teaching module
 
PCA11: Python for Product Managers
PCA11: Python for Product ManagersPCA11: Python for Product Managers
PCA11: Python for Product Managers
 
Introduction to Gremlin
Introduction to GremlinIntroduction to Gremlin
Introduction to Gremlin
 
EuroPython 2015 - Big Data with Python and Hadoop
EuroPython 2015 - Big Data with Python and HadoopEuroPython 2015 - Big Data with Python and Hadoop
EuroPython 2015 - Big Data with Python and Hadoop
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
 
Data Modeling, Normalization and Denormalization | Nordic PGDay 2018 | Dimitr...
Data Modeling, Normalization and Denormalization | Nordic PGDay 2018 | Dimitr...Data Modeling, Normalization and Denormalization | Nordic PGDay 2018 | Dimitr...
Data Modeling, Normalization and Denormalization | Nordic PGDay 2018 | Dimitr...
 
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
 
Mongo indexes
Mongo indexesMongo indexes
Mongo indexes
 
Introduction to pandas
Introduction to pandasIntroduction to pandas
Introduction to pandas
 
Visualization of Supervised Learning with {arules} + {arulesViz}
Visualization of Supervised Learning with {arules} + {arulesViz}Visualization of Supervised Learning with {arules} + {arulesViz}
Visualization of Supervised Learning with {arules} + {arulesViz}
 
Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2Data Exploration with Apache Drill: Day 2
Data Exploration with Apache Drill: Day 2
 
Session 15 - Collections - Array List
Session 15 - Collections - Array ListSession 15 - Collections - Array List
Session 15 - Collections - Array List
 
What's new in Hivemall v0.5.0
What's new in Hivemall v0.5.0What's new in Hivemall v0.5.0
What's new in Hivemall v0.5.0
 
RDataMining slides-network-analysis-with-r
RDataMining slides-network-analysis-with-rRDataMining slides-network-analysis-with-r
RDataMining slides-network-analysis-with-r
 
Advanced WhizzML Workflows
Advanced WhizzML WorkflowsAdvanced WhizzML Workflows
Advanced WhizzML Workflows
 
Intermediate WhizzML Workflows
Intermediate WhizzML WorkflowsIntermediate WhizzML Workflows
Intermediate WhizzML Workflows
 

Similaire à Virtual data integration with just-in-time query recompilation

SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail ScienceSQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail ScienceUniversity of Washington
 
Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19ngamou
 
Building data fusion surrogate models for spacecraft aerodynamic problems wit...
Building data fusion surrogate models for spacecraft aerodynamic problems wit...Building data fusion surrogate models for spacecraft aerodynamic problems wit...
Building data fusion surrogate models for spacecraft aerodynamic problems wit...Shinwoo Jang
 
Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL Databases Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL Databases MongoDB
 
Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL DatabasesReal-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL DatabasesEugene Dvorkin
 
Semantic Search and Result Presentation with Entity Cards
Semantic Search and Result Presentation with Entity CardsSemantic Search and Result Presentation with Entity Cards
Semantic Search and Result Presentation with Entity CardsFaegheh Hasibi
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for GraphsJean Ihm
 
Let's build a parser!
Let's build a parser!Let's build a parser!
Let's build a parser!Boy Baukema
 
Computer notes - data structures
Computer notes - data structuresComputer notes - data structures
Computer notes - data structuresecomputernotes
 
Kotlin+MicroProfile: Enseñando trucos de 20 años a un nuevo lenguaje
Kotlin+MicroProfile: Enseñando trucos de 20 años a un nuevo lenguajeKotlin+MicroProfile: Enseñando trucos de 20 años a un nuevo lenguaje
Kotlin+MicroProfile: Enseñando trucos de 20 años a un nuevo lenguajeVíctor Leonel Orozco López
 
Relaxing global-as-view in mediated data integration from linked data
Relaxing global-as-view in mediated data integration from linked dataRelaxing global-as-view in mediated data integration from linked data
Relaxing global-as-view in mediated data integration from linked dataAlessandro Adamou
 
Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...
Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...
Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...Matthew J Collins
 
computer notes - Data Structures - 1
computer notes - Data Structures - 1computer notes - Data Structures - 1
computer notes - Data Structures - 1ecomputernotes
 
PyCon Ukraine 2016: Maintaining a high load Python project for newcomers
PyCon Ukraine 2016: Maintaining a high load Python project for newcomersPyCon Ukraine 2016: Maintaining a high load Python project for newcomers
PyCon Ukraine 2016: Maintaining a high load Python project for newcomersViach Kakovskyi
 
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Julian Hyde
 
Querying your database in natural language by Daniel Moisset PyData SV 2014
Querying your database in natural language by Daniel Moisset PyData SV 2014Querying your database in natural language by Daniel Moisset PyData SV 2014
Querying your database in natural language by Daniel Moisset PyData SV 2014PyData
 
Visualize open data with Plone - eea.daviz PLOG 2013
Visualize open data with Plone - eea.daviz PLOG 2013Visualize open data with Plone - eea.daviz PLOG 2013
Visualize open data with Plone - eea.daviz PLOG 2013Antonio De Marinis
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineeringJulian Hyde
 

Similaire à Virtual data integration with just-in-time query recompilation (20)

SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail ScienceSQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
 
Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19Collective entity linking with WSRM DocEng'19
Collective entity linking with WSRM DocEng'19
 
Building data fusion surrogate models for spacecraft aerodynamic problems wit...
Building data fusion surrogate models for spacecraft aerodynamic problems wit...Building data fusion surrogate models for spacecraft aerodynamic problems wit...
Building data fusion surrogate models for spacecraft aerodynamic problems wit...
 
Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL Databases Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL Databases
 
Real-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL DatabasesReal-Time Integration Between MongoDB and SQL Databases
Real-Time Integration Between MongoDB and SQL Databases
 
Semantic Search and Result Presentation with Entity Cards
Semantic Search and Result Presentation with Entity CardsSemantic Search and Result Presentation with Entity Cards
Semantic Search and Result Presentation with Entity Cards
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
 
Let's build a parser!
Let's build a parser!Let's build a parser!
Let's build a parser!
 
Computer notes - data structures
Computer notes - data structuresComputer notes - data structures
Computer notes - data structures
 
Kotlin+MicroProfile: Enseñando trucos de 20 años a un nuevo lenguaje
Kotlin+MicroProfile: Enseñando trucos de 20 años a un nuevo lenguajeKotlin+MicroProfile: Enseñando trucos de 20 años a un nuevo lenguaje
Kotlin+MicroProfile: Enseñando trucos de 20 años a un nuevo lenguaje
 
Relaxing global-as-view in mediated data integration from linked data
Relaxing global-as-view in mediated data integration from linked dataRelaxing global-as-view in mediated data integration from linked data
Relaxing global-as-view in mediated data integration from linked data
 
Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...
Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...
Mining Whole Museum Collections Datasets for Expanding Understanding of Colle...
 
computer notes - Data Structures - 1
computer notes - Data Structures - 1computer notes - Data Structures - 1
computer notes - Data Structures - 1
 
PyCon Ukraine 2016: Maintaining a high load Python project for newcomers
PyCon Ukraine 2016: Maintaining a high load Python project for newcomersPyCon Ukraine 2016: Maintaining a high load Python project for newcomers
PyCon Ukraine 2016: Maintaining a high load Python project for newcomers
 
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
Planning with Polyalgebra: Bringing Together Relational, Complex and Machine ...
 
Polyalgebra
PolyalgebraPolyalgebra
Polyalgebra
 
Querying your database in natural language by Daniel Moisset PyData SV 2014
Querying your database in natural language by Daniel Moisset PyData SV 2014Querying your database in natural language by Daniel Moisset PyData SV 2014
Querying your database in natural language by Daniel Moisset PyData SV 2014
 
Quepy
QuepyQuepy
Quepy
 
Visualize open data with Plone - eea.daviz PLOG 2013
Visualize open data with Plone - eea.daviz PLOG 2013Visualize open data with Plone - eea.daviz PLOG 2013
Visualize open data with Plone - eea.daviz PLOG 2013
 
Tactical data engineering
Tactical data engineeringTactical data engineering
Tactical data engineering
 

Plus de semanticsconference

Linear books to open world adventure
Linear books to open world adventureLinear books to open world adventure
Linear books to open world adventuresemanticsconference
 
Session 1.2 high-precision, context-free entity linking exploiting unambigu...
Session 1.2   high-precision, context-free entity linking exploiting unambigu...Session 1.2   high-precision, context-free entity linking exploiting unambigu...
Session 1.2 high-precision, context-free entity linking exploiting unambigu...semanticsconference
 
Session 4.3 semantic annotation for enhancing collaborative ideation
Session 4.3   semantic annotation for enhancing collaborative ideationSession 4.3   semantic annotation for enhancing collaborative ideation
Session 4.3 semantic annotation for enhancing collaborative ideationsemanticsconference
 
Session 1.1 dalicc - data licenses clearance center
Session 1.1   dalicc - data licenses clearance centerSession 1.1   dalicc - data licenses clearance center
Session 1.1 dalicc - data licenses clearance centersemanticsconference
 
Session 1.3 context information management across smart city knowledge domains
Session 1.3   context information management across smart city knowledge domainsSession 1.3   context information management across smart city knowledge domains
Session 1.3 context information management across smart city knowledge domainssemanticsconference
 
Session 0.0 aussenac semanticsnl-pwebsem2017-v4
Session 0.0   aussenac semanticsnl-pwebsem2017-v4Session 0.0   aussenac semanticsnl-pwebsem2017-v4
Session 0.0 aussenac semanticsnl-pwebsem2017-v4semanticsconference
 
Session 0.0 keynote sandeep sacheti - final hi res
Session 0.0   keynote sandeep sacheti - final hi resSession 0.0   keynote sandeep sacheti - final hi res
Session 0.0 keynote sandeep sacheti - final hi ressemanticsconference
 
Session 1.1 linked data applied: a field report from the netherlands
Session 1.1   linked data applied: a field report from the netherlandsSession 1.1   linked data applied: a field report from the netherlands
Session 1.1 linked data applied: a field report from the netherlandssemanticsconference
 
Session 1.2 enrich your knowledge graphs: linked data integration with pool...
Session 1.2   enrich your knowledge graphs: linked data integration with pool...Session 1.2   enrich your knowledge graphs: linked data integration with pool...
Session 1.2 enrich your knowledge graphs: linked data integration with pool...semanticsconference
 
Session 1.4 connecting information from legislation and datasets using a ca...
Session 1.4   connecting information from legislation and datasets using a ca...Session 1.4   connecting information from legislation and datasets using a ca...
Session 1.4 connecting information from legislation and datasets using a ca...semanticsconference
 
Session 1.4 a distributed network of heritage information
Session 1.4   a distributed network of heritage informationSession 1.4   a distributed network of heritage information
Session 1.4 a distributed network of heritage informationsemanticsconference
 
Session 0.0 media panel - matthias priem - gtuo - semantics 2017
Session 0.0   media panel - matthias priem - gtuo - semantics 2017Session 0.0   media panel - matthias priem - gtuo - semantics 2017
Session 0.0 media panel - matthias priem - gtuo - semantics 2017semanticsconference
 
Session 1.3 semantic asset management in the dutch rail engineering and con...
Session 1.3   semantic asset management in the dutch rail engineering and con...Session 1.3   semantic asset management in the dutch rail engineering and con...
Session 1.3 semantic asset management in the dutch rail engineering and con...semanticsconference
 
Session 1.3 energy, smart homes &amp; smart grids: towards interoperability...
Session 1.3   energy, smart homes &amp; smart grids: towards interoperability...Session 1.3   energy, smart homes &amp; smart grids: towards interoperability...
Session 1.3 energy, smart homes &amp; smart grids: towards interoperability...semanticsconference
 
Session 1.2 improving access to digital content by semantic enrichment
Session 1.2   improving access to digital content by semantic enrichmentSession 1.2   improving access to digital content by semantic enrichment
Session 1.2 improving access to digital content by semantic enrichmentsemanticsconference
 
Session 2.3 semantics for safeguarding &amp; security – a police story
Session 2.3   semantics for safeguarding &amp; security – a police storySession 2.3   semantics for safeguarding &amp; security – a police story
Session 2.3 semantics for safeguarding &amp; security – a police storysemanticsconference
 
Session 2.5 semantic similarity based clustering of license excerpts for im...
Session 2.5   semantic similarity based clustering of license excerpts for im...Session 2.5   semantic similarity based clustering of license excerpts for im...
Session 2.5 semantic similarity based clustering of license excerpts for im...semanticsconference
 
Session 4.2 unleash the triple: leveraging a corporate discovery interface....
Session 4.2   unleash the triple: leveraging a corporate discovery interface....Session 4.2   unleash the triple: leveraging a corporate discovery interface....
Session 4.2 unleash the triple: leveraging a corporate discovery interface....semanticsconference
 
Session 1.6 slovak public metadata governance and management based on linke...
Session 1.6   slovak public metadata governance and management based on linke...Session 1.6   slovak public metadata governance and management based on linke...
Session 1.6 slovak public metadata governance and management based on linke...semanticsconference
 
Session 5.6 towards a semantic outlier detection framework in wireless sens...
Session 5.6   towards a semantic outlier detection framework in wireless sens...Session 5.6   towards a semantic outlier detection framework in wireless sens...
Session 5.6 towards a semantic outlier detection framework in wireless sens...semanticsconference
 

Virtual data integration with just-in-time query recompilation

  • 1. Supporting virtual integration of Linked Data with just-in-time query recompilation Amsterdam, The Netherlands, September 12 2017 Alessandro Adamou1, Mathieu d’Aquin2, Carlo Allocca13, Enrico Motta1 1 Knowledge Media Institute, The Open University, UK 2 Insight Centre for Data Analytics, NUI Galway, Ireland 3 now Samsung Inc.
  • 2. Outline •  Motivation •  Just-in-time query recompilation •  Implementation •  Experiments •  Perspectives
  • 3. Virtual data integration •  No ETL process •  Naturally keeps data up-to-date •  Unlike data federation, there is still a designated node •  Favours project economies relying on networking rather than storage space •  Serious performance issues! •  Generally considered less robust •  Acquiring momentum in industry, per 2016 Gartner report •  Maintenance??
  • 4. Pay-as-you-go integration 1.  Establish mappings between source schemas and global schema (go) 2.  Obtain feedback on mapping results, e.g. in terms of precision and recall (pay) 3.  Refine 1 Really “pays off” in virtual data integration N.W. Paton, K. Christodoulou, A. A. A. Fernandes, B. Parsia, C. Hedeler: Pay-as-you-go data integration for linked data: opportunities, challenges and architectures. SWIM 2012: 3
  • 5. Outline •  Motivation •  Just-in-time query recompilation •  Implementation •  Experiments •  Perspectives
  • 6. •  Target query language other than SPARQL – Conjunctive, star-shaped queries (good for Web APIs) {hostname}{/attribute/value}+? {attribute}{&attribute}+ e.g. http://example.org/api/type/actor/ name/clint_eastwood?filmwork
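The URL pattern on this slide can be read as a path of attribute/value constraints followed by a list of requested attributes. Below is a minimal sketch of how such a query might be split into those two parts. The function name `parseTargetQuery` and the `basePath` parameter are hypothetical and do not come from the slides, which only show `/api/` in the example URL.

```javascript
// Hypothetical helper (not from the slides): splits a target-language query
// into its constraints (attribute/value path segments) and the attributes
// requested after the '?'. basePath is an assumed configuration value.
function parseTargetQuery(url, basePath) {
  const parsed = new URL(url);
  const path = parsed.pathname;
  if (!path.startsWith(basePath)) {
    throw new Error('Query path does not start with ' + basePath);
  }
  const segments = path.slice(basePath.length).split('/').filter(s => s.length > 0);
  if (segments.length === 0 || segments.length % 2 !== 0) {
    throw new Error('Expected one or more attribute/value pairs');
  }
  const constraints = {};
  for (let i = 0; i < segments.length; i += 2) {
    constraints[decodeURIComponent(segments[i])] = decodeURIComponent(segments[i + 1]);
  }
  // parsed.search looks like "?filmwork" or "?party&webpage"; drop the '?'
  const requested = parsed.search.slice(1).split('&').filter(s => s.length > 0);
  return { constraints, requested };
}

const q = parseTargetQuery(
  'http://example.org/api/type/actor/name/clint_eastwood?filmwork', '/api/');
console.log(q.constraints); // { type: 'actor', name: 'clint_eastwood' }
console.log(q.requested);   // [ 'filmwork' ]
```

The constraints are what microcompilers would receive as attribute-value pairs. The requested attributes decide which parts of the target query need to be produced.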
  • 7. type/actor/name/clint_eastwood SELECT DISTINCT ?filmwork WHERE {
 { dbr:Clint_Eastwood a dbo:Actor ; ^(dbo:director|dbo:starring) ?filmwork } UNION { { ?x owl:sameAs dbr:Clint_Eastwood } UNION { ?x (movie:actor_name|movie:director_name) ?s FILTER (str(?s) = "Clint Eastwood")
 } . ?x foaf:made|^(movie:director|movie:actor) ?filmwork }} Equivalent SPARQL query for federated engine that supports DBpedia and LinkedMDB
  • 10. type/actor/name/clint_eastwood SELECT DISTINCT ?filmwork ?eq WHERE { VALUES(?x) { ( dbr:Clint_Eastwood ) ( dbr:Clint_Eastwood_(actor) ) } ?x a dbo:Actor ; ^(dbo:director|dbo:starring) ?filmwork . ?filmwork owl:sameAs|^owl:sameAs ?eq } Alternative approach: DBpedia
  • 11. type/actor/name/clint_eastwood SELECT DISTINCT ?filmwork ?eq WHERE { { VALUES(?x0) { ( dbr:Clint_Eastwood ) ( dbr:Clint_Eastwood_(actor) ) } ?x0 ^owl:sameAs ?x } UNION { ?x movie:actor_name "Clint Eastwood" } UNION { ?x movie:director_name "Clint Eastwood" } . { { ?x foaf:made ?filmwork } UNION { ?filmwork movie:director ?x } UNION { ?filmwork movie:actor ?x }
 } . ?filmwork owl:sameAs|^owl:sameAs ?eq } Alternative approach: LinkedMDB
  • 12. •  Encode integrator’s knowledge of a dataset schema into a set of primitives, which will serve as “compilation units”. •  Managing the compilation units of a query using two types of structure: – Microcompilers – Query skeletons (or templates)
  • 13. Microcompiler Let W be the set of all attribute-value pairs and Σ the alphabet of a language; a microcompiler is a function φ : ℘(W) → Σ∗ that transforms sets of attribute-value pairs into a sequence of symbols in that language.
  • 14. Microcompiler (JS ex.)
      mc_x_dbp = function (type, name) {
        var pref = 'http://dbpedia.org/resource/';
        // Capitalise the first letter of each underscore-separated word
        var idd = name.replace(/(^|_)[a-z]/g, function (f) { return f.toUpperCase(); });
        return 'VALUES(?x_dbp){ '
          + '( <' + pref + idd + '> )'
          + '( <' + pref + idd + '_(' + type + ')> )'
          + '} ?x_dbp';
      }
    type/actor/name/clint_eastwood
    VALUES(?x_dbp) { ( <http://dbpedia.org/resource/Clint_Eastwood> ) ( <http://dbpedia.org/resource/Clint_Eastwood_(actor)> ) } ?x_dbp
  • 15. Microcompiler (JS ex. II)
      mc_x_lmdb = function (type, name) {
        if (['actor', 'director'].indexOf(type) >= 0) {
          var sa = mc_x_dbp(type, name) + ' ^owl:sameAs ?x_lmdb';
          var nam = makename(name); // omitted for simplicity
          return '{ ' + sa + ' } UNION '
            + '{ ?x_lmdb movie:actor_name "' + nam + '" } UNION '
            + '{ ?x_lmdb movie:director_name "' + nam + '" }'
          //…
        }
      }
    type/actor/name/clint_eastwood
    { VALUES(?x_dbp) { ( <http://dbpedia.org/resource/Clint_Eastwood> ) ( <http://dbpedia.org/resource/Clint_Eastwood_(actor)> ) } ?x_dbp ^owl:sameAs ?x_lmdb } UNION { ?x_lmdb movie:actor_name "Clint Eastwood" } UNION { ?x_lmdb movie:director_name "Clint Eastwood" }
  • 16. Query skeleton A query skeleton, or query template, t is a member of (Σ∪C)∗, where C is an alphabet called the set of control symbols.
    Example I (DBpedia, {filmwork,name}): <[name]> ^(dbo:director|dbo:starring) ?[filmwork]?
    Example II (LinkedMDB, {filmwork,name}): { { <[name]> foaf:made ?[filmwork]? } UNION { ?[filmwork]? movie:director ?x_lmdb } UNION { ?[filmwork]? movie:actor ?x_lmdb } } . ?[filmwork]? owl:sameAs|^owl:sameAs ?eq
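The slides do not spell out how control symbols are resolved, so the following is only a sketch under an assumed reading: `<[attr]>` is replaced by the text that a microcompiler produced for `attr`, and `?[attr]?` becomes the SPARQL variable `?attr`. The function name `fillSkeleton` and the shape of the `compiled` object are hypothetical.

```javascript
// Assumed reading of the control symbols (not confirmed by the slides):
//   <[attr]>  -> text produced by a microcompiler for attr
//   ?[attr]?  -> the SPARQL variable ?attr
function fillSkeleton(skeleton, compiled) {
  return skeleton
    .replace(/<\[([A-Za-z_]+)\]>/g, function (match, attr) {
      if (!Object.prototype.hasOwnProperty.call(compiled, attr)) {
        throw new Error('No compiled fragment for attribute ' + attr);
      }
      return compiled[attr];
    })
    .replace(/\?\[([A-Za-z_]+)\]\?/g, function (match, attr) {
      return '?' + attr;
    });
}

// Example I skeleton, with a shortened microcompiler output for "name"
const skeleton = '<[name]> ^(dbo:director|dbo:starring) ?[filmwork]?';
const compiled = {
  name: 'VALUES(?x_dbp){ ( <http://dbpedia.org/resource/Clint_Eastwood> ) } ?x_dbp'
};
console.log(fillSkeleton(skeleton, compiled));
```

Under this reading, the microcompiler output and the skeleton concatenate into a fragment of the shape shown on the "Alternative approach: DBpedia" slide.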
  • 18. Compilation strategies •  A manifest is a pair of sets of microcompilers and query skeletons •  Grouping into manifests for: – (a) data sources; – (b) entity types •  Data source selection algorithm produces a set of datasource-query pairs by finding satisfiable query skeletons (on paper)
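As a rough illustration of data source selection, the sketch below treats a skeleton as satisfiable when every attribute it is annotated with appears in the input query. That criterion, the manifest layout (`source`, `skeletons`, `attributes`, `text`) and the function name `selectSources` are all assumptions made for illustration; the slides only state that the algorithm finds satisfiable skeletons.

```javascript
// Hypothetical manifest layout: one manifest per data source, each holding
// skeletons annotated with the attributes they need. "Satisfiable" is
// assumed to mean: all of those attributes occur in the query.
function selectSources(manifests, queryAttributes) {
  const available = new Set(queryAttributes);
  const pairs = [];
  for (const manifest of manifests) {
    for (const skel of manifest.skeletons) {
      if (skel.attributes.every(a => available.has(a))) {
        pairs.push({ source: manifest.source, skeleton: skel.text });
      }
    }
  }
  return pairs;
}

const manifests = [
  { source: 'DBpedia', skeletons: [
    { attributes: ['filmwork', 'name'],
      text: '<[name]> ^(dbo:director|dbo:starring) ?[filmwork]?' } ] },
  { source: 'LinkedMDB', skeletons: [
    { attributes: ['filmwork', 'name', 'year'], // made-up skeleton that will not match
      text: '...' } ] }
];
// type/actor/name/clint_eastwood?filmwork mentions type, name and filmwork
console.log(selectSources(manifests, ['type', 'name', 'filmwork']));
```

This sketch may return several skeletons for one source. Merging them into at most one query per data source, as the experiments describe, is left out.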
  • 19. Outline •  Motivation •  Just-in-time query recompilation •  Implementation •  Experiments •  Perspectives
  • 20. M. d'Aquin, A. Adamou, E. Daga, S. Liu, K. Thomas, E. Motta: Dealing with Diversity in a Smart-City Datahub. Semantics for Smarter Cities @ISWC 2014: 68-82 Big Data for Milton Keynes as a Smart City Entity-centric data API based on a simplified version of the language in this presentation •  https://datahub.mksmart.org •  https://github.com/mk-smart/entity-centric-api
  • 21. Implementation •  Reference open source implementation written in Java –  With support for SPARQL and HTTP dereferencing of RDF –  Includes JIT logic, custom experimental VDIS and HTTP API •  Accepts microcompilers in JavaScript •  Apache CouchDB map-reduce for atomically retrieving candidate compilation units
  • 22. Outline •  Motivation •  Just-in-time query recompilation •  Implementation •  Experiments •  Perspectives
  • 23. Experiments What is the price paid to turn a federated query engine into a virtual data integration system using JIT recompilation?
  • 24. Experiments 1.  Benchmark of FedBench1 queries translated into our target language 2.  Take a federated query engine (FedX)2 3.  Measure the time taken by FedX to execute the original FedBench SPARQL query –  On the live endpoints whenever possible 4.  Take each translated query and recompile it into one or more SPARQL queries (at most one per data source) –  Execute each query with FedX 5.  Measure for each: –  Increase in size of “correct” result set –  Recompilation overhead –  Overall turnaround time of queries 1 http://fedbench.fluidops.net 2 https://www.fluidops.com/en/company/knowledge/open_source
  • 25. Experiments Example: FedBench Cross-domain CD3 CD3 (original) SELECT ?pres ?party ?page WHERE { ?pres rdf:type dbpedia-owl:President . ?pres dbpedia-owl:nationality dbpedia:United_States . ?pres dbpedia-owl:party ?party . ?x nytimes:topicPage ?page . ?x owl:sameAs ?pres } CD3C: type/president/country/united_states?party&webpage
  • 26. Results I
      Query   Result set VDI boost   Notes
      FedBench Cross-Domain
      CD1C    m * 1.387
      CD2C    52 new results         Plain FedX yielded no results
      CD3C    67 new results         Plain FedX yielded no results, has SERVICE clause
      CD4C    m * 4480.0             Some microcompilers perform queries
      CD5C    m * 1.0                No increment from recompilation
      FedBench Life Sciences
      LS1C    m * 1.0                Query could not be expanded
      LS2C    m * 1.0                No increment from recompilation
      LS3C    70981 results          Plain FedX crashed
      FedBench Linked Data
      LD5C    m * 3.677
      LD9C    4 new results          Plain FedX yielded no results
      LD10C   m * 17.0
      LD11C   m * 1.65
  • 27. Results II
      Query   Time (ms) - FedX   Time (ms) - FedX+JIT   JIT overhead   Query TAT
      FedBench Cross-Domain
      CD1C    300 ± 050          420 ± 109              400 ± 020      800 ± 120
      CD2C    175 ± 005          475 ± 055              432 ± 009      1500 ± 123
      CD3C    158 ± 004          446 ± 076              408 ± 106      1067 ± 048
      CD4C    8835 ± 954         420 ± 100              787 ± 165      7480 ± 569
      CD5C    851 ± 319          519 ± 145              448 ± 031      548 ± 061
      FedBench Life Sciences
      LS1C    795 ± 371          892 ± 043              query could not be expanded
      LS2C    484 ± 166          420 ± 100              444 ± 061      370 ± 061
      LS3C    !ERROR             6653 ± 861             query could not be expanded
      FedBench Linked Data
      LD5C    795 ± 371          801 ± 078              486 ± 017      1028 ± 099
      LD9C    484 ± 166          407 ± 023              390 ± 039      318 ± 061
      LD10C   189 ± 036          440 ± 018              416 ± 017      658 ± 101
      LD11C   387 ± 067          861 ± 057              406 ± 020      762 ± 095
  • 28. Outline •  Motivation •  Just-in-time query recompilation •  Implementation •  Experiments •  Perspectives
  • 29. Discussion •  Can compile star-shaped input queries into more complex target queries •  Overhead is mostly a standard cost •  Proves to be mostly efficient when also effective (i.e. there is query expansion) •  Still cannot substitute query federation optimisation strategies •  Manageability? We knew exactly how to proceed… –  However, we worked with ~ |A| · |MS| + |MT| microcompilers and query skeletons, where it could have been up to |MT + A| · |MS| + |MT|
  • 30. Future work •  Optimisations to abate JIT overhead •  Application to chain-shaped queries and other query types •  Investigate other target languages •  Investigate templating languages for query skeletons •  Cascaded mappings applied at query time (no knowledge of dataset content or structure)
  • 31. Thank You Amsterdam, The Netherlands, September 12 2017 Alessandro Adamou1, Mathieu d’Aquin2, Carlo Allocca13, Enrico Motta1 1 Knowledge Media Institute, The Open University, UK 2 Insight Centre for Data Analytics, NUI Galway, Ireland 3 now Samsung Inc.