September 13th, 2018
COSTFED: COST-BASED QUERY
OPTIMIZATION FOR SPARQL
ENDPOINT FEDERATION
Muhammad Saleem, Alexander Potocki, Tommaso Soru,
Olaf Hartig, Axel-Cyrille Ngonga Ngomo
Semantics 2018, Vienna, Austria
WHAT IS COSTFED?
 Federated SPARQL query processing engine
 Federation over multiple SPARQL endpoints
 Index-assisted
 Join-aware source selection
 Cost-based query planner
COSTFED QUERY PROCESSING
[Figure: CostFed architecture. A query passes through four stages, supported by the CostFed index, and is answered from four SPARQL endpoints that each expose RDF data.]
 Parsing: rewrite the query and get the individual triple patterns
 Source selection: identify the capable sources for the individual triple patterns
 Federator optimizer: generate an optimized sub-query execution plan
 Integrator: execute the sub-queries and integrate the sub-query results
STATE-OF-THE-ART
 FedX
 SPLENDID
 ANAPSID
 ODYSSEY
 LHD
 LUSAIL
 MULDER
 SemaGrow
HOW IS COSTFED DIFFERENT?
 Skewed distribution of resources
 Construction of buckets
 Join-aware source selection using prefixes
 Effect of multi-valued predicates
 Cost-based query planning
SKEWED DISTRIBUTION OF RESOURCES AND CONSTRUCTION OF BUCKETS
 Bucket b0 (brown): store each resource along with its cardinality
 Bucket b1 (black): store each resource along with the avg. cardinality of all the resources in the bucket
 Bucket b2 (blue): only store the avg. cardinality of all the resources in the bucket
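A minimal sketch of the three-bucket summary described above. The slide does not say how the bucket boundaries are chosen, so this sketch assumes resources are ranked by cardinality and split at two hypothetical rank cut-offs; `build_buckets`, `b0_size`, and `b1_size` are illustrative names, not part of CostFed.

```python
from statistics import mean

def build_buckets(cardinalities, b0_size, b1_size):
    """Summarise resource cardinalities into three buckets.

    cardinalities: dict mapping resource -> number of triples it occurs in.
    b0_size, b1_size: hypothetical rank cut-offs (an assumption, not from the slides).
    """
    ranked = sorted(cardinalities.items(), key=lambda kv: kv[1], reverse=True)
    b0 = ranked[:b0_size]
    b1 = ranked[b0_size:b0_size + b1_size]
    b2 = ranked[b0_size + b1_size:]
    return {
        # First bucket: every resource keeps its own cardinality.
        "b0": dict(b0),
        # Second bucket: resources are listed, but share one average.
        "b1": {
            "resources": [r for r, _ in b1],
            "avg_cardinality": mean(c for _, c in b1) if b1 else 0,
        },
        # Third bucket: only the average is kept, no resources.
        "b2": {"avg_cardinality": mean(c for _, c in b2) if b2 else 0},
    }

# Made-up cardinalities with a skewed distribution.
example = {"r1": 900, "r2": 40, "r3": 38, "r4": 2, "r5": 1, "r6": 3}
index = build_buckets(example, b0_size=1, b1_size=2)
```

The design trades precision for size: the few high-cardinality resources are stored exactly, while the long tail is collapsed into a single average.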
USING PREFIXES AND TRIE DATA STRUCTURE
<wiwiss.fu-berlin.de/drugbank/resource/references/1002129>
<wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00201>
We used character-by-character insertion into the trie.
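A small sketch of character-by-character trie insertion, using the two URIs from the slide. How CostFed decides where a prefix ends is not stated here, so the `common_prefix` helper (which stops at the first branching point) is an assumption made for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # character -> TrieNode
        self.is_end = False  # True if an inserted string ends here

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, uri):
        """Insert a URI one character at a time."""
        node = self.root
        for ch in uri:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def common_prefix(self):
        """Return the longest prefix shared by every inserted string."""
        node, prefix = self.root, []
        # Follow the trie while there is exactly one way to continue.
        while len(node.children) == 1 and not node.is_end:
            ch, node = next(iter(node.children.items()))
            prefix.append(ch)
        return "".join(prefix)

trie = Trie()
trie.insert("wiwiss.fu-berlin.de/drugbank/resource/references/1002129")
trie.insert("wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00201")
shared = trie.common_prefix()
```

With these two inputs the paths diverge right after `resource/`, so a single shared prefix can stand in for both resources.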
COSTFED INDEX
[Figure: excerpt of a CostFed index with the following annotations]
 Predicate as capability
 Subject bucket b0: subject resources, each with its cardinality
 Subject bucket b1: subject resources along with their avg. cardinality
 Object bucket b0: object resources, each with its cardinality
 Object bucket b1: object resources along with their avg. cardinality
 Bucket b2: only store the avg. selectivity of all the subjects and objects
 Subject and object prefixes
JOIN-AWARE SOURCE SELECTION USING PREFIXES
Model SPARQL queries as directed hypergraphs
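The idea of pruning sources with prefixes can be illustrated with a deliberately simplified sketch. It does not model the hypergraph and is not the authors' algorithm: it only assumes that, for an object-subject join between two triple patterns, a source is worth keeping if at least one of its prefixes also appears on the other side of the join. The function name, the endpoint names, and the example prefixes are all hypothetical.

```python
def prune_sources(obj_prefixes, subj_prefixes):
    """Keep only sources that can contribute to an object-subject join.

    obj_prefixes:  source -> set of object prefixes for the first pattern
    subj_prefixes: source -> set of subject prefixes for the second pattern
    Prefixes are compared by exact equality, which is a simplification.
    """
    all_subj = set().union(*subj_prefixes.values())
    all_obj = set().union(*obj_prefixes.values())
    kept_first = {s for s, p in obj_prefixes.items() if p & all_subj}
    kept_second = {s for s, p in subj_prefixes.items() if p & all_obj}
    return kept_first, kept_second

# Hypothetical index data: ep1 and ep3 cannot produce joinable values.
tp1_objects = {"ep1": {"http://a.example/"}, "ep2": {"http://b.example/"}}
tp2_subjects = {"ep2": {"http://b.example/"}, "ep3": {"http://c.example/"}}
kept_tp1, kept_tp2 = prune_sources(tp1_objects, tp2_subjects)
```

Each triple pattern on its own has two capable sources, but only `ep2` survives once the join is taken into account.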
JOIN-AWARE SOURCE SELECTION ALGORITHM
EFFECT OF MULTIVALUED PREDICATES
 Cardinality(tp1) = 4
 Cardinality(tp2) = 2
 Cardinality(tp1 ⋈ tp2) = ?
 SemaGrow: Min(Cardinality(tp1), Cardinality(tp2)) = Min(4, 2) = 2
 CostFed: M(tp1) × M(tp2) × Min(Cardinality(tp1), Cardinality(tp2)) = (4/2) × (2/1) × Min(4, 2) = 2 × 2 × 2 = 8
 Actual cardinality: 6
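The arithmetic above can be replayed in a few lines. The values 4/2 and 2/1 for M are taken from the slide as given; the function names are illustrative, and no claim is made about how the denominators are obtained from the index.

```python
def semagrow_estimate(c1, c2):
    """Join estimate that takes the smaller input cardinality."""
    return min(c1, c2)

def costfed_estimate(c1, c2, m1, m2):
    """Join estimate scaled by the multivalued-predicate factors M."""
    return m1 * m2 * min(c1, c2)

c_tp1, c_tp2 = 4, 2          # cardinalities from the slide
m_tp1, m_tp2 = 4 / 2, 2 / 1  # M(tp1), M(tp2) from the slide
actual = 6

semagrow = semagrow_estimate(c_tp1, c_tp2)
costfed = costfed_estimate(c_tp1, c_tp2, m_tp1, m_tp2)
semagrow_error = abs(semagrow - actual)
costfed_error = abs(costfed - actual)
```

In this example one estimate is below the actual cardinality by 4 and the other is above it by 2, so accounting for multivalued predicates halves the absolute error.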
QUERY PLANNING
TRIPLE PATTERN CARDINALITY ESTIMATION
• T(p,D) is the total number of triples with predicate p in dataset D
• avgSS(p,D) and avgOS(p,D) are the average subject resp. object selectivities of p in D in the corresponding bucket
• tS(D) and tO(D) are the total numbers of distinct subjects resp. distinct objects in D
• tT(D) is the total number of triples in D
• R(tp) is the set of all relevant sources for tp
• b stands for bound
JOIN CARDINALITY ESTIMATION
M(B) is the average frequency of a multivalued predicate or BGP B. C(B) is the cardinality of B. j(s) and j(o) denote a join based on the subject resp. object of the triple pattern tp.
[Figure: join tree over B1 = tp1, B2 = tp2, B3, and B4 = tp3, with join (⋈) and projection (π) operators]
JOIN-COST ESTIMATION: HASH JOIN
 Cost of receiving the highest-cardinality BGP results, i.e. a triple pattern
 Cost of intersecting the results of both BGPs
 TC = 2: number of threads used
 CSQ = 100: cost of sending a SPARQL query
 CRT = 0.01: cost of receiving a single result tuple
 CHT = 0.0025: cost of intersecting a single result with another result
JOIN-COST ESTIMATION: BIND JOIN
 Cost of receiving the smallest-cardinality BGP results, i.e. a triple pattern
 Cost of binding and sending the bound results as SPARQL queries
 CSQ = 100: cost of sending a SPARQL query
 CRT = 0.01: cost of receiving a single result tuple
 BSZ = 20: binding block size
 CTC = 20: number of threads used
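The cost formulas themselves did not survive in this transcript, so the following is only a rough sketch that combines the cost components and constants named on the two slides. How the terms are actually combined, and how the thread counts enter, is not recoverable here; both functions are therefore assumptions for illustration, not the authors' formulas.

```python
import math

CSQ = 100     # cost of sending a SPARQL query (from the slides)
CRT = 0.01    # cost of receiving a single result tuple (from the slides)
CHT = 0.0025  # cost of intersecting a single result with another (from the slides)
BSZ = 20      # binding block size (from the slides)

def hash_join_cost(c_left, c_right):
    """ASSUMED form: one query per side, receive the larger side, intersect both."""
    receive = max(c_left, c_right) * CRT
    intersect = (c_left + c_right) * CHT
    return 2 * CSQ + receive + intersect

def bind_join_cost(c_left, c_right):
    """ASSUMED form: fetch the smaller side, then send it onward in blocks of BSZ."""
    small = min(c_left, c_right)
    receive = small * CRT
    bound_queries = math.ceil(small / BSZ)
    return CSQ + receive + bound_queries * CSQ

def cheaper_join(c_left, c_right):
    """Pick the join type with the lower estimated cost."""
    if bind_join_cost(c_left, c_right) < hash_join_cost(c_left, c_right):
        return "bind"
    return "hash"
```

Under these assumed formulas a bind join wins when one side is very small (few bound queries to send), while a hash join wins when both sides are large, which is the kind of trade-off a cost-based planner is meant to capture.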
JOIN ORDERING
EVALUATION SETUP
 Benchmarks
   FedBench (9 datasets)
   LargeRDFBench (13 datasets)
 Metrics
   Index compression ratio (1 - index size / total size)
   Index generation time
   Total number of triple-pattern-wise sources selected
   Number of ASK requests used during the source selection
   Source selection time
   Query execution time
 Federation engines
   FedX
   ANAPSID
   SPLENDID
   SemaGrow
   HiBISCuS
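The index compression ratio listed among the metrics is a one-line formula. A tiny sketch, with made-up sizes chosen only to show the scale of the metric (the sizes are not measurements from the evaluation):

```python
def index_compression_ratio(index_size, total_size):
    """Return 1 - index size / total size; both sizes in the same unit."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    return 1 - index_size / total_size

# Hypothetical sizes: an index that is 1/10,000 of the data it summarises.
ratio = index_compression_ratio(index_size=1, total_size=10_000)
percent = round(ratio * 100, 2)
```

A ratio close to 1 means the index is tiny relative to the datasets it describes.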
INDEX COMPRESSION RATIO
 CostFed 99.99 %
 HiBISCuS 99.99 %
 SPLENDID 99.99 %
 ANAPSID 99.99 %
 SemaGrow 99.99 %
INDEX GENERATION TIME
 CostFed 60 min
 HiBISCuS 41 min
 SPLENDID 110 min
 ANAPSID 5 min
 SemaGrow 110 min
RESULTS ON LARGERDFBENCH (SPARQL 1.0)
EVALUATION RESULTS: NUMBER OF SOURCES SELECTED
FedBench
 CostFed 70
 FedX 134
 HiBISCuS 76
 SPLENDID 134
 ANAPSID 80
 SemaGrow 134
LargeRDFBench
 CostFed 104
 FedX 199
 HiBISCuS 106
 SPLENDID 199
 ANAPSID 111
 SemaGrow 199
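To put these counts in relative terms, the short calculation below derives how many fewer sources an engine selects than a baseline. The counts are the ones listed above; the helper name `reduction_vs` is illustrative.

```python
sources_selected = {
    "FedBench": {"CostFed": 70, "FedX": 134, "HiBISCuS": 76,
                 "SPLENDID": 134, "ANAPSID": 80, "SemaGrow": 134},
    "LargeRDFBench": {"CostFed": 104, "FedX": 199, "HiBISCuS": 106,
                      "SPLENDID": 199, "ANAPSID": 111, "SemaGrow": 199},
}

def reduction_vs(baseline, engine, counts):
    """Fraction of the baseline's selected sources that the engine avoids."""
    return 1 - counts[engine] / counts[baseline]

# Percentage of sources CostFed avoids relative to FedX, per benchmark.
fedbench_pct = round(100 * reduction_vs("FedX", "CostFed", sources_selected["FedBench"]), 1)
largerdf_pct = round(100 * reduction_vs("FedX", "CostFed", sources_selected["LargeRDFBench"]), 1)
```

On both benchmarks the reduction relative to FedX comes out at a little under half.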
EVALUATION RESULTS: NUMBER OF ASK REQUESTS
FedBench
 CostFed 45
 FedX 549
 HiBISCuS 45
 SPLENDID 77
 ANAPSID 89
 SemaGrow 77
LargeRDFBench
 CostFed 0
 FedX 1196
 HiBISCuS 0
 SPLENDID 11
 ANAPSID 103
 SemaGrow 11
EVALUATION RESULTS: AVG. SOURCE SELECTION TIME
FedBench
 CostFed 1.7 ms
 FedX 3 ms (warm), 302 (cold)
 HiBISCuS 137 ms
 SPLENDID 46 ms
 ANAPSID 463 ms
 SemaGrow 46 ms
LargeRDFBench
 CostFed 1 ms
 FedX 500 ms (warm), 4 (cold)
 HiBISCuS 154.7 ms
 SPLENDID 7.8 ms
 ANAPSID 33.3 ms
 SemaGrow 7.8 ms
FEDBENCH QUERY RUNTIMES
[Figure: bar chart of runtime in msec (log scale, 1 to 1,000,000) for queries CD1 to CD7 and LS1 to LS7 plus the average, comparing FedX, SPLENDID, ANAPSID, SemaGrow, and CostFed]
• Ranked 1st in 11/14 queries
• 3 times faster than SemaGrow
• 17 times faster than FedX
• 28 times faster than ANAPSID
• 121 times faster than SPLENDID
LARGERDFBENCH QUERY RUNTIMES
[Figure: bar chart of runtime in msec (log scale, 1 to 10,000,000) for queries C1 to C10 plus the average, comparing FedX, SPLENDID, ANAPSID, SemaGrow, and CostFed]
• Ranked 1st in 8/10 queries
• 3 times faster than SemaGrow
• 2 times faster than FedX
• 1.20 times faster than ANAPSID
• 1.73 times faster than SPLENDID
• Missing bar indicates a
RESULTS ON LARGERDFBENCH (SPARQL 1.1)
• Ranked 1st in 12/14 queries
• 1.7 times faster than SemaGrow
• 2.71 times faster than FedX
• 7.34 times faster than ANAPSID
• SPLENDID does not support the SPARQL SERVICE clause
RESULTS ON LARGERDFBENCH (SPARQL 1.1)
• Ranked 1st in 6/9 queries
• 1.19 times faster than SemaGrow
• 1.13 times faster than FedX
• 1.19 times faster than ANAPSID
• SPLENDID does not support the SPARQL SERVICE clause
Thanks!
Live endpoint: http://costfed.aksw.org/
Code: https://github.com/dice-group/CostFed

TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
lifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptxlifi-technology with integration of IOT.pptx
lifi-technology with integration of IOT.pptx
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
Piping Basic stress analysis by engineering
Piping Basic stress analysis by engineeringPiping Basic stress analysis by engineering
Piping Basic stress analysis by engineering
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 
Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...Software and Systems Engineering Standards: Verification and Validation of Sy...
Software and Systems Engineering Standards: Verification and Validation of Sy...
 
Vishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsVishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documents
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
Solving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.pptSolving The Right Triangles PowerPoint 2.ppt
Solving The Right Triangles PowerPoint 2.ppt
 
Electronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdfElectronically Controlled suspensions system .pdf
Electronically Controlled suspensions system .pdf
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 

CostFed: Cost-Based Query Optimization for SPARQL Endpoint Federation

  • 1. COSTFED: COST-BASED QUERY OPTIMIZATION FOR SPARQL ENDPOINT FEDERATION. Muhammad Saleem, Alexander Potocki, Tommaso Soru, Olaf Hartig, Axel-Cyrille Ngonga Ngomo. Semantics 2018, Vienna, Austria, September 13th, 2018.
  • 2. WHAT IS COSTFED? A federated SPARQL query processing engine: federation over multiple SPARQL endpoints; index-assisted; join-aware source selection; cost-based query planner.
  • 3. COSTFED QUERY PROCESSING (architecture diagram). Components: Parsing, Source Selection, Federator, Optimizer, Integrator, Index. CostFed sits in front of Endpoints 1 to 4, each serving RDF data. Steps: rewrite the query and get individual triple patterns; identify capable sources against individual triple patterns; generate an optimized sub-query execution plan; execute sub-queries; integrate sub-query results.
  • 4. STATE-OF-THE-ART: FedX, SPLENDID, ANAPSID, ODYSSEY, LHD, LUSAIL, MULDER, SemaGrow. How is CostFed different?
  • 5. HOW COSTFED IS DIFFERENT? Skewed distribution of resources; construction of buckets; join-aware source selection using prefixes; effect of multi-valued predicates; cost-based query planning.
  • 6. SKEW DISTRIBUTION OF RESOURCES AND CONSTRUCTION OF BUCKETS. Bucket b0 (brown): store each resource along with its cardinality. Bucket b1 (black): store each resource along with the avg. cardinality of all the resources in the bucket. Bucket b2 (blue): only store the avg. cardinality of all the resources in the bucket.
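The three buckets on this slide can be read as a tiered lookup: exact cardinalities for b0, one shared average for the resources listed in b1, and a single average with no resource list for b2. The following is a minimal sketch of that reading, not CostFed's index code. The class `BucketedIndex`, its field names, and the sample numbers are hypothetical, and the rule that assigns a resource to a bucket is not given on the slide, so the assignment is taken as input.

```python
from dataclasses import dataclass, field


@dataclass
class BucketedIndex:
    """Hypothetical three-bucket cardinality summary for one predicate."""
    b0: dict = field(default_factory=dict)  # resource -> its own cardinality
    b1: set = field(default_factory=set)    # resources that share b1_avg
    b1_avg: float = 0.0                     # avg. cardinality over bucket b1
    b2_avg: float = 0.0                     # avg. cardinality over bucket b2

    def cardinality(self, resource: str) -> float:
        """Return the best available cardinality estimate for a resource."""
        if resource in self.b0:
            return self.b0[resource]        # stored individually
        if resource in self.b1:
            return self.b1_avg              # stored with a shared average
        return self.b2_avg                  # not stored: fall back to b2


# Illustrative values only (assumed, not taken from the slides).
index = BucketedIndex(
    b0={"ex:hub": 5000},
    b1={"ex:mid1", "ex:mid2"},
    b1_avg=40.0,
    b2_avg=1.5,
)
```

Under this reading, `index.cardinality("ex:hub")` uses the stored value, `index.cardinality("ex:mid1")` uses the b1 average, and any resource that is not listed falls back to the b2 average.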
  • 7. USING PREFIXES AND TRIE DATA STRUCTURE. Example resources: <wiwiss.fu-berlin.de/drugbank/resource/references/1002129> and <wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00201>. We used character-by-character insertion in the trie.
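Character-by-character trie insertion, as mentioned on this slide, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than CostFed's implementation: the classes `TrieNode` and `Trie` and the method `common_prefix` are hypothetical names, and the slide does not say how CostFed decides where to cut a prefix, so the sketch simply reports the longest prefix shared by all inserted strings.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # character -> TrieNode
        self.is_end = False  # True if an inserted string ends here


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, text: str) -> None:
        """Insert a string one character at a time."""
        node = self.root
        for ch in text:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def common_prefix(self) -> str:
        """Longest prefix shared by every inserted string."""
        prefix = []
        node = self.root
        # Follow the path while it does not branch and no string ends.
        while len(node.children) == 1 and not node.is_end:
            ch, node = next(iter(node.children.items()))
            prefix.append(ch)
        return "".join(prefix)


trie = Trie()
trie.insert("wiwiss.fu-berlin.de/drugbank/resource/references/1002129")
trie.insert("wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00201")
print(trie.common_prefix())  # wiwiss.fu-berlin.de/drugbank/resource/
```

The two example resources diverge right after `resource/`, which is where the trie first branches.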
  • 8. COSTFED INDEX (annotated index excerpt). Predicate as capability. Subjects: bucket b0 holds subject resources along with their cardinality; bucket b1 holds subject resources along with their avg. cardinality. Objects: bucket b0 holds object resources along with their cardinality; bucket b1 holds object resources along with their avg. cardinality. Bucket b2: only the avg. selectivity of all the subjects and objects is stored. Subject and object prefixes.
  • 9. JOIN-AWARE SOURCE SELECTION USING PREFIXES. Model SPARQL queries as directed hypergraphs.
  • 11. EFFECT OF MULTIVALUED PREDICATES. Cardinality(tp1) = 4, Cardinality(tp2) = 2, Cardinality(tp1 ⋈ tp2) = ? SemaGrow: Min(Cardinality(tp1), Cardinality(tp2)) = Min(4, 2) = 2. CostFed: M(tp1) × M(tp2) × Min(Cardinality(tp1), Cardinality(tp2)) = (4/2) × (2/1) × Min(4, 2) = 2 × 2 × 2 = 8. Actual cardinality: 6.
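The arithmetic on this slide can be replayed in a few lines. This is a sketch only: the function names are hypothetical, and reading M(tp) as "cardinality divided by the number of distinct join values" is an assumption inferred from the fractions (4/2) and (2/1) shown on the slide.

```python
def multiplicity(cardinality: int, distinct_join_values: int) -> float:
    """Assumed reading of M(tp): average number of triples per join value."""
    return cardinality / distinct_join_values


def min_estimate(card1: int, card2: int) -> int:
    """Join estimate the slide attributes to SemaGrow: the smaller cardinality."""
    return min(card1, card2)


def multivalued_estimate(card1: int, distinct1: int,
                         card2: int, distinct2: int) -> float:
    """Join estimate the slide attributes to CostFed: M(tp1) * M(tp2) * min."""
    return (multiplicity(card1, distinct1)
            * multiplicity(card2, distinct2)
            * min(card1, card2))


# Values from the slide: C(tp1) = 4, C(tp2) = 2, fractions 4/2 and 2/1.
print(min_estimate(4, 2))                # 2
print(multivalued_estimate(4, 2, 2, 1))  # 8.0
```

Against the actual cardinality of 6 reported on the slide, the minimum-based estimate is off by 4 and the multiplicity-based estimate by 2.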
  • 13. TRIPLE PATTERN CARDINALITY ESTIMATION. Notation: T(p,D) is the total number of triples with predicate p in dataset D; avgSS(p,D) and avgOS(p,D) are the average subject and object selectivities, respectively, of p in D in the corresponding bucket; tS(D) and tO(D) are the total numbers of distinct subjects and distinct objects, respectively, in D; tT(D) is the total number of triples in D; R(tp) is the set of all relevant sources for tp; b stands for bound.
  • 14. JOIN CARDINALITY ESTIMATION. M(B) is the average frequency of a multivalued predicate or BGP; C(B) is the cardinality of B; j(s) and j(o) denote a join on the subject and object, respectively, of the triple pattern tp. [Plan tree figure: nodes B1 = tp1, B2 = tp2, B3, B4 = tp3, with two joins (⋈) and a projection (π).]
  • 15. JOIN-COST ESTIMATION: HASH JOIN. Cost components: the cost of receiving the results of the highest-cardinality BGP (i.e. a triple pattern) and the cost of intersecting the results of both BGPs. Constants: TC = 2, number of threads used; CSQ = 100, cost of sending a SPARQL query; CRT = 0.01, cost of receiving a single result tuple; CHT = 0.0025, cost of intersecting a single result with another result.
  • 16. JOIN-COST ESTIMATION: BIND JOIN. Cost components: the cost of receiving the results of the smallest-cardinality BGP (i.e. a triple pattern) and the cost of binding and sending the bound results as SPARQL queries. Constants: CSQ = 100, cost of sending a SPARQL query; CRT = 0.01, cost of receiving a single result tuple; BSZ = 20, binding block size; CTC = 20, number of threads used.
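The cost formulas themselves did not survive in this transcript; only the cost components and constants did. The sketch below shows one way those pieces could be combined so that a planner can compare the two join types. It is an assumed composition, not the formula from the slides: the way the terms are added, the number of queries charged to each join, and the omission of the thread counts are all hypothetical choices, and the function names are invented for illustration.

```python
import math

# Constants as listed on the hash-join and bind-join slides.
CSQ = 100     # cost of sending a SPARQL query
CRT = 0.01    # cost of receiving a single result tuple
CHT = 0.0025  # cost of intersecting a single result with another result
BSZ = 20      # binding block size


def hash_join_cost(card1: int, card2: int) -> float:
    """Assumed: send both sub-queries, receive the larger result, intersect."""
    send = 2 * CSQ
    receive = max(card1, card2) * CRT
    intersect = (card1 + card2) * CHT
    return send + receive + intersect


def bind_join_cost(card1: int, card2: int) -> float:
    """Assumed: fetch the smaller side, then send its bindings in blocks."""
    smaller = min(card1, card2)
    fetch = CSQ + smaller * CRT
    bound_queries = math.ceil(smaller / BSZ) * CSQ
    return fetch + bound_queries


def cheaper_join(card1: int, card2: int) -> str:
    """Pick the join type with the lower estimated cost."""
    if hash_join_cost(card1, card2) <= bind_join_cost(card1, card2):
        return "hash"
    return "bind"
```

Under these assumptions a bind join wins when one side is very small, because few bound queries are sent (for example cardinalities 1000 and 10), while a hash join wins once the smaller side needs several binding blocks (for example 1000 and 50).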
  • 18. EVALUATION SETUP. Benchmarks: FedBench (9 datasets); LargeRDFBench (13 datasets). Metrics: index compression ratio (1 - index size / total size); index generation time; total number of triple-pattern-wise sources selected; number of ASK requests used during source selection; source selection time; query execution time. Federation engines: FedX, ANAPSID, SPLENDID, SemaGrow, HiBISCuS.
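The index compression ratio metric defined on this slide is a one-line computation. The helper below is a small sketch with a hypothetical function name, and the sizes in the example are made up for illustration rather than taken from the evaluation.

```python
def index_compression_ratio(index_size: float, total_size: float) -> float:
    """Compression ratio as defined on the slide: 1 - index size / total size."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    return 1 - index_size / total_size


# Illustrative sizes only (same unit for both arguments, e.g. MB).
ratio = index_compression_ratio(index_size=1, total_size=10_000)
print(f"{ratio:.2%}")  # 99.99%
```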
  • 19. INDEX COMPRESSION RATIO. CostFed 99.99 %; HiBISCuS 99.99 %; SPLENDID 99.99 %; ANAPSID 99.99 %; SemaGrow 99.99 %.
  • 20. INDEX GENERATION TIME. CostFed 60 min; HiBISCuS 41 min; SPLENDID 110 min; ANAPSID 5 min; SemaGrow 110 min.
  • 22. EVALUATION RESULTS: NUMBER OF SOURCES SELECTED. FedBench: CostFed 70; FedX 134; HiBISCuS 76; SPLENDID 134; ANAPSID 80; SemaGrow 134. LargeRDFBench: CostFed 104; FedX 199; HiBISCuS 106; SPLENDID 199; ANAPSID 111; SemaGrow 199.
  • 23. EVALUATION RESULTS: NUMBER OF ASK REQUESTS. FedBench: CostFed 45; FedX 549; HiBISCuS 45; SPLENDID 77; ANAPSID 89; SemaGrow 77. LargeRDFBench: CostFed 0; FedX 1196; HiBISCuS 0; SPLENDID 11; ANAPSID 103; SemaGrow 11.
  • 24. EVALUATION RESULTS: AVG. SOURCE SELECTION TIME. FedBench: CostFed 1.7 ms; FedX 3 ms (warm), 302 (cold); HiBISCuS 137 ms; SPLENDID 46 ms; ANAPSID 463 ms; SemaGrow 46 ms. LargeRDFBench: CostFed 1 ms; FedX 500 ms (warm), 4 (cold); HiBISCuS 154.7 ms; SPLENDID 7.8 ms; ANAPSID 33.3 ms; SemaGrow 7.8 ms.
  • 25. FEDBENCH QUERY RUNTIMES. [Bar chart: avg. runtime in msec (log scale) for queries CD1 to CD7 and LS1 to LS7; engines FedX, SPLENDID, ANAPSID, SemaGrow, CostFed.] Ranked 1st in 11/14 queries; 3 times faster than SemaGrow; 17 times faster than FedX; 28 times faster than ANAPSID; 121 times faster than SPLENDID.
  • 26. LARGERDFBENCH QUERY RUNTIMES. [Bar chart: avg. runtime in msec (log scale) for queries C1 to C10; engines FedX, SPLENDID, ANAPSID, SemaGrow, CostFed.] Ranked 1st in 8/10 queries; 3 times faster than SemaGrow; 2 times faster than FedX; 1.20 times faster than ANAPSID; 1.73 times faster than SPLENDID. Missing bar indicates a
  • 27. RESULTS ON LARGERDFBENCH (SPARQL 1.1). Ranked 1st in 12/14 queries; 1.7 times faster than SemaGrow; 2.71 times faster than FedX; 7.34 times faster than ANAPSID. SPLENDID does not support the SPARQL SERVICE clause.
  • 28. RESULTS ON LARGERDFBENCH (SPARQL 1.1). Ranked 1st in 6/9 queries; 1.19 times faster than SemaGrow; 1.13 times faster than FedX; 1.19 times faster than ANAPSID. SPLENDID does not support the SPARQL SERVICE clause.