2. Eventos Semantic Web Toolset
• We develop a wide range of semantic technology tools, a whole toolset, including:
– Natural language processing of documents
– Document clustering and classification
– Semantic storage
– A link discovery framework for the Web of Data
• One of them is “OntoQuad”, a native RDF database management system server for the Semantic Web
3. OntoQuad: native RDF Store Server for Semantic Web
OntoQuad is cross-platform and can be deployed on different devices:
• MS Windows x64 (developed on Windows 7)
• Unix/Linux x64 (tested on Linux CentOS 6.3)
• Mobile Android (Samsung Galaxy Note II, Google Nexus 7, etc.)
• Raspberry Pi Model B rev 2
Key features:
• Column-oriented storage; key-value index files implemented using B-trees
• Developed with the latest C++ standard (C++11)
• Supports triple (SPO) or quad (SPOC) configurations
• SPARQL 1.1 Query Language and Protocol, plus a Java (Jena) API
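The idea of serving lookups from sorted key-value indexes over quads can be illustrated with a small sketch. This is not OntoQuad's implementation: a sorted Python list stands in for a B-tree, and the class name `QuadIndex`, the choice of the two orderings (SPOC and POSC), and the sample data are all assumptions made for illustration.

```python
from bisect import bisect_left


class QuadIndex:
    """Toy quad store keeping the same quads under two sorted key orderings.

    A sorted list stands in for a B-tree (hypothetical sketch, not OntoQuad's layout).
    """

    # Each ordering maps key positions to quad positions (s=0, p=1, o=2, c=3).
    ORDERS = {"spoc": (0, 1, 2, 3), "posc": (1, 2, 0, 3)}

    def __init__(self, quads):
        self.indexes = {
            name: sorted(tuple(q[i] for i in perm) for q in quads)
            for name, perm in self.ORDERS.items()
        }

    def _scan(self, name, prefix):
        """Yield quads whose key in the given ordering starts with prefix."""
        keys = self.indexes[name]
        perm = self.ORDERS[name]
        start = bisect_left(keys, prefix)  # first key >= prefix
        for key in keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            quad = [None] * 4
            for pos, component in zip(perm, key):
                quad[pos] = component  # restore (s, p, o, c) order
            yield tuple(quad)

    def by_subject(self, s):
        return list(self._scan("spoc", (s,)))

    def by_predicate_object(self, p, o):
        return list(self._scan("posc", (p, o)))


store = QuadIndex([
    ("alice", "knows", "bob", "g1"),
    ("alice", "livesIn", "moscow", "g1"),
    ("bob", "livesIn", "moscow", "g2"),
])
```

With this layout, `store.by_subject("alice")` is a prefix scan on the SPOC ordering and `store.by_predicate_object("livesIn", "moscow")` is a prefix scan on the POSC ordering. A triple (SPO) configuration would simply drop the fourth component.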
4. OntoQuad RDF Store - benchmarking
Berlin SPARQL Benchmark (BSBM): Electronic commerce scenarios
For comparison, we ran the BSBM “Explore” use case against:
– Virtuoso 6.1.6
– Jena TDB (Fuseki 0.2.7)
– BigData (Release 1.2.2)
All systems were configured to use 22 GB of main memory.
The benchmark machine:
– Quad-core Intel i7-3770 CPU with 32 GB of RAM
– Storage: 2 x 2 TB 7200 rpm SATA hard drives, configured as software RAID 1
5. BSBM tests. QMpH for 10 and 100 million triples
Query Mixes per Hour (QMpH) for the 10 million triples dataset

Concurrent users     1         2         4         16
OntoQuad             28,847    50,846    79,700    103,212
Virtuoso             6,315     12,253    22,407    28,175
Jena TDB             12,963    22,954    34,599    14,132
BigData              5,866     10,972    19,153    30,857

Query Mixes per Hour (QMpH) for the 100 million triples dataset

Concurrent users     1         2         4         16
OntoQuad             8,605     15,814    27,009    31,454
Virtuoso             5,270     10,270    18,983    22,163
Jena TDB             2,466     3,578     5,839     2,855
BigData              2,432     4,046     5,430     6,151
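As a rough aid for reading these figures, the sketch below shows how a QMpH value follows from a timed run and how the single-user results on the 10 million triples dataset compare with one another. The function names are hypothetical, the QMpH formula reflects the usual BSBM meaning (complete query mixes executed per hour), and the sketch assumes that the first chart label, 28,847, is OntoQuad's single-user result.

```python
def qmph(query_mixes, elapsed_seconds):
    """Query mixes per hour from a timed run (hypothetical helper)."""
    return query_mixes * 3600.0 / elapsed_seconds


def relative_throughput(results, baseline):
    """Express every system's QMpH as a multiple of the baseline system's."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


# Single-user QMpH on the 10 million triples dataset, as reported on the slide.
qmph_10m_1user = {
    "OntoQuad": 28847,
    "Virtuoso": 6315,
    "Jena TDB": 12963,
    "BigData": 5866,
}

ratios = relative_throughput(qmph_10m_1user, "Virtuoso")
for name, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{name}: {ratio:.2f}x Virtuoso")
```

Picking Virtuoso as the baseline is arbitrary; any of the four systems can be passed as `baseline`.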
6. BSBM tests. QMpH for the 10 million triples dataset on Android
Query Mixes per Hour for the 10 million triples dataset, Android vs. Linux server
OntoQuad on Android: 3,177 query mixes per hour
Database size: 1.72 GB
Executable module size: 61 MB
Android Samsung Galaxy Note II:
• 16 GB storage, 2 GB RAM
• Quad-core 1.6 GHz Cortex-A9
The benchmark server (for Virtuoso, Jena, BigData):
• Quad-core Intel i7-3770 CPU with 32 GB of RAM
• Storage: 2 x 2 TB 7200 rpm SATA hard drives, configured as software RAID 1
• All systems were configured to use 22 GB of main memory
QMpH, 10 million triples, 1 concurrent user:

OntoQuad (Android)   3,177
Virtuoso             6,315
Jena TDB             12,963
BigData              5,866
7. RIA LOD datasets based on “OntoQuad”
RIA SPARQL endpoint: http://opendata.ria.ru/sparql. SPARQL examples:
• Object types with instance counts
select ?t ?o (count(?s) as ?number) WHERE {
  ?s a ?t .
  ?t ?p ?o .
  FILTER(lang(?o) = "ru")
} group by ?t ?o order by desc(?number)
• Objects with sameAs links to other LOD datasets
select ?l ?o WHERE {
  ?s <http://www.w3.org/2000/01/rdf-schema#label> ?l .
  ?s <http://www.w3.org/2002/07/owl#sameAs> ?o .
} order by ?o
• Persons born in the same city
prefix ria: <http://data.ria.ru#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select ?city ?name {
  ?loc a ria:Location .
  ?loc rdfs:label ?city .
  ?s a ria:Person .
  ?s ria:birthPlace ?loc .
  ?s rdfs:label ?name .
} order by ?city
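Queries like these can be sent to the endpoint over the SPARQL 1.1 Protocol. The sketch below only builds the HTTP GET request for the sameAs query and does not send it; the helper name `build_sparql_request` is hypothetical, and whether the endpoint is reachable or returns JSON results is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://opendata.ria.ru/sparql"

QUERY = """
select ?l ?o WHERE {
  ?s <http://www.w3.org/2000/01/rdf-schema#label> ?l .
  ?s <http://www.w3.org/2002/07/owl#sameAs> ?o .
} order by ?o
"""


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request (hypothetical helper).

    The query travels URL-encoded in the `query` parameter; the Accept
    header asks for the standard JSON results format.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url)
```

To actually execute the query, pass `request` to `urllib.request.urlopen` and parse the response body with `json.loads`, assuming the server honours the requested results format.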