LDBC Semantic Publishing Benchmark
Evolution
March 2015
LDBC Semantic Publishing Benchmark Evolution #1Mar 2015

Outline
• Motivation
• The New Reference Datasets
• Changes in the generated metadata
• Ontologies and Required Rule-sets
• The Concrete Inference patterns
• Changes in the Basic Interactive Query Mix
• GraphDB Configuration Disclosure
• Future plans

Change is Needed
SPB needed to evolve in order to:
• Allow for retrieval of semantically relevant content
  – Based on rich metadata descriptions and diverse reference knowledge
• Demonstrate that triplestores offer simplified and more efficient querying
And in this way to:
• Cover more advanced and pertinent usage of triplestores
• Present a bigger challenge to the engines

Directions for Improvement
• Background and issues with SPB 1.0:
  – “Reference data” = master data in Dynamic Semantic Publishing
    • Essentially, taxonomies and entity datasets used to describe “creative works”
  – SPB 1.0 has a very small set of reference data
  – Reference data is not interconnected: there are few relations between entities
  – All descriptions of content assets (articles, pictures, etc.) refer to the Reference Data, but not to one another
  – The result is a star-shaped graph with a small reference dataset in the middle
    • This way SPB does not use the full potential of graph databases
• Bigger and better connected Reference Data
  – More reference data and more connections between the entities
• A better connected dataset
  – Make cross-references between the different pieces of content

Directions for Improvement (2)
• Make use of inference
  – In SPB 1.0 it is trivial: a couple of subclasses and sub-properties
  – It would be relevant to have some transitive, symmetric and inverse properties
  – Such semantics can foster retrieval of relevant content
• More interesting and challenging query mixes
  – This can go in many different directions, but the most obvious one is to evolve the queries so that they benefit from the changes above
  – More broadly, there should be a way to make some of the queries nicer, i.e. simpler and cleaner
• Testing high-availability
  – FT acceptance tests for an HA cluster are a great starting point
  – Too ambitious for SPB 2.0

Extended Reference Dataset
• 22M explicit statements in total in SPB v.2.0
  – vs. 127k statements in SPB 1.0: BBC lists + tiny fractions of GeoNames
• Added reference data from DBpedia 2014
  – Extracts such as: DESCRIBE ?e WHERE { ?e a dbp-ont:Company }
  – Companies (85 000 entities)
  – Events (50 000 entities)
  – Persons (1M entities)
• GeoNames data for Europe
  – All European countries, without RUS, UKR, BLR; 650 000 locations in total
  – Some properties and location types irrelevant to SPB are excluded
• owl:sameAs links between DBpedia and GeoNames
  – 500k owl:sameAs mappings

Interconnected Reference Dataset
• Substantial volume of connections between entities
• GeoNames comes with a hierarchical relationship, gn:parentFeature, that defines the nesting of locations
• DBpedia relationships between entities*:

From \ To | To Company | To Person | To Place / gn:Feature | To Event
Company   |     40,797 |    26,675 |               218,636 |       18
Person    |     89,506 | 1,324,425 |             3,380,145 |  145,892
Event     |      5,114 |   154,207 |               140,579 |   35,442

* numbers shown in the table are approximate

RDF-Rank for the Reference Data
• GraphDB’s RDFRank feature is used to calculate a rank for all the entities in the Reference Data
  – RDFRank calculates a measure of importance for all URIs in an RDF graph, based on Google’s PageRank algorithm
• These RDFRanks are calculated using GraphDB after the entire Reference Data is loaded
  – This way all sorts of relationships in the graph are considered during the calculation, including those that were logically inferred
• RDFRanks are inserted using the predicate <http://www.ldbcouncil.org/spb#hasRDFRank>
• These ranks are available as a part of the Reference Data
  – No need to compute them again during loading

RDF-Rank Used for Popular Entities
• The Data Generator uses a set of “popular entities”
  – These are referenced in 30% of the content-to-entity relations/tags
  – This is one of the heuristics used by the data generator to produce more realistic data distributions
  – This is implemented in SPB 1.0, no change in SPB 2.0
• Popular entities are those with the top 5% RDF Rank
  – This is the new thing in SPB v.2.0
  – Before that, the popular entities were selected randomly
  – This way, we get a more realistic dataset where the entities that are more often used to tag content also have better connectivity in the Reference Data
• In the future, RDF-ranks can also be used for other purposes
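
The "top 5% RDF Rank" selection can be sketched as follows. This is a minimal illustration, not the SPB data generator's code: the function name, the tie-breaking rule, the "keep at least one" rule and the rank values are all assumptions made for the example.

```python
def popular_entities(ranks, top_percent=5):
    """Return the URIs whose rank falls in the top `top_percent` percent."""
    if not ranks:
        return []
    # Sort by rank, highest first; ties are broken by URI for determinism.
    ordered = sorted(ranks.items(), key=lambda item: (-item[1], item[0]))
    # Integer arithmetic avoids floating-point surprises; keep at least one.
    keep = max(1, len(ordered) * top_percent // 100)
    return [uri for uri, _ in ordered[:keep]]


# Hypothetical ranks for 40 entities: entity i gets rank i / 40.
ranks = {f"ex:entity{i}": i / 40 for i in range(40)}
popular = popular_entities(ranks)
print(popular)
```

With 40 entities, 5% corresponds to the two highest-ranked ones.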

Changes in the Metadata
• Generated three times more relationships between Creative Works and Entities than in SPB 1.0
  – More recent use cases in publishing adopt rich metadata descriptions with more than 10 references to relevant entities and concepts
  – In SPB 1.0 there are on average a bit less than 3 entity references per creative work, based on distributions from an old BBC archive
  – While developing SPB 2.0 we didn’t have access to a substantial amount of rich semantic metadata from a publisher’s archive that would allow us to derive content-to-concept frequency distributions
  – So, we decided to use the same distributions from the BBC, but to triple the concept references for each of them
  – In SPB 2.0 there are on average about 8 content-to-concept references
  – This results in about 25 statements per creative work description
  – All other heuristics and specifics of the metadata generator were preserved (popular entities, CW sequences resembling storylines, etc.)

Changes in the Metadata (2)
• GeoNames URIs are used for content-to-location references
  – As in SPB v.1.0, each creative work description refers (through the cwork:mentions property) to exactly one location
    • These references are exploited in queries with geo-spatial constraints
  – While most of the geo-spatial information in the Reference Data comes from GeoNames, about 500 thousand locations also have DBpedia URIs, which come from the DBpedia-to-GeoNames owl:sameAs mappings
  – Intentionally, DBpedia URIs are not used when metadata about creative works is generated
    • In contrast, the substitution parameters for locations in the corresponding queries use only DBpedia locations
    • This way the corresponding queries return no results if proper owl:sameAs expansion is not in place
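
The effect described above can be sketched in a few lines: metadata is tagged with a GeoNames URI, the query parameter is a DBpedia URI, and only owl:sameAs expansion connects the two. This is an illustrative sketch rather than how any particular engine implements it; the URIs and helper names are hypothetical placeholders.

```python
def same_as_closure(pairs):
    """Map every URI to its full owl:sameAs equivalence class."""
    classes = {}
    for a, b in pairs:
        # Merge the classes of both URIs (covers symmetry and transitivity).
        merged = classes.get(a, {a}) | classes.get(b, {b})
        for uri in merged:
            classes[uri] = merged
    return classes


def works_tagged_with(location, tags, equivalents):
    """Creative works tagged with the location or any equivalent URI."""
    targets = equivalents.get(location, {location})
    return sorted(cw for cw, loc in tags if loc in targets)


# Hypothetical data: the GeoNames URI and the creative work ID are placeholders.
same_as = [("dbpedia:Sofia", "geonames:sofia-placeholder")]
tags = [("cw:1", "geonames:sofia-placeholder")]

no_inference = works_tagged_with("dbpedia:Sofia", tags, {})
with_inference = works_tagged_with("dbpedia:Sofia", tags, same_as_closure(same_as))
print(no_inference, with_inference)
```

Without the equivalence classes the DBpedia parameter matches nothing; with them, the work tagged with the GeoNames URI is found.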

Changes in the Metadata - Stats
• As a result of changes in the size of the Reference Data and the metadata size, there are differences in the number of Creative Works included at the same scale factor in SPB v.1.0 and in SPB v.2.0

Metric                                         | SPB 1.0 | SPB 2.0
Reference Data size (explicit statements)      |    170k |     22M
Creative Work descr. size (explicit st./CW)    |      19 |      25
Metadata in 50M dataset (explicit statements)  |     50M |     28M
Creative Works in 50M datasets (count)         |    2.6M |    1.1M
Creative Work-to-Entity relationships in 50M   |      7M |      9M
Metadata in 1B dataset (explicit statements)   |      1B |    978M
Creative Works in 1B datasets (count)          |     53M |     39M
Creative Work-to-Entity relationships in 1B    |    137M |    313M
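
As a quick consistency check, the average number of entity references per creative work can be derived from the table by dividing relationships by creative works. The figures are the rounded values from the table, so the ratios are approximate; the variable names are only for this example.

```python
# (relationships, creative works) taken from the rounded table values.
stats = {
    "SPB 1.0, 50M": (7_000_000, 2_600_000),
    "SPB 2.0, 50M": (9_000_000, 1_100_000),
    "SPB 1.0, 1B": (137_000_000, 53_000_000),
    "SPB 2.0, 1B": (313_000_000, 39_000_000),
}

# Average entity references per creative work, rounded to one decimal.
ratios = {
    label: round(relationships / works, 1)
    for label, (relationships, works) in stats.items()
}

for label, ratio in ratios.items():
    print(f"{label}: {ratio} references per creative work")
```

The results (about 2.6 to 2.7 for SPB 1.0 and about 8.0 to 8.2 for SPB 2.0) agree with "a bit less than 3" and "about 8" stated for the two versions.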

The Rule Sets
• Complete inference in SPB 2.0 requires support for the following primitives:
  – RDFS: subPropertyOf, subClassOf
  – OWL: TransitiveProperty, SymmetricProperty, sameAs
• Any triplestore with OWL 2 RL reasoning support will be able to process SPB v.2.0 correctly
  – In fact, a much reduced rule-set is sufficient for complete reasoning, as OWL 2 RL has a host of primitives and rules that SPB 2.0 does not make use of. Examples are all onProperty restrictions, class and property equivalence, property chains, etc.
  – The popular (but not W3C standardized) OWL Horst profile is sufficient for reasoning with SPB v.2.0
    • OWL Horst refers to the pD* entailment defined by Herman ter Horst in: Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary. J. Web Sem. 3(2-3): 79-115 (2005)

New and Updated Ontologies
• Modified versions of the BBC Ontologies
  – As in SPB v.1.0, the BBC Core ontology defines relationships between Persons, Organizations, Locations and Events
  – Those were mapped to the corresponding DBpedia and GeoNames classes and relationships
    • bbccore:Thing is defined as a super-class of dbp-ont:Company, dbp-ont:Event, foaf:Person, geonames-ontology:Feature (geographic feature)
  – dbp-ont:parentCompany is defined to be an owl:TransitiveProperty
  – These extra definitions are added at the end of the BBC Core ontology
• Added an SPB-modified version of the GeoNames ontology
  – Stripped out unnecessary pieces such as onProperty restrictions that impose cardinality constraints

Concrete Inference Patterns
• Transitive closure of location-nesting relationships
  – The geo-ont:parentFeature property is used in GeoNames to link each location to the larger locations that it is a part of
  – geo-ont:parentFeature is an owl:TransitiveProperty in the GeoNames ontology
  – This way, if Munich has geo-ont:parentFeature Bavaria, which in turn has geo-ont:parentFeature Germany, then reasoning should make sure that Germany also appears as geo-ont:parentFeature of Munich
• Transitive closure of dbp-ont:parentCompany
  – Inference unveils indirect company control relationships
• owl:sameAs relations between GeoNames features and DBpedia URIs for the same locations:
  – The standard semantics of owl:sameAs requires that each statement asserted using the GeoNames URIs should also be inferable/retrievable with the mapped DBpedia URIs
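
The Munich example can be reproduced with a naive transitive closure. This is a minimal sketch of the inference pattern only, assuming plain string labels instead of URIs; real engines use far more efficient strategies than this fixed-point loop.

```python
def transitive_closure(edges):
    """Return all (x, z) pairs implied by chaining (x, y) and (y, z)."""
    closure = set(edges)
    changed = True
    # Repeat until a full pass adds no new pair (fixed point).
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Explicit parentFeature statements from the example above.
explicit = {("Munich", "Bavaria"), ("Bavaria", "Germany")}
inferred = transitive_closure(explicit) - explicit
print(inferred)
```

The only inferred pair is Munich with parent feature Germany, which is exactly the statement the reasoner is expected to add.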

Concrete Inference Patterns (2)
• The inference patterns from SPB v.1.0 are still there:
  – cwork:tag statements are inferred from each of its sub-properties, cwork:about and cwork:mentions
  – < ?cw rdf:type cwork:CreativeWork > statements are inferred for instances of each of its sub-classes
  – These two are the only reasoning patterns that apply to the generated content metadata; all the others apply only to the Reference Data
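
The two patterns above can be sketched as a single materialization pass over the triples. This is an illustration under assumptions: cwork:NewsItem is used as an example sub-class, the entity and creative work identifiers are placeholders, and one pass is enough only because the hierarchies in this sketch are one level deep.

```python
# Sub-property and sub-class axioms used in this sketch.
SUB_PROPERTY_OF = {"cwork:about": "cwork:tag", "cwork:mentions": "cwork:tag"}
SUB_CLASS_OF = {"cwork:NewsItem": "cwork:CreativeWork"}  # assumed example sub-class


def materialize(triples):
    """Add the statements implied by the sub-property and sub-class axioms."""
    result = set(triples)
    for s, p, o in triples:
        if p in SUB_PROPERTY_OF:
            # rdfs:subPropertyOf: (s p o) implies (s super-property o).
            result.add((s, SUB_PROPERTY_OF[p], o))
        if p == "rdf:type" and o in SUB_CLASS_OF:
            # rdfs:subClassOf: (s type C) implies (s type super-class).
            result.add((s, "rdf:type", SUB_CLASS_OF[o]))
    return result


# Hypothetical description of one creative work.
explicit = {
    ("cw:1", "rdf:type", "cwork:NewsItem"),
    ("cw:1", "cwork:about", "ex:entityA"),
    ("cw:1", "cwork:mentions", "ex:entityB"),
}
inferred = materialize(explicit) - explicit
print(sorted(inferred))
```

A query that asks for cwork:tag or for instances of cwork:CreativeWork matches only the inferred statements here, which is why such queries return nothing when these patterns are not supported.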

Inference Statistics and Ratios
• If brute-force materialization is used, the Reference Data expands from 22M statements to about 100M
  – If owl:sameAs expansion is not considered, materialization on top of the Reference Data adds about 7M statements, most of them coming from the transitive closure of geo-ont:parentFeature
  – Several triplestores that use forward-chaining (e.g. Oracle and GraphDB) have specific mechanisms that allow them to handle owl:sameAs reasoning in a hybrid manner, without expanding their indices. After materialization, such engines have to deal with 29M statements of Reference Data instead of 100M
• Brute-force materialization of the generated metadata describing creative works would double it
  – In SPB v.1.0 the expansion factor was 1.6, but now there are more entity references and also owl:sameAs equivalents of the location URIs

Modified Queries in Basic Interactive
• SPB 2.0 changes only the Basic Interactive Query-Mix
  – The other query mixes are mostly unchanged
• In most of the queries, the cwork:about and cwork:mentions properties that link a creative work to an entity have been replaced with their super-property cwork:tag
  – This also applies to the Advanced Interactive and the Analytical use cases
  – For most of the queries, both types of relations are equally relevant
  – Using the super-property makes the query simpler compared to other ways of writing a query that uses both (e.g. UNION)

New Queries
• The Basic Interactive query-mix now adds 2 new queries exploring the interconnectedness in the reference data
• Q10: Retrieve CWs that mention locations in the same province (A.ADM1) as the specified one
  – There is an additional constraint on the time interval (5 days)
• Q11: Retrieve the most recent CWs that are tagged with entities related to a specific popular entity
  – Relations can be inbound or outbound, explicit or inferred

Q10: News from the region
# Prefix IRIs are assumed here (standard GeoNames and BBC Creative Work namespaces); the slide omits them.
PREFIX geo-ont: <http://www.geonames.org/ontology#>
PREFIX cwork: <http://www.bbc.co.uk/ontologies/creativework/>

SELECT ?cw ?title ?dateModified {
  <http://dbpedia.org/resource/Sofia> geo-ont:parentFeature ?province .
  ?province geo-ont:featureCode geo-ont:A.ADM1 .
  {
    ?location geo-ont:parentFeature ?province .
  } UNION {
    BIND(?province as ?location) .
  }
  ?cw a cwork:CreativeWork ;
      cwork:tag ?location ;
      cwork:title ?title ;
      cwork:dateModified ?dateModified .
  FILTER(?dateModified >=
           "2011-05-14T00:00:00.000"^^<http://www.w3.org/2001/XMLSchema#dateTime>
         && ?dateModified <
           "2011-05-19T23:59:59.999"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
}
LIMIT 100

Q11: News about related entities
# Prefix IRIs are assumed here (standard BBC ontology namespaces); the slide omits them.
PREFIX cwork: <http://www.bbc.co.uk/ontologies/creativework/>
PREFIX core: <http://www.bbc.co.uk/ontologies/coreconcepts/>
PREFIX bbc: <http://www.bbc.co.uk/ontologies/bbc/>

SELECT DISTINCT ?cw ?title ?description ?dateModified ?primaryContent
{
  {
    <http://dbpedia.org/resource/Teresa_Fedor> ?p ?e .
  }
  UNION
  {
    ?e ?p <http://dbpedia.org/resource/Teresa_Fedor> .
  }
  ?e a core:Thing .
  ?cw cwork:tag ?e ;
      cwork:title ?title ;
      cwork:description ?description ;
      cwork:dateModified ?dateModified ;
      bbc:primaryContentOf ?primaryContent .
}
ORDER BY DESC(?dateModified)
LIMIT 100

Trivial Validation Opportunities
• The validation mechanisms from SPB v.1.0 remain unchanged
• In addition, the changes to the benchmark are designed in such a way that if specific inference patterns are not supported by the engine, some of the queries will return zero results
  – If owl:sameAs inference is not supported, Q10 will return 0 results
  – If transitive properties are not supported, Q10 will return 0 results
    • Q11 will return a smaller number of results (and a higher ratio of zeros)
  – If rdfs:subPropertyOf is not supported, Q2-Q10 will return 0 results
  – If rdfs:subClassOf is not supported, Q1, Q2, Q6, Q8 will return no results

GraphDB Configuration
1. Reasoning is performed through forward-chaining with a custom ruleset
  – This is the rdfs-optimized ruleset of GraphDB with added rules to support transitive, inverse and symmetric properties
2. The owl:sameAs optimization of GraphDB is beneficial
  – This is the standard behavior of GraphDB, but it helps a lot
  – Query-time tuning is used to disable expansion of results with respect to owl:sameAs equivalent URIs
3. Geo-spatial index in Q6 of the basic interactive mix
4. Lucene Connector for full-text search in Q8
Note: Points #2-#4 above help query time, but slow down updates in GraphDB, as materialization also does

GraphDB – First Results
• Experimental run using GraphDB-SE 6.1
• Using LDBC Scale Factor 1 (22M reference data + 30M generated data)
• Hardware: 32G RAM, Intel i7-4770 CPU @ 3.40GHz, single SSD drive (Samsung 845 DC)
• Benchmark configuration: 8 reading / 2 writing agents, 1 min warmup, 40 min benchmark
• Updates/sec: 13, Selects/sec: 44

Future Plans
• Relationships between pieces of content
  – At present Creative Works are related only to Entities, not to other CWs
  – These could be StoryLines or Collections of MM assets, e.g. an article with a few images related to it
• Better modelling of content-to-entity cardinalities
  – Based on data from FT
• Enriched query sets and realistic query frequencies
  – Based on query logs from FT
• Loading/updating big datasets in live instances
• Testing high-availability
  – FT acceptance tests for a High Availability cluster are a great starting point

Summary of SPB v.2.0 Changes
• Much bigger reference dataset: from 170k to 22M
  – 7M statements from GeoNames about Europe
  – 14M statements from DBpedia for Companies, Persons, Events
• Interconnected reference data: more than 5M links
  – owl:sameAs links between DBpedia and GeoNames
• Much more comprehensive usage of inference
  – While still the simplest possible flavor of OWL is used
    • rdfs:subClassOf, rdfs:subPropertyOf, owl:TransitiveProperty, owl:sameAs
  – Transitive closure over company control and geographic nesting
  – Simpler queries through usage of the super-property
• Retrieval of relevant content through links in the reference data

Contenu connexe

Tendances

19 09-26 source reconfiguration and august release features
19 09-26 source reconfiguration and august release features19 09-26 source reconfiguration and august release features
19 09-26 source reconfiguration and august release featuresDede Sabey
 
Semantika Introduction
Semantika IntroductionSemantika Introduction
Semantika IntroductionJosef Hardi
 
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Databricks
 
Search Quality Evaluation: a Developer Perspective
Search Quality Evaluation: a Developer PerspectiveSearch Quality Evaluation: a Developer Perspective
Search Quality Evaluation: a Developer PerspectiveAndrea Gazzarini
 
Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spa...
Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spa...Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spa...
Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spa...Databricks
 

Tendances (8)

19 09-26 source reconfiguration and august release features
19 09-26 source reconfiguration and august release features19 09-26 source reconfiguration and august release features
19 09-26 source reconfiguration and august release features
 
Census GSS Initiative (EPAN 2011)
Census GSS Initiative (EPAN 2011)Census GSS Initiative (EPAN 2011)
Census GSS Initiative (EPAN 2011)
 
Dev Analytics Overview
Dev Analytics OverviewDev Analytics Overview
Dev Analytics Overview
 
Semantika Introduction
Semantika IntroductionSemantika Introduction
Semantika Introduction
 
R training at Aimia
R training at AimiaR training at Aimia
R training at Aimia
 
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
Neo4j Morpheus: Interweaving Table and Graph Data with SQL and Cypher in Apac...
 
Search Quality Evaluation: a Developer Perspective
Search Quality Evaluation: a Developer PerspectiveSearch Quality Evaluation: a Developer Perspective
Search Quality Evaluation: a Developer Perspective
 
Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spa...
Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spa...Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spa...
Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spa...
 

En vedette

Quick Look - Mobile Banking UX Benchmark Study | UserZoom
Quick Look - Mobile Banking UX Benchmark Study | UserZoomQuick Look - Mobile Banking UX Benchmark Study | UserZoom
Quick Look - Mobile Banking UX Benchmark Study | UserZoomUserZoom
 
OntoFrame기반 시맨틱 서비스와 서비스 매쉬업
OntoFrame기반 시맨틱 서비스와 서비스 매쉬업OntoFrame기반 시맨틱 서비스와 서비스 매쉬업
OntoFrame기반 시맨틱 서비스와 서비스 매쉬업webscikorea
 
Semantic Web and Linked Data - In Action
Semantic Web and Linked Data - In ActionSemantic Web and Linked Data - In Action
Semantic Web and Linked Data - In ActionRichard Wallis
 
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...Lora Aroyo
 
SEO Meets Semantic Web - Saint Patrick's Day 2015-Meetup
SEO Meets Semantic Web - Saint Patrick's Day 2015-Meetup SEO Meets Semantic Web - Saint Patrick's Day 2015-Meetup
SEO Meets Semantic Web - Saint Patrick's Day 2015-Meetup Eric Franzon
 
CIB W78 2015 - Semantic Rule-checking for Regulation Compliance Checking
CIB W78 2015 - Semantic Rule-checking for Regulation Compliance CheckingCIB W78 2015 - Semantic Rule-checking for Regulation Compliance Checking
CIB W78 2015 - Semantic Rule-checking for Regulation Compliance CheckingPieter Pauwels
 
Virtual Research Environments as-a-serive
Virtual Research Environments as-a-seriveVirtual Research Environments as-a-serive
Virtual Research Environments as-a-seriveBlue BRIDGE
 
How and When to Switch to Structured Content - Workshop
How and When to Switch to Structured Content - WorkshopHow and When to Switch to Structured Content - Workshop
How and When to Switch to Structured Content - WorkshopIXIASOFT
 
Quick Look - Auto Insurance UX Benchmark Study | UserZoom
Quick Look - Auto Insurance UX Benchmark Study | UserZoomQuick Look - Auto Insurance UX Benchmark Study | UserZoom
Quick Look - Auto Insurance UX Benchmark Study | UserZoomUserZoom
 
How to Create the Google for Earth Data (XLDB 2015, Stanford)
How to Create the Google for Earth Data (XLDB 2015, Stanford)How to Create the Google for Earth Data (XLDB 2015, Stanford)
How to Create the Google for Earth Data (XLDB 2015, Stanford)Rainer Sternfeld
 
Semantic Web Intro - St. Patrick's Day 2016 Update
Semantic Web Intro - St. Patrick's Day 2016 UpdateSemantic Web Intro - St. Patrick's Day 2016 Update
Semantic Web Intro - St. Patrick's Day 2016 UpdateEric Franzon
 
The Semantic Web and Book Publishing - Supply Chain Seminar, London Book Fair...
The Semantic Web and Book Publishing - Supply Chain Seminar, London Book Fair...The Semantic Web and Book Publishing - Supply Chain Seminar, London Book Fair...
The Semantic Web and Book Publishing - Supply Chain Seminar, London Book Fair...Publishing Technology
 
CrowdTruth - ESWC PhD Symposium
CrowdTruth - ESWC PhD SymposiumCrowdTruth - ESWC PhD Symposium
CrowdTruth - ESWC PhD SymposiumAnca Dumitrache
 
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...DeVonne Parks, CEM
 
Interlinking semantics, web2.0, and the real-world
Interlinking semantics, web2.0, and the real-worldInterlinking semantics, web2.0, and the real-world
Interlinking semantics, web2.0, and the real-worldThe Open University
 
Wherecamp Navigation Conference 2015 - SPOI SDI4pps: Points of Interest
Wherecamp Navigation Conference 2015 - SPOI SDI4pps: Points of InterestWherecamp Navigation Conference 2015 - SPOI SDI4pps: Points of Interest
Wherecamp Navigation Conference 2015 - SPOI SDI4pps: Points of InterestWhereCampBerlin
 
It's a Millennial World. ABA Marketing Conference 2016
It's a Millennial World.  ABA Marketing Conference 2016It's a Millennial World.  ABA Marketing Conference 2016
It's a Millennial World. ABA Marketing Conference 2016Kathleen Craig
 
VU University Amsterdam - The Social Web 2016 - Lecture 4
VU University Amsterdam - The Social Web 2016 - Lecture 4VU University Amsterdam - The Social Web 2016 - Lecture 4
VU University Amsterdam - The Social Web 2016 - Lecture 4Davide Ceolin
 

En vedette (20)

Quick Look - Mobile Banking UX Benchmark Study | UserZoom
Quick Look - Mobile Banking UX Benchmark Study | UserZoomQuick Look - Mobile Banking UX Benchmark Study | UserZoom
Quick Look - Mobile Banking UX Benchmark Study | UserZoom
 
OntoFrame기반 시맨틱 서비스와 서비스 매쉬업
OntoFrame기반 시맨틱 서비스와 서비스 매쉬업OntoFrame기반 시맨틱 서비스와 서비스 매쉬업
OntoFrame기반 시맨틱 서비스와 서비스 매쉬업
 
Semantic Web and Linked Data - In Action
Semantic Web and Linked Data - In ActionSemantic Web and Linked Data - In Action
Semantic Web and Linked Data - In Action
 
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
 
SEO Meets Semantic Web - Saint Patrick's Day 2015-Meetup
SEO Meets Semantic Web - Saint Patrick's Day 2015-Meetup SEO Meets Semantic Web - Saint Patrick's Day 2015-Meetup
SEO Meets Semantic Web - Saint Patrick's Day 2015-Meetup
 
CIB W78 2015 - Semantic Rule-checking for Regulation Compliance Checking
CIB W78 2015 - Semantic Rule-checking for Regulation Compliance CheckingCIB W78 2015 - Semantic Rule-checking for Regulation Compliance Checking
CIB W78 2015 - Semantic Rule-checking for Regulation Compliance Checking
 
Virtual Research Environments as-a-serive
Virtual Research Environments as-a-seriveVirtual Research Environments as-a-serive
Virtual Research Environments as-a-serive
 
How and When to Switch to Structured Content - Workshop
How and When to Switch to Structured Content - WorkshopHow and When to Switch to Structured Content - Workshop
How and When to Switch to Structured Content - Workshop
 
Quick Look - Auto Insurance UX Benchmark Study | UserZoom
Quick Look - Auto Insurance UX Benchmark Study | UserZoomQuick Look - Auto Insurance UX Benchmark Study | UserZoom
Quick Look - Auto Insurance UX Benchmark Study | UserZoom
 
How to Create the Google for Earth Data (XLDB 2015, Stanford)
How to Create the Google for Earth Data (XLDB 2015, Stanford)How to Create the Google for Earth Data (XLDB 2015, Stanford)
How to Create the Google for Earth Data (XLDB 2015, Stanford)
 
Semantic Web Intro - St. Patrick's Day 2016 Update
Semantic Web Intro - St. Patrick's Day 2016 UpdateSemantic Web Intro - St. Patrick's Day 2016 Update
Semantic Web Intro - St. Patrick's Day 2016 Update
 
The Semantic Web and Book Publishing - Supply Chain Seminar, London Book Fair...
The Semantic Web and Book Publishing - Supply Chain Seminar, London Book Fair...The Semantic Web and Book Publishing - Supply Chain Seminar, London Book Fair...
The Semantic Web and Book Publishing - Supply Chain Seminar, London Book Fair...
 
CrowdTruth - ESWC PhD Symposium
CrowdTruth - ESWC PhD SymposiumCrowdTruth - ESWC PhD Symposium
CrowdTruth - ESWC PhD Symposium
 
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
December 2, 2015: NISO/NFAIS Virtual Conference: Semantic Web: What's New and...
 
The Web and the Mind
The Web and the MindThe Web and the Mind
The Web and the Mind
 
Interlinking semantics, web2.0, and the real-world
Interlinking semantics, web2.0, and the real-worldInterlinking semantics, web2.0, and the real-world
Interlinking semantics, web2.0, and the real-world
 
Wherecamp Navigation Conference 2015 - SPOI SDI4pps: Points of Interest
Wherecamp Navigation Conference 2015 - SPOI SDI4pps: Points of InterestWherecamp Navigation Conference 2015 - SPOI SDI4pps: Points of Interest
Wherecamp Navigation Conference 2015 - SPOI SDI4pps: Points of Interest
 
It's a Millennial World. ABA Marketing Conference 2016
It's a Millennial World.  ABA Marketing Conference 2016It's a Millennial World.  ABA Marketing Conference 2016
It's a Millennial World. ABA Marketing Conference 2016
 
Semantic web at Novartis
Semantic web at NovartisSemantic web at Novartis
Semantic web at Novartis
 
VU University Amsterdam - The Social Web 2016 - Lecture 4
VU University Amsterdam - The Social Web 2016 - Lecture 4VU University Amsterdam - The Social Web 2016 - Lecture 4
VU University Amsterdam - The Social Web 2016 - Lecture 4
 

LDBC Semantic Publishing Benchmark 2.0 evolution - Ontotext

  • 7. Extended Reference Dataset
      • 22M explicit statements in total in SPB v.2.0
          – vs. 127k statements in SPB 1.0 (BBC lists plus tiny fractions of GeoNames)
      • Added reference data from DBpedia 2014
          – Extracts obtained with queries such as: DESCRIBE ?e WHERE { ?e a dbp-ont:Company }
          – Companies (85 000 entities)
          – Events (50 000 entities)
          – Persons (1M entities)
      • GeoNames data for Europe
          – All European countries except Russia, Ukraine and Belarus; 650 000 locations in total
          – Some properties and location types irrelevant to SPB are excluded
      • owl:sameAs links between DBpedia and GeoNames
          – 500k owl:sameAs mappings
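The extraction pattern on this slide can be sketched as a small query builder. This is only an illustration: the slide shows the query for dbp-ont:Company alone, so the class names Event and Person, the helper `build_extract_query`, and the use of LIMIT to cap each extract at the quoted entity counts are assumptions, not the benchmark's documented procedure.

```python
# Hypothetical sketch: build one DESCRIBE extract query per entity class.
# Only the Company query appears on the slide; everything else is assumed.
DBP_ONT = "http://dbpedia.org/ontology/"

# Approximate entity counts quoted on the slide.
ENTITY_CLASSES = {
    "Company": 85_000,
    "Event": 50_000,
    "Person": 1_000_000,
}


def build_extract_query(class_name: str, limit: int) -> str:
    """Return a SPARQL DESCRIBE query for all entities of one class."""
    return (
        f"PREFIX dbp-ont: <{DBP_ONT}>\n"
        f"DESCRIBE ?e WHERE {{ ?e a dbp-ont:{class_name} }}\n"
        f"LIMIT {limit}"  # assumption: the slide does not say how extracts were capped
    )


queries = {name: build_extract_query(name, n) for name, n in ENTITY_CLASSES.items()}
```

Each resulting string could be sent to a SPARQL endpoint holding the DBpedia 2014 data; how the benchmark authors actually selected the subsets is not stated on the slide.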
  • 8. Interconnected Reference Dataset
    • Substantial volume of connections between entities
    • Geonames comes with a hierarchical relationship, gn:parentFeature, that defines the nesting of locations
    • DBpedia relationships between entities (numbers shown in the table are approximate):

      From \ To    To Company   To Person   To Place / gn:Feature   To Event
      Company          40,797      26,675                 218,636         18
      Person           89,506   1,324,425               3,380,145    145,892
      Event             5,114     154,207                 140,579     35,442
  • 9. RDF-Rank for the Reference Data
    • GraphDB’s RDFRank feature is used to calculate a rank for all the entities in the Reference Data
      – RDFRank calculates a measure of importance for all URIs in an RDF graph, based on Google’s PageRank algorithm
    • These RDFRanks are calculated using GraphDB after the entire Reference Data is loaded
      – This way all sorts of relationships in the graph are considered during the calculation, including those that were logically inferred
    • RDFRanks are inserted using the predicate <http://www.ldbcouncil.org/spb#hasRDFRank>
    • These ranks are available as a part of the Reference Data
      – No need to compute them again during loading
  • 10. RDF-Rank Used for Popular Entities
    • The Data Generator uses a set of “popular entities”
      – Those are referred to in 30% of the content-to-entity relations/tags
      – This is one of the heuristics used by the data generator to produce more realistic data distributions
      – This is implemented in SPB 1.0, no change in SPB 2.0
    • Popular entities are those with the top 5% RDF Rank
      – This is the new thing in SPB v.2.0
      – Before that the popular entities were selected randomly
      – This way, we get a more realistic dataset where the entities that are used more often to tag content also have better connectivity in the Reference Data
    • In the future, RDF-ranks can also be used for other purposes
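The selection of popular entities by rank can be illustrated with a small sketch. It is not GraphDB's RDFRank implementation: it is a plain power-iteration PageRank over a toy edge list, and the function names, damping factor and entity identifiers are assumptions made for illustration only.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Generic power-iteration PageRank over (source, target) edges.

    A stand-in for RDFRank, which the slides describe only as
    "based on Google's PageRank algorithm".
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def popular_entities(rank, top_fraction=0.05):
    """Return the entities whose rank falls in the top `top_fraction`."""
    ordered = sorted(rank, key=rank.get, reverse=True)
    keep = max(1, int(len(ordered) * top_fraction))
    return ordered[:keep]


# Toy reference data (hypothetical): 19 entities all link to the hub "e0".
toy_edges = [(f"e{i}", "e0") for i in range(1, 20)]
toy_rank = pagerank(toy_edges)
toy_popular = popular_entities(toy_rank)
```

With 20 entities, the top 5% is a single entity, and the hub that every other entity links to is the one selected. This mirrors the point of the slide: well-connected entities become the ones most often used for tagging.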
  • 12. Changes in the Metadata
    • Three times more relationships between Creative Works and Entities are generated than in SPB 1.0
      – More recent use cases in publishing adopt rich metadata descriptions with more than 10 references to relevant entities and concepts
      – In SPB 1.0 there are on average a bit less than 3 entity references per creative work, based on distributions from an old BBC archive
      – While developing SPB 2.0 we did not have access to a substantial amount of rich semantic metadata from a publisher’s archive that would allow us to derive content-to-concept frequency distributions
      – So we decided to use the same distributions from the BBC, but to triple the concept references for each of them
      – In SPB 2.0 there are on average about 8 content-to-concept references
      – This results in about 25 statements per creative work description
      – All other heuristics and specifics of the metadata generator were preserved (popular entities, CW sequences resembling storylines, etc.)
  • 13. Changes in the Metadata (2)
    • Geonames URIs are used for content-to-location references
      – As in SPB v.1.0, each creative work description refers (through the cwork:mentions property) to exactly one location
        • These references are exploited in queries with geo-spatial constraints
      – While most of the geo-spatial information in the Reference Data comes from Geonames, for about 500 thousand locations there are also DBpedia URIs that come from the DBpedia-to-Geonames owl:sameAs mappings
      – Intentionally, DBpedia URIs are not used when metadata about creative works is generated
        • In contrast, the substitution parameters for locations in the corresponding queries use only DBpedia locations
        • This way the corresponding queries return no results unless owl:sameAs expansion is properly supported
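The effect of this design choice can be sketched in a few lines. The example below is only an illustration under stated assumptions: triples are modelled as Python tuples, the identifiers `cw:1`, `gn:loc1` and `dbpedia:Sofia` are hypothetical, and the helper names are not part of the benchmark driver.

```python
def equivalence_classes(same_as_pairs):
    """Group URIs linked by owl:sameAs (symmetric and transitive)."""
    classes = {}
    for a, b in same_as_pairs:
        merged = classes.get(a, {a}) | classes.get(b, {b})
        for uri in merged:
            classes[uri] = merged
    return classes


def works_tagged_with(location, triples, classes=None):
    """Find creative works tagged with `location`.

    Without `classes`, only the literal URI is matched, which models an
    engine that does not support owl:sameAs reasoning.
    """
    targets = classes.get(location, {location}) if classes else {location}
    return sorted(s for s, p, o in triples if p == "cwork:tag" and o in targets)


# Metadata is generated with the Geonames URI only (hypothetical IDs).
metadata = [("cw:1", "cwork:tag", "gn:loc1")]
# The reference data maps the DBpedia URI to the Geonames URI.
same_as = [("dbpedia:Sofia", "gn:loc1")]

without_same_as = works_tagged_with("dbpedia:Sofia", metadata)
with_same_as = works_tagged_with(
    "dbpedia:Sofia", metadata, equivalence_classes(same_as)
)
```

A query parameterised with the DBpedia URI finds nothing when the equivalence is ignored and finds the creative work once it is taken into account, which is what lets the benchmark detect missing owl:sameAs support.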
  • 14. Changes in the Metadata - Stats
    • As a result of the changes in the size of the Reference Data and in the metadata size, the number of Creative Works included at the same scale factor differs between SPB v.1.0 and SPB v.2.0

                                                        SPB 1.0   SPB 2.0
      Reference Data size (explicit statements)            170k       22M
      Creative Work descr. size (explicit st./CW)            19        25
      Metadata in 50M dataset (explicit statements)         50M       28M
      Creative Works in 50M datasets (count)               2.6M      1.1M
      Creative Work-to-Entity relationships in 50M           7M        9M
      Metadata in 1B dataset (explicit statements)           1B      978M
      Creative Works in 1B datasets (count)                 53M       39M
      Creative Work-to-Entity relationships in 1B          137M      313M
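The figures in the table are mutually consistent, which a short calculation can confirm. The sketch below assumes that a scale factor counts reference data plus generated metadata (22M + 28M = 50M, 22M + 978M = 1B), and that the SPB 2.0 average of "about 8" entity references per creative work can be used as a multiplier; the function names are illustrative.

```python
def creative_works(metadata_statements, statements_per_work):
    """Approximate number of creative works for a given metadata volume."""
    return metadata_statements / statements_per_work


def entity_links(works, references_per_work):
    """Approximate number of creative-work-to-entity relationships."""
    return works * references_per_work


# SPB 1.0: 19 statements per creative work.
works_v1_50m = creative_works(50e6, 19)
works_v1_1b = creative_works(1e9, 19)

# SPB 2.0: 25 statements per creative work; metadata is the scale
# factor minus the 22M statements of reference data.
works_v2_50m = creative_works(50e6 - 22e6, 25)
works_v2_1b = creative_works(1e9 - 22e6, 25)

# About 8 entity references per creative work in SPB 2.0.
links_v2_50m = entity_links(works_v2_50m, 8)
links_v2_1b = entity_links(works_v2_1b, 8)
```

Rounded, these values reproduce the table: about 2.6M and 53M creative works for SPB 1.0, about 1.1M and 39M for SPB 2.0, and about 9M and 313M creative-work-to-entity relationships for SPB 2.0.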
  • 16. The Rule Sets
    • Complete inference in SPB 2.0 requires support for the following primitives:
      – RDFS: subPropertyOf, subClassOf
      – OWL: TransitiveProperty, SymmetricProperty, sameAs
    • Any triplestore with OWL 2 RL reasoning support will be able to process SPB v.2.0 correctly
      – In fact, a much reduced rule-set is sufficient for complete reasoning, as OWL 2 RL contains a host of primitives and rules that SPB 2.0 does not make use of. Examples are all onProperty restrictions, class and property equivalence, property chains, etc.
      – The popular (but not W3C-standardized) OWL Horst profile is sufficient for reasoning with SPB v.2.0
        • OWL Horst refers to the pD* entailment defined by Herman ter Horst in: Completeness, decidability and complexity of entailment for RDF Schema and a semantic extension involving the OWL vocabulary. J. Web Sem. 3(2-3): 79-115 (2005)
  • 17. New and Updated Ontologies
    • Modified versions of the BBC Ontologies
      – As in SPB v.1.0, the BBC Core ontology defines relationships between Persons, Organizations, Locations and Events
      – Those were mapped to the corresponding DBpedia and Geonames classes and relationships
        • bbccore:Thing is defined as a super-class of dbp-ont:Company, dbp-ont:Event, foaf:Person, geonames-ontology:Feature (geographic feature)
      – dbp-ont:parentCompany is defined to be an owl:TransitiveProperty
      – These extra definitions are added at the end of the BBC Core ontology
    • Added an SPB-modified version of the Geonames ontology
      – Stripped out unnecessary pieces such as onProperty restrictions that impose cardinality constraints
  • 19. Concrete Inference Patterns
    • Transitive closure of location-nesting relationships
      – The geo-ont:parentFeature property is used in GeoNames to link each location to the larger locations that it is a part of
      – geo-ont:parentFeature is an owl:TransitiveProperty in the GN ontology
      – This way, if Munich has gn:parentFeature Bavaria, which in turn has gn:parentFeature Germany, then reasoning should make sure that Germany also appears as gn:parentFeature of Munich
    • Transitive closure of dbp-ont:parentCompany
      – Inference unveils indirect company control relationships
    • owl:sameAs relations between Geonames features and DBpedia URIs for the same locations:
      – The standard semantics of owl:sameAs requires that each statement asserted using the Geonames URIs should also be inferable/retrievable with the mapped DBpedia URIs
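The Munich example can be reproduced with a naive fixed-point computation. This is a minimal sketch of what owl:TransitiveProperty entails, not how any particular triplestore implements it; the pairs stand for gn:parentFeature statements and plain place names replace the real URIs.

```python
def transitive_closure(pairs):
    """Repeatedly join (a, b) and (b, d) into (a, d) until nothing new appears."""
    closure = set(pairs)
    while True:
        inferred = {
            (a, d) for a, b in closure for c, d in closure if b == c
        } - closure
        if not inferred:
            return closure
        closure |= inferred


# Explicit parentFeature statements from the slide's example.
explicit = {("Munich", "Bavaria"), ("Bavaria", "Germany")}
closed = transitive_closure(explicit)
inferred_only = closed - explicit
```

The only statement added here is that Germany is a parent feature of Munich. The same procedure applied to dbp-ont:parentCompany pairs would yield the indirect company control relationships mentioned above.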
  • 20. Concrete Inference Patterns (2)
    • The inference patterns from SPB v.1.0 are still there:
      – cwork:tag statements inferred from each of its sub-properties cwork:about and cwork:mentions
      – < ?cw rdf:type cwork:CreativeWork > statements inferred for instances of each of its sub-classes
      – These two are the only reasoning patterns that apply to the generated content metadata; all the others apply only to the Reference Data
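These two patterns can be sketched as a single pass over the explicit statements. The example assumes one-level hierarchies, and the sub-class names `cwork:NewsItem` and `cwork:BlogPost` as well as the identifiers `cw:1`, `ent:A` and `ent:B` are used only for illustration.

```python
# Schema assumed for the sketch: one level of sub-properties and sub-classes.
SUB_PROPERTY_OF = {"cwork:about": "cwork:tag", "cwork:mentions": "cwork:tag"}
SUB_CLASS_OF = {
    "cwork:NewsItem": "cwork:CreativeWork",
    "cwork:BlogPost": "cwork:CreativeWork",
}


def materialize(triples):
    """Add the statements entailed by rdfs:subPropertyOf and rdfs:subClassOf."""
    result = set(triples)
    for s, p, o in triples:
        if p in SUB_PROPERTY_OF:
            result.add((s, SUB_PROPERTY_OF[p], o))
        if p == "rdf:type" and o in SUB_CLASS_OF:
            result.add((s, "rdf:type", SUB_CLASS_OF[o]))
    return result


explicit = {
    ("cw:1", "rdf:type", "cwork:NewsItem"),
    ("cw:1", "cwork:about", "ent:A"),
    ("cw:1", "cwork:mentions", "ent:B"),
}
materialized = materialize(explicit)
```

Three explicit statements become six: one inferred rdf:type statement and two inferred cwork:tag statements. Queries that ask for cwork:tag or cwork:CreativeWork depend on exactly these inferred statements.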
  • 21. Inference Statistics and Ratios
    • If brute-force materialization is used, the Reference Data expands from 22M statements to about 100M
      – If owl:sameAs expansion is not considered, materialization on top of the Reference Data adds about 7M statements, most of them coming from the transitive closure of geo-ont:parentFeature
      – Several triplestores that use forward-chaining (e.g. ORACLE and GraphDB) have specific mechanisms that allow them to handle owl:sameAs reasoning in a sort of hybrid manner, without expanding their indices. After materialization, such engines will have to deal with 29M statements of Reference Data, instead of 100M
    • Brute-force materialization of the generated metadata describing creative works would double it
      – In SPB v.1.0 the expansion factor was 1.6, but now there are more entity references and also owl:sameAs equivalents of the location URIs
  • 23. Modified Queries in Basic Interactive
    • SPB 2.0 changes only the Basic Interactive Query-Mix
      – The other query mixes are mostly unchanged
    • In most of the queries, the cwork:about and cwork:mentions properties that link a creative work to an entity have been replaced with their super-property cwork:tag
      – This applies also to the Advanced Interactive and the Analytical use cases
      – For most of the queries, both types of relations are equally relevant
      – Using the super-property makes the query simpler than other approaches to writing a query that uses both (e.g. UNION)
  • 24. New Queries
    • The Basic Interactive query-mix now adds 2 new queries exploring the interconnectedness in the reference data
    • Q10: Retrieve CWs that mention locations in the same province (A.ADM1) as the specified one
      – There is an additional constraint on the time interval (5 days)
    • Q11: Retrieve the most recent CWs that are tagged with entities related to a specific popular entity
      – Relations can be inbound and outbound; explicit or inferred
  • 25. Q10: News from the region

    SELECT ?cw ?title ?dateModified {
      <http://dbpedia.org/resource/Sofia> geo-ont:parentFeature ?province .
      ?province geo-ont:featureCode geo-ont:A.ADM1 .
      {
        ?location geo-ont:parentFeature ?province .
      } UNION {
        BIND(?province as ?location) .
      }
      ?cw a cwork:CreativeWork ;
          cwork:tag ?location ;
          cwork:title ?title ;
          cwork:dateModified ?dateModified .
      FILTER(?dateModified >= "2011-05-14T00:00:00.000"^^<http://www.w3.org/2001/XMLSchema#dateTime> &&
             ?dateModified < "2011-05-19T23:59:59.999"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
    }
    LIMIT 100
  • 26. Q11: News about related entities

    SELECT DISTINCT ?cw ?title ?description ?dateModified ?primaryContent {
      {
        <http://dbpedia.org/resource/Teresa_Fedor> ?p ?e .
      } UNION {
        ?e ?p <http://dbpedia.org/resource/Teresa_Fedor> .
      }
      ?e a core:Thing .
      ?cw cwork:tag ?e ;
          cwork:title ?title ;
          cwork:description ?description ;
          cwork:dateModified ?dateModified ;
          bbc:primaryContentOf ?primaryContent .
    }
    ORDER BY DESC(?dateModified)
    LIMIT 100
  • 27. Trivial Validation Opportunities
    • The validation mechanisms from SPB v.1.0 remain unchanged
    • In addition, the changes to the benchmark are designed in such a way that if specific inference patterns are not supported by the engine, some of the queries will return zero results
      – If owl:sameAs inference is not supported, Q10 will return 0 results
      – If transitive properties are not supported, Q10 will return 0 results
        • Q11 will return a smaller number of results (and a higher ratio of zeros)
      – If rdfs:subPropertyOf is not supported, Q2-Q10 will return 0 results
      – If rdfs:subClassOf is not supported, Q1, Q2, Q6, Q8 will return no results
  • 29. GraphDB Configuration
    1. Reasoning performed through forward-chaining with a custom ruleset
      – This is the rdfs-optimized ruleset of GraphDB with added rules to support transitive, inverse and symmetric properties
    2. owl:sameAs optimization of GraphDB is beneficial
      – This is the standard behavior of GraphDB, but it helps a lot
      – Query-time tuning is used to disable expansion of results with respect to owl:sameAs equivalent URIs
    3. Geo-spatial index in Q6 of the basic interactive mix
    4. Lucene Connector for full-text search in Q8
    Note: Points #2-#4 above help query time, but slow down updates in GraphDB, as does materialization
  • 30. GraphDB – First Results
    • Experimental run using GraphDB-SE 6.1
    • Using LDBC Scale Factor 1 (22M reference data + 30M generated data)
    • Hardware: 32G RAM, Intel i7-4770 CPU @ 3.40GHz, single SSD drive Samsung 845 DC
    • Benchmark configuration: 8 reading / 2 writing agents, 1 min warmup, 40 min benchmark
    • Updates/sec: 13, Selects/sec: 44
  • 32. Future Plans
    • Relationships between pieces of content
      – At present Creative Works are related only to Entities, not to other CWs
      – These could be StoryLines or Collections of MM assets, e.g. an article with a few images related to it
    • Better modelling of content-to-entity cardinalities
      – Based on data from FT
    • Enriched query sets and realistic query frequencies
      – Based on query logs from FT
    • Loading/updating big datasets in live instances
    • Testing high-availability
      – FT acceptance tests for a High Availability cluster are a great starting point
  • 33. Summary of SPB v.2.0 Changes
    • Much bigger reference dataset: from 170k to 22M
      – 7M statements from GeoNames about Europe
      – 14M statements from DBpedia for Companies, Persons, Events
    • Interconnected reference data: more than 5M links
      – owl:sameAs links between DBpedia and Geonames
    • Much more comprehensive usage of inference
      – While still using the simplest possible flavor of OWL
        • rdfs:subClassOf, rdfs:subPropertyOf, owl:TransitiveProperty, owl:sameAs
      – Transitive closure over company control and geographic nesting
      – Simpler queries through usage of super-property
    • Retrieval of relevant content through links in the reference data