Live Linked Data
Making Linked Data writable with eventual consistency
guarantees
Luis-Daniel Ibáñez, Nantes University, GDD team

Luis-Daniel Ibáñez, Hala Skaf-Molli, Pascal Molli, and Olivier Corby. Synchronizing semantic stores with
commutative replicated data types. In Semantic Web Collaborative Spaces (SWCS) @ WWW’12
Luis-Daniel Ibáñez, Hala Skaf-Molli, Pascal Molli, and Olivier Corby. Live Linked Data: Synchronizing
semantic stores with Commutative Replicated Data Types. Special Issue on Resource Discovery, IJMSO
(to appear)
Linked Data

Autonomous participants
agree on a set of principles
for publishing RDF
data, allowing its querying
and browsing across
distributed servers.
In 2011, more than 30 billion
RDF triples were distributed
among ~700 autonomous
participants [1].
[1] Billion Triple Challenge + Linked Open Data Metadata;
T. Käfer et al., Towards a Dynamic Linked Data Observatory, LDOW@WWW12
LOD - 2007 October
LOD - 2011 September
Here is a list of links created by LinkedMDB that relate the movie "The Shining" to
other open datasets:
owl:sameAs to "The_Shining_(film)" on DBpedia
owl:sameAs to "The_Shining_(film)" on YAGO
dbpedia:hasPhotoCollection to the movie's photos on flickr™ wrappr
movie:relatedBook to the movie's related book on RDF Book Mashup
owl:sameAs from one of the music composers of the movie, Béla
Bartók, to MusicBrainz: http://zitgist.com/music/artist/fd14da1b-3c2d-4cc8-9ca6fc8c62ce6988
rdfs:seeAlso from one of the writers of the movie, Stanley Kubrick, to RDF Book
Mashup: http://www4.wiwiss.fu-berlin.de/bookmashup/persons/Stanley+Kubrick
foaf:based_near to the locations of the movie in the US and
GB: http://sws.geonames.org/2635167/ and http://sws.geonames.org/6252001/
movie:language to the language of the movie on
lingvoj.org: http://www.lingvoj.org/lingvo/en
Source : http://wiki.linkedmdb.org/Main/Interlinking
Querying Linked Data
Freshness vs. Performance
Sources: Dbpedia English, Dbpedia Spanish, Freebase, Wolfram
Consumer: My Super Cool Query Answerer about Venezuela
Local copying → not fresh
Distributed querying → performance and reliability
Querying Linked Data
Data Quality
What is the area and population of Girardot Municipality?
Dbpedia English: 302 km2, 439,947 inhabitants
Dbpedia Spanish: 311 km2, 407,577 inhabitants
Freebase: unknown area, 439,947 inhabitants
Wolfram: 302 km2, 396,095 inhabitants
All different =S
Open Data Venezuela: 314 km2, 407,109 inhabitants
All wrong!!!
Updating Linked Data
Open Data Venezuela: 314 km2, 407,109 inhabitants
Request to Dbpedia English, Dbpedia Spanish, Freebase and Wolfram:
"Update Girardot municipality's data to the official values."
Reply: "What do you mean?"
Editing others' datasets is generally not possible.
How to make Linked Data writable?
How to move from Linked Data 1.0 to Linked Data 2.0?
Live Linked Data Approach (1)
SPARQL 1.1 Update allows RDF data to be updated locally.
Update feeds provide streams of updates (e.g.
DBpedia Live) → data freshness.
Processing is local.
The writing is "indirect": I can update others' datasets only if they decide to
consume my stream.
Live Linked Data
Live Linked Data complements the LOD Cloud with a "follow your changes" relation.
Participants: Dbpedia English, Dbpedia Spanish, Freebase, Wolfram, Open Data Venezuela

Open Data Venezuela: "Would you follow my changes? They are official… ;)"
"Great!"
"Follow your change"

SPARQL Update stream publishing → write enabling
SPARQL Update stream consumption → freshness
Local processing → performance

DELETE {
Dbpediaen/Girardot area ?a1 .
Dbpediaen/Girardot popul ?p1 .
?ent area ?a2 .
?ent popul ?p2 . }
WHERE { ?ent sameAs Dbpediaen/Girardot }
;
INSERT DATA {
ODVenezuela/Girardot area 314
ODVenezuela/Girardot popul 407109
ODVenezuela/Girardot sameAs Dbpediaen/Girardot
ODVenezuela/Girardot sameAs Freebase/Girardot
Dbpediaen/Girardot area 302
Dbpediaen/Girardot popul 439947 … }

Dbpedia Spanish:
Dbpediaes/Girardot area 311
Dbpediaes/Girardot popul 407577

Freebase:
Freebase/Girardot area 308
Freebase/Girardot area "null"
Freebase/Girardot popul 439947
ODVenezuela/Girardot sameAs Dbpediaen/Girardot
ODVenezuela/Girardot sameAs Freebase/Girardot
…
Dbpediaen/Girardot sameAs Dbpediaes/Girardot
Dbpediaen/Girardot sameAs Freebase/Girardot
…

Wolfram:
Wolfram/Girardot area 302
Wolfram/Girardot popul 396095

Open Data Venezuela:
ODVenezuela/Girardot area 314
ODVenezuela/Girardot popul 407109
Live Linked Data Overview

A social network of Linked Data participants based on
a "follow your change" relation.
Makes Linked Data a consistent "read/write" space:
from Linked Data 1.0 to Linked Data 2.0.
Creates assemblies of datasets and enables a
"synchronize and search" paradigm, between the
warehousing approach and the distributed query
approach.
The Forgotten Issue:
Consistency
Synchronizing semantic
sources is not new…
RDFSync
sparqlPuSH (PubSubHubbub)
ResourceSync…

But none of them addresses
consistency when
concurrent operations or
out-of-order network delivery
occur.

Source1: INSERT DATA {a b c}, then DELETE DATA {a b c}
Source2: INSERT DATA {a b c}
A replica that receives Source2's insert last ends with {a b c};
a replica that receives it before Source1's delete ends with {}.
Output depends on order of reception.
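The order dependence can be reproduced in a few lines of Python. This is only an illustrative sketch: the graph is modelled as a plain set of tuples, and `apply_ops` is a hypothetical helper, not part of any RDF store.

```python
def apply_ops(ops):
    """Apply ("insert" | "delete", triple) operations in the given order."""
    graph = set()
    for kind, triple in ops:
        if kind == "insert":
            graph.add(triple)
        else:
            graph.discard(triple)
    return graph


t = ("a", "b", "c")
ins1, del1 = ("insert", t), ("delete", t)  # issued by Source1, in this order
ins2 = ("insert", t)                       # issued concurrently by Source2

# Two replicas receive the same three operations in different orders.
replica_x = apply_ops([ins1, del1, ins2])  # Source2's insert arrives last
replica_y = apply_ops([ins1, ins2, del1])  # Source2's insert arrives before the delete

print(replica_x)  # the triple is present
print(replica_y)  # the triple is absent
```

Both replicas saw exactly the same operations, yet they end in different states, which is the divergence the slide points at.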
Keeping Consistent
Consistency = Convergence + Intention preservation.
Strong Consistency
Requires blocking. Expensive.

Eventual Consistency
Apply operations without blocking.
If a conflict is detected, resolve it.
Can be very expensive when the network is large.

If the data type we are dealing with is conflict-free, there are no worries
about conflict detection and resolution.
Conflict-Free Replicated
Data Types [1]
An abstract data type where:
The application of all operations over any state commutes,
S ∘ op_i ∘ op_j = S ∘ op_j ∘ op_i ⇒ Convergence.
Caveat 1: Intention preservation should be checked
separately.
Caveat 2: Should be useful, by itself or by corresponding to
the concurrent semantics of a useful type (set, graph…).
Caveat 2.1: Should be affordable.

Enables massive optimistic (eventual) replication.
[1] M. Shapiro, N. Preguiça, C. Baquero, M. Zawirski. Conflict-free Replicated Data Types. SSS 2011.
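The commutativity condition can be checked by brute force on small examples. The sketch below is an assumption-laden illustration, not taken from the cited paper: operations are modelled as pure functions over a `frozenset`, and `converges` is a hypothetical helper that tries every delivery order.

```python
from itertools import permutations


def insert(x):
    """Operation that adds x to the state."""
    return lambda s: s | {x}


def delete(x):
    """Operation that removes x from the state."""
    return lambda s: s - {x}


def run(state, ops):
    for op in ops:
        state = op(state)
    return state


def converges(state, ops):
    """True if every delivery order of ops yields the same final state."""
    results = {run(state, order) for order in permutations(ops)}
    return len(results) == 1


empty = frozenset()
grow_only = [insert("t1"), insert("t2"), insert("t3")]  # inserts only
mixed = [insert("t1"), delete("t1")]                    # insert and delete of the same element

print(converges(empty, grow_only))  # inserts commute with each other
print(converges(empty, mixed))      # a naive insert/delete pair does not
```

A set restricted to inserts already satisfies the condition; the difficulty addressed in the rest of the talk is supporting deletes as well.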
Enabling LLD with CRDTs

Construct a CRDT for RDF Graph Stores with SPARQL UPDATE 1.1 Operations
with minimal overhead under Live Linked Data conditions:
Unknown but steadily growing number of autonomous participants.
No central controls (coordination, primary replicas, small core of datacenters)
Follow your change relation induces the network.

Dynamicity parameters:
Degree of change / Change frequency → varies between producers.
Growth rate → positive, knowledge grows.
RDF Datasets
Assume there are pairwise disjoint infinite sets I, B, and L
(IRIs, blank nodes, and literals, respectively). A triple (s,p,o) ∈
(I∪B)×I×(I∪B∪L) is called an RDF triple. In this tuple, s is the
subject, p the predicate, and o the object [1].
An RDF graph is a set of RDF triples [1].
An RDF dataset is a set: { G, (<u1>, G1), (<u2>, G2), . . . (<un>, Gn) }
where G and each Gi are graphs, and each <ui> is an IRI. Each <ui>
is distinct. G is called the "default graph" and the pairs (<ui>, Gi)
are called "named graphs" [2].
[1] J. Pérez, M. Arenas and C. Gutiérrez, Semantics and Complexity of SPARQL, ACM Transactions on Database Systems, 2009
[2] http://www.w3.org/TR/sparql11-query/#sparqlDataset
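These definitions map naturally onto a small data model. The following Python sketch is one possible encoding; the class names `IRI`, `BlankNode`, `Literal` and `Dataset` and the helper `is_rdf_triple` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


def is_rdf_triple(s, p, o):
    """Check that (s, p, o) belongs to (I ∪ B) × I × (I ∪ B ∪ L)."""
    return (isinstance(s, (IRI, BlankNode))
            and isinstance(p, IRI)
            and isinstance(o, (IRI, BlankNode, Literal)))


@dataclass
class Dataset:
    default: set = field(default_factory=set)   # the default graph G
    named: dict = field(default_factory=dict)   # IRI <u_i> -> graph G_i (dict keys are distinct)


book = IRI("http://example/book1")
title = IRI("http://purl.org/dc/elements/1.1/title")
triple = (book, title, Literal("Fundamentals of Compiler Design"))

ds = Dataset()
ds.named[IRI("http://example/bookStore")] = {triple}
print(is_rdf_triple(*triple))
```

Using frozen dataclasses keeps terms hashable, so a graph can be an ordinary Python set of triples.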
SPARQL Update Queries
Syntax, followed by its formal operation:

INSERT DATA QuadData
  Dataset-UNION(GS, Dataset(QuadData, {}, GS, GS))

DELETE DATA QuadData
  Dataset-DIFF(GS, Dataset(QuadData, {}, GS, GS))

DELETE QuadPatternDEL INSERT QuadPatternINS WHERE GroupGraphPattern
  Dataset-UNION(Dataset-DIFF(GS, Dataset(QuadPatternDEL, GroupGraphPattern, DS, GS)),
                Dataset(QuadPatternINS, GroupGraphPattern, DS, GS))

DELETE QuadPatternDEL WHERE GroupGraphPattern
  Dataset-UNION(Dataset-DIFF(GS, Dataset(QuadPatternDEL, GroupGraphPattern, DS, GS)),
                Dataset({}, GroupGraphPattern, DS, GS))

INSERT QuadPatternINS UsingClause* WHERE GroupGraphPattern
  Dataset-UNION(Dataset-DIFF(GS, Dataset({}, GroupGraphPattern, DS, GS)),
                Dataset(QuadPatternINS, GroupGraphPattern, DS, GS))

LOAD (SILENT)? IRIref
  Dataset-UNION(GS, { graph(documentIRI) })

CLEAR (SILENT)? DEFAULT
  {{}} union {(iri_i, G_i) | 1 ≤ i ≤ n}

CREATE (SILENT)? IRIref
  GS union {(iri, {})} if iri not in graphNames(GS); otherwise,
  OpCreate(GS, iri) = GS

DROP (SILENT)? IRIref
  GS if iri not in graphNames(GS); otherwise, OpDrop(GS, iri_j)
  = {DG} union {(iri_i, G_i) | i ≠ j and 1 ≤ i ≤ n}

http://www.w3.org/TR/sparql11-update/#formalModel
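The two building blocks, Dataset-UNION and Dataset-DIFF, can be sketched over a graph store modelled as a Python dict. This is a simplified reading for illustration only: the key `DEFAULT` for the default graph, the function names, and the choice to keep only graph names present in the first operand of the difference are assumptions, not the normative definitions.

```python
DEFAULT = None  # hypothetical key standing for the default graph


def dataset_union(ds1, ds2):
    """Graph-wise union; a graph missing on one side counts as empty."""
    names = set(ds1) | set(ds2)
    return {n: ds1.get(n, frozenset()) | ds2.get(n, frozenset()) for n in names}


def dataset_diff(ds1, ds2):
    """Graph-wise difference over the graphs of ds1 (simplifying assumption)."""
    return {n: g - ds2.get(n, frozenset()) for n, g in ds1.items()}


def insert_data(gs, quad_data):
    return dataset_union(gs, quad_data)


def delete_data(gs, quad_data):
    return dataset_diff(gs, quad_data)


store = "http://example/bookStore"
title = ("http://example/book1", "dc:title", "Fundamentals of Compiler Design")
price = ("http://example/book1", "ns:price", 42)

gs = {DEFAULT: frozenset(), store: frozenset({title})}
after_insert = insert_data(gs, {store: frozenset({price})})
after_delete = delete_data(after_insert, {store: frozenset({price})})
print(sorted(after_insert[store], key=str))
```

With ground quads, every update reduces to set union and set difference per graph, which is what makes a set CRDT a natural target for the rewriting discussed later.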
Insert Ground Triples
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ns: <http://example.org/ns#>
INSERT DATA
{ GRAPH <http://example/bookStore>
  { <http://example/book1> ns:price 42 .
    <http://example/book1> dc:creator "A.N.Other" . } }
Data before:
# Graph: http://example/bookStore
@prefix dc: <http://purl.org/dc/elements/1.1/> .
<http://example/book1> dc:title "Fundamentals of Compiler Design" .
Data after:
# Graph: http://example/bookStore
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ns: <http://example.org/ns#> .
<http://example/book1> dc:title "Fundamentals of Compiler Design" .
<http://example/book1> ns:price 42 .
<http://example/book1> dc:creator "A.N.Other" .

http://www.w3.org/TR/sparql11-update/
Delete-Insert
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
WITH <http://example/addresses>
DELETE { ?person foaf:givenName 'Bill' }
INSERT { ?person foaf:givenName 'William' }
WHERE { ?person foaf:givenName 'Bill' }

Data before:
# Graph: http://example/addresses
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example/president25> foaf:givenName "Bill" .
<http://example/president25> foaf:familyName "McKinley" .
<http://example/president27> foaf:givenName "Bill" .
<http://example/president27> foaf:familyName "Taft" .
<http://example/president42> foaf:givenName "Bill" .
<http://example/president42> foaf:familyName "Clinton" .

Data after:
# Graph: http://example/addresses
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://example/president25> foaf:givenName "William" .
<http://example/president25> foaf:familyName "McKinley" .
<http://example/president27> foaf:givenName "William" .
<http://example/president27> foaf:familyName "Taft" .
<http://example/president42> foaf:givenName "William" .
<http://example/president42> foaf:familyName "Clinton" .
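The effect of this DELETE/INSERT can be mimicked on a set of tuples. The sketch below is not a SPARQL engine: `rename_given_name` is a hypothetical function that hard-codes the single triple pattern of the example.

```python
GIVEN = "http://xmlns.com/foaf/0.1/givenName"
FAMILY = "http://xmlns.com/foaf/0.1/familyName"


def rename_given_name(graph, old, new):
    """Evaluate the WHERE pattern, then delete and insert the instantiated templates."""
    people = {s for (s, p, o) in graph if p == GIVEN and o == old}  # WHERE
    deleted = {(s, GIVEN, old) for s in people}                     # DELETE template
    inserted = {(s, GIVEN, new) for s in people}                    # INSERT template
    return (graph - deleted) | inserted


before = {
    ("president25", GIVEN, "Bill"), ("president25", FAMILY, "McKinley"),
    ("president27", GIVEN, "Bill"), ("president27", FAMILY, "Taft"),
    ("president42", GIVEN, "Bill"), ("president42", FAMILY, "Clinton"),
}
after = rename_given_name(before, "Bill", "William")
print(sorted(after))
```

The pattern is evaluated once against the current graph; the update itself is then a ground delete followed by a ground insert.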
Enabling Intentions Preservation in
LLD
The observed effect of SPARQL Update queries at generation
time should be preserved at re-execution time…
If a SPARQL query inserts a triple at generation time, this
triple will be added at all sites whatever the state of the store
at re-execution time…

Intentions can be impossible to preserve (nondeterministic operations) or impossible without more
order constraints (re-execution time evaluation)
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
WITH <http://example/addresses>
DELETE { ?person foaf:givenName 'Bill' }
INSERT { ?person foaf:givenName 'William' }
WHERE { ?person foaf:givenName 'Bill' }

PREFIX dc: <http://purl.org/dc/elements/1.1/>
INSERT DATA {
_:book1 dc:title "A new book" ;
dc:creator "A.N.Other" .
}
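The difference between evaluating a pattern at generation time and at re-execution time can be made concrete. The sketch below is one reading of the slide, with hypothetical helpers `evaluate` and `apply_ground`: site A generates the delete-insert on its own state, while site B already holds a concurrently inserted triple that A has not seen.

```python
GIVEN = "http://xmlns.com/foaf/0.1/givenName"


def evaluate(graph, old, new):
    """Evaluate the pattern against `graph`; return the ground (deletes, inserts)."""
    people = {s for (s, p, o) in graph if p == GIVEN and o == old}
    return ({(s, GIVEN, old) for s in people},
            {(s, GIVEN, new) for s in people})


def apply_ground(graph, delta):
    deletes, inserts = delta
    return (graph - deletes) | inserts


site_a = {("president25", GIVEN, "Bill")}
site_b = site_a | {("president42", GIVEN, "Bill")}  # concurrent insert unknown to A

# Option 1: ship the ground triples computed at generation time on A.
delta = evaluate(site_a, "Bill", "William")
shipped = apply_ground(site_b, delta)

# Option 2: re-evaluate the pattern against B's state at re-execution time.
reevaluated = apply_ground(site_b, evaluate(site_b, "Bill", "William"))

print(sorted(shipped))
print(sorted(reevaluated))
```

Shipping the ground delta reproduces exactly the effect observed at A, whereas re-evaluation also renames president42, an effect A never observed.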
A CRDT for SPARQL update
Rewrite SPARQL Update operations to another type
that does commute, while preserving the original
semantics.
Graph Update operations work over RDF graphs, and
can be expressed in terms of the basic INSERT and
DELETE of ground triples.
RDF graphs are sets of RDF triples, and CRDTs for sets
with insert and delete already exist…
SU-Set: A SPARQL Update CRDT

Same id for all triples inserted
together; uniqueness is preserved
→ less expensive in
communication.

Need to send (triple, id) pairs when
deleting
→ more expensive in
communication.

Better for LLD, as
knowledge grows…
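A minimal sketch of this idea follows. It is not the authors' implementation: the class `SUSetReplica` and its methods are hypothetical, the design borrows the observed-remove idea from set CRDTs, and it assumes each site's operations are delivered in the order they were issued (an insert before the delete that observed it).

```python
import itertools


class SUSetReplica:
    """Illustrative replica storing (triple, id) pairs."""

    def __init__(self, site):
        self.site = site
        self.counter = itertools.count(1)
        self.pairs = set()

    def triples(self):
        """The RDF graph visible to queries: ids are dropped."""
        return {t for (t, _id) in self.pairs}

    def insert(self, triples):
        op_id = (self.site, next(self.counter))  # one id shared by the whole operation
        op = ("add", frozenset((t, op_id) for t in triples))
        self.apply(op)
        return op

    def delete(self, triples):
        wanted = set(triples)
        observed = frozenset(p for p in self.pairs if p[0] in wanted)
        op = ("remove", observed)                # ships (triple, id) pairs
        self.apply(op)
        return op

    def apply(self, op):
        kind, pairs = op
        if kind == "add":
            self.pairs |= pairs
        else:
            self.pairs -= pairs


t = ("a", "b", "c")
s1, s2 = SUSetReplica("s1"), SUSetReplica("s2")
ins1 = s1.insert([t])
del1 = s1.delete([t])
ins2 = s2.insert([t])  # concurrent with Source1's operations

x, y = SUSetReplica("x"), SUSetReplica("y")
for op in (ins1, del1, ins2):
    x.apply(op)
for op in (ins1, ins2, del1):
    y.apply(op)
print(x.triples(), y.triples())
```

Because the delete only removes the pairs it observed, the concurrent insert from the other source survives at both replicas whatever the interleaving.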
SU-Set
Implementation

SU-Set is embedded into Corese, a reference
implementation of RDF stores and SPARQL Update
by the Wimmics team.
Rewrite SPARQL Update into SU-Set, execute and log.
"Follow your change" is implemented with anti-entropy
over logs and version vectors.
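Anti-entropy over logs and version vectors can be sketched as a pull: a site sends its version vector, and the followed site answers with the log entries the vector does not cover. The code below is a hypothetical illustration; the class `Site` and its methods are not Corese's API, and operations are opaque strings.

```python
class Site:
    def __init__(self, name):
        self.name = name
        self.vector = {}   # origin site -> highest counter integrated from it
        self.log = []      # (origin, counter, op) in local integration order
        self.applied = []  # stand-in for executing the operation on the store

    def local_update(self, op):
        counter = self.vector.get(self.name, 0) + 1
        self._integrate(self.name, counter, op)

    def _integrate(self, origin, counter, op):
        self.vector[origin] = counter
        self.log.append((origin, counter, op))
        self.applied.append(op)

    def missing_for(self, vector):
        """Log entries not yet covered by the given version vector."""
        return [(o, c, op) for (o, c, op) in self.log if c > vector.get(o, 0)]

    def follow(self, other):
        """Pull from `other` the changes this site has not integrated yet."""
        for origin, counter, op in other.missing_for(dict(self.vector)):
            if counter > self.vector.get(origin, 0):
                self._integrate(origin, counter, op)


a, b, c = Site("a"), Site("b"), Site("c")
a.local_update("op1")
a.local_update("op2")
b.follow(a)
b.local_update("op3")
c.follow(b)  # c receives a's operations indirectly through b
c.follow(a)  # nothing new: the vector already covers a's log
a.follow(b)
print(a.vector, c.vector)
```

The vector holds one entry per known participant, so its size is bounded by the number of participants in the network.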
How much do we need to pay to have
eventual consistency?

Time overhead:
Adding an id to each element is linear.
Selection and lookup are not affected by many pairs with
the same triple.

Round and number-of-messages overhead:
Convergence after one round, one message per
operation → optimal
Validation – Space Overhead

32 bytes per triple × 1 billion triples =
32 GB → 1 iPod
An id is a pair (UUID1, UUID2): two UUIDs, 16 bytes each
Site identifier (useful for authoring too)
Counter (monotonically increasing)
Semantic stores already use
an internal id → reuse it
Extra pairs produced by
concurrent insertions could
cause problems...
A version vector for each
site: max size = number of
participants (~300-800).
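The 32 GB figure follows from simple arithmetic, reproduced below as a sketch. It assumes decimal gigabytes (10^9 bytes) and one id per stored triple, which is the worst case.

```python
import uuid

UUID_BYTES = len(uuid.uuid4().bytes)  # a UUID is 128 bits, i.e. 16 bytes
ID_BYTES = 2 * UUID_BYTES             # an id is a pair of UUIDs
TRIPLES = 10**9                       # one billion triples

total_gb = ID_BYTES * TRIPLES / 10**9
print(ID_BYTES, total_gb)
```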
Dynamicity: DBPedia Live
Framework to register changes in DBpedia in near
real time.
Generates one file with inserted triples and one with
deleted triples approximately every 10 seconds.
No pattern operations logged → no overhead here.

Many more insertions than deletions →
good for SU-Set.

Many triples inserted per operation →
even better for SU-Set.
SU-Set Communication overhead
on DBPedia Live
7 days of streaming
No concurrent insertions
Id size [1]: 0.096 Kb
Avg triple size [1]: 0.155 Kb

Operation   # of operations   # of triples   Size (MB) without ids   1 id per triple   1 id per operation
Inserts     21957             21762190       3294.08                 5334.29           3296.6
Deletes     21957             1755888        265.78                  164.61            430.4
Overhead                                                             54.47%            4.68%

[1] Measured with 'wc -c' over DBPedia Live log files
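The overhead percentages can be approximately recomputed from the reported counts and sizes. The sketch below rests on assumptions that are not stated on the slide but are consistent with its percentages: Kb are converted to MB by dividing by 1024; with one id per triple, a delete ships ids only; with one id per operation, a delete ships (triple, id) pairs.

```python
ID_KB, TRIPLE_KB = 0.096, 0.155
INSERT_OPS = 21957
INSERTED, DELETED = 21762190, 1755888


def mb(kb):
    return kb / 1024


# Baseline: triples only, no ids.
baseline = mb((INSERTED + DELETED) * TRIPLE_KB)

# One id per triple: inserts carry triple + id, deletes carry ids only.
per_triple = mb(INSERTED * (TRIPLE_KB + ID_KB) + DELETED * ID_KB)

# One id per operation: inserts carry one id per operation,
# deletes carry (triple, id) pairs.
per_operation = mb(INSERTED * TRIPLE_KB + INSERT_OPS * ID_KB
                   + DELETED * (TRIPLE_KB + ID_KB))


def overhead(total):
    return 100 * (total / baseline - 1)


print(round(overhead(per_triple), 2), round(overhead(per_operation), 2))
```

Deletes cost more with one id per operation, but inserts dominate the stream, so the total overhead drops by an order of magnitude.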
Conclusion

Live Linked Data makes Linked Data "writable" and
allows a new query paradigm.
SU-Set is a CRDT for RDF graphs updated with
SPARQL 1.1 Update that ensures eventual consistency
(convergence + intentions) on Live Linked Data at a
reasonable cost.
Future Work

Finish and release LLD-Corese.
Write the composed CRDT for RDF-Datasets
What other benefits does LLD have for:
Querying (SyncAndSearch vs Federations, connection
with LAV/GUN, with Adaptivity…?)
Discovery…?
Community Detection…?
Provenance…?
Traces

 
Human factor in big data qrowd bdve
Human factor in big data qrowd bdveHuman factor in big data qrowd bdve
Human factor in big data qrowd bdveLuis Daniel Ibáñez
 
Qrowd human factorsession-bdv-forum-2018
Qrowd human factorsession-bdv-forum-2018Qrowd human factorsession-bdv-forum-2018
Qrowd human factorsession-bdv-forum-2018Luis Daniel Ibáñez
 
Qrowd transport session-bdv-forum-2018
Qrowd transport session-bdv-forum-2018Qrowd transport session-bdv-forum-2018
Qrowd transport session-bdv-forum-2018Luis Daniel Ibáñez
 
Smart modalsplit smarmobilitycongress_2018
Smart modalsplit smarmobilitycongress_2018Smart modalsplit smarmobilitycongress_2018
Smart modalsplit smarmobilitycongress_2018Luis Daniel Ibáñez
 

Plus de Luis Daniel Ibáñez (13)

Qrowd data sharing ebdvf 2017
Qrowd data sharing ebdvf 2017Qrowd data sharing ebdvf 2017
Qrowd data sharing ebdvf 2017
 
Policy workshop ebdvf2019
Policy workshop ebdvf2019Policy workshop ebdvf2019
Policy workshop ebdvf2019
 
Qrowd and thecity_connectedsmartcities_2019
Qrowd and thecity_connectedsmartcities_2019Qrowd and thecity_connectedsmartcities_2019
Qrowd and thecity_connectedsmartcities_2019
 
Simplifying maps tomtomgeomonday_2019
Simplifying maps tomtomgeomonday_2019Simplifying maps tomtomgeomonday_2019
Simplifying maps tomtomgeomonday_2019
 
Qrowd ppp week-2018
Qrowd ppp week-2018Qrowd ppp week-2018
Qrowd ppp week-2018
 
Hybrid workflows aw4city-webconf-2019
Hybrid workflows aw4city-webconf-2019Hybrid workflows aw4city-webconf-2019
Hybrid workflows aw4city-webconf-2019
 
Inclusive cities thecityofthefuture_cardiff_2018
Inclusive cities thecityofthefuture_cardiff_2018Inclusive cities thecityofthefuture_cardiff_2018
Inclusive cities thecityofthefuture_cardiff_2018
 
Human factorsession introduction-bdv-forum-sofia
Human factorsession introduction-bdv-forum-sofiaHuman factorsession introduction-bdv-forum-sofia
Human factorsession introduction-bdv-forum-sofia
 
Human factor in big data qrowd bdve
Human factor in big data qrowd bdveHuman factor in big data qrowd bdve
Human factor in big data qrowd bdve
 
Qrowd human factorsession-bdv-forum-2018
Qrowd human factorsession-bdv-forum-2018Qrowd human factorsession-bdv-forum-2018
Qrowd human factorsession-bdv-forum-2018
 
Qrowd transport session-bdv-forum-2018
Qrowd transport session-bdv-forum-2018Qrowd transport session-bdv-forum-2018
Qrowd transport session-bdv-forum-2018
 
Qrowd overview-solothurn-2018
Qrowd overview-solothurn-2018Qrowd overview-solothurn-2018
Qrowd overview-solothurn-2018
 
Smart modalsplit smarmobilitycongress_2018
Smart modalsplit smarmobilitycongress_2018Smart modalsplit smarmobilitycongress_2018
Smart modalsplit smarmobilitycongress_2018
 

Dernier

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 

Dernier (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 

LiveLinkedData - TransWebData - Nantes 2013

  • 1. Live Linked Data Making Linked Data writable with eventual consistency guarantees Luis-Daniel Ibáñez, Nantes University, GDD team Luis-Daniel Ibáñez, Hala Skaf-Molli, Pascal Molli, and Olivier Corby. Synchronizing semantic stores with commutative replicated data types. In Semantic Web Collaborative Spaces (SWCS) @ WWW’12 Luis-Daniel Ibáñez, Hala Skaf-Molli, Pascal Molli, and Olivier Corby. Live Linked Data: Synchronizing semantic stores with Commutative Replicated Data Types. Special Issue on Resource Discovery, IJMSO (to appear)
  • 2. Linked Data: Autonomous participants agree on a set of principles for publishing RDF data, allowing its querying and browsing across distributed servers. In 2011, more than 30 billion RDF triples were distributed among ~700 autonomous participants [1]. [1] Billion Triple Challenge + Linked Open Data Metadata; T. Käfer et al., Towards a Dynamic Linked Data Observatory, LDOW@WWW12
  • 3. LOD - 2007 October
  • 4. LOD - 2011 September
  • 5. Here is a list of links created by LinkedMDB that relate the movie "The Shining" to other open datasets: owl:SameAs to "The_Shining_(film)" on DBpedia; owl:SameAs to "The_Shining_(film)" on YAGO; dbpedia:hasPhotoCollection to the movie's photos on flickr™ wrappr; movie:relatedBook to the movie's related book on RDF Book Mashup; owl:SameAs from one of the music composers of the movie, Béla Bartók, to MusicBrainz: http://zitgist.com/music/artist/fd14da1b-3c2d-4cc8-9ca6fc8c62ce6988 ; rdfs:SeeAlso from one of the writers of the movie, Stanley Kubrick, to RDF Book Mashup: http://www4.wiwiss.fu-berlin.de/bookmashup/persons/Stanley+Kubrick ; foaf:based_near to the locations of the movie in the US and GB: http://sws.geonames.org/2635167/ and http://sws.geonames.org/6252001/ ; movie:language to the language of the movie on lingvoj.org: http://www.lingvoj.org/lingvo/en . Source: http://wiki.linkedmdb.org/Main/Interlinking
  • 6. Querying Linked Data: Freshness vs. Performance. [Figure: "My Super Cool Query Answerer about Venezuela" querying Dbpedia English, Dbpedia Spanish, Free Base and Wolfram.] Local copying → not fresh. Distributed querying → performance and reliability.
  • 7. Querying Linked Data: Data Quality. What is the area and population of Girardot Municipality? Dbpedia English: 302 km2, 439,947 inhabitants. Dbpedia Spanish: 311 km2, 407,577 inhabitants. Free Base: unknown area, 439,947 inhabitants. Wolfram: 302 km2, 396,095 inhabitants. Open Data Venezuela: 314 km2, 407,109 inhabitants. All different =S All wrong!!!
  • 8. Updating Linked Data. Request to Dbpedia English, Dbpedia Spanish, Free Base and Wolfram: "Update Girardot municipality's data to the official values" (Open Data Venezuela: 314 km2, 407,109 inhabitants). Answer: "What do you mean?" Editing others' datasets is generally not possible. How to make Linked Data writable? How to move from Linked Data 1.0 to Linked Data 2.0?
  • 9. Live Linked Data Approach (1): SPARQL 1.1 Update allows updating RDF data locally. Update feeds provide streams of updates (e.g. DBPedia Live) → allows data freshness. Processing is local. The writing is “indirect”: I can update others' datasets only if they decide to consume my stream.
  • 10. Live Linked Data. “Follow your change”: Would you follow my changes? They are official… ;) Great! SPARQL Update stream consumption → freshness. Local processing → performance. SPARQL Update stream publishing → write enabling. Figure text, in slide order: DELETE { Dbpediaen/Girardot area ?a1 . Dbpediaen/Girardot popul ?p1 . ?ent area ?a2 . ?ent popul ?p2 . WHERE { ?ent sameAs Dbpediaen/Girardot }} ; INSERT DATA { ODVenezuela/Girardot area 314 ODVenezuela/Girardot popul 407109 ODVenezuela/Girardot sameAs Dbpediaen/Girardot ODVenezuela/Girardot sameAs Freebase/Girardot Dbpediaen/Girardot area 302 Dbpediaen/Girardot popul 439947 … } (label: Dbpedia English). Dbpedia Spanish: Dbpediaes/Girardot area 311 ; Dbpediaes/Girardot popul 407577. Free Base: Freebase/Girardot area 308 ; Freebase/Girardot area “null” ; Freebase/Girardot popul 439947 ; ODVenezuela/Girardot sameAs Dbpediaen/Girardot ; ODVenezuela/Girardot sameAs Freebase/Girardot … Dbpediaen/Girardot sameAs Dbpediaes/Girardot ; Dbpediaen/Girardot sameAs Freebase/Girardot … Wolfram: Wolfram/Girardot area 302 ; Wolfram/Girardot popul 396095. Open Data Venezuela: ODVenezuela/Girardot area 314 ; ODVenezuela/Girardot popul 407109.
  • 11. Live Linked Data complements LOD Cloud Follow your changes
  • 12. Live Linked Data Overview: A social network of linked data participants based on a “follow your change” relation. Makes Linked Data a consistent “read/write” space: from Linked Data 1.0 to Linked Data 2.0. Creates assemblies of datasets and enables a “synchronize and search” paradigm, between the warehousing approach and the distributed queries approach.
  • 13. The Forgotten Issue: Consistency. Synchronizing semantic sources is not new (RDFSync, sparqlPuSH (PubSubHubbub), ResourceSync…), but none of them addresses consistency when concurrent operations or out-of-order network delivery occur. Example with Source1 and Source2 and the operations INSERT DATA {a b c}, DELETE DATA {a b c}, INSERT DATA {a b c}: the two sources end in different states, because the output depends on the order of reception.
  • 14. Keeping Consistent: Consistency = Convergence + Intention preservation. Strong consistency: requires blocking; expensive. Eventual consistency: apply operations without blocking and, if a conflict is detected, resolve it; this can be very expensive as the network grows. If the type we are dealing with is conflict-free, there is no need for conflict detection and resolution.
  • 15. Conflict-Free Replicated Data Types [1]: An abstract data type where the application of all operations over any state commutes, S ∘ op_i ∘ op_j = S ∘ op_j ∘ op_i ⇒ convergence. Caveat 1: Intention preservation should be checked separately. Caveat 2: It should be useful, by itself or by corresponding to the concurrent semantics of a useful type (set, graph…). Caveat 2.1: It should be affordable. CRDTs enable massive optimistic (eventual) replication. [1] M. Shapiro, N. Preguiça, C. Baquero, M. Zawirski. Conflict-free Replicated Data Types. SSS 2011.
  • 16. Enabling LLD with CRDTs: Construct a CRDT for RDF graph stores with SPARQL 1.1 Update operations, with minimal overhead under Live Linked Data conditions: an unknown but steadily growing number of autonomous participants; no central control (coordination, primary replicas, small core of datacenters); the “follow your change” relation induces the network. Dynamicity parameters: degree of change / change frequency → varies between producers; growth rate → positive, knowledge grows.
  • 17. RDF Datasets: Assume there are pairwise disjoint infinite sets I, B, and L (IRIs, blank nodes, and literals, respectively). A triple (s,p,o) ∈ (I∪B)×I×(I∪B∪L) is called an RDF triple. In this tuple, s is the subject, p the predicate, and o the object [1]. An RDF graph is a set of RDF triples [1]. An RDF dataset is a set { G, (<u1>, G1), (<u2>, G2), . . . (<un>, Gn) } where G and each Gi are graphs, and each <ui> is an IRI. Each <ui> is distinct. G is called the “default graph” and the pairs (<ui>, Gi) are called “named graphs” [2]. [1] J. Pérez, M. Arenas and C. Gutiérrez, Semantics and Complexity of SPARQL, ACM Transactions on DB Systems, 2009. [2] http://www.w3.org/TR/sparql11-query/#sparqlDataset
  • 18. SPARQL Update Queries (Syntax: Formal Operation)
      INSERT DATA QuadData: Dataset-UNION(GS, Dataset(QuadData,{},GS,GS))
      DELETE DATA QuadData: Dataset-DIFF(GS, Dataset(QuadData,{},GS,GS))
      DELETE QuadPatternDEL INSERT QuadPatternINS WHERE GroupGraphPattern: Dataset-UNION(Dataset-DIFF(GS, Dataset(QuadPatternDEL,GroupGraphPattern,DS,GS)), Dataset(QuadPatternINS,GroupGraphPattern,DS,GS))
      DELETE QuadPatternDEL WHERE GroupGraphPattern: Dataset-UNION(Dataset-DIFF(GS, Dataset(QuadPatternDEL,GroupGraphPattern,DS,GS)), Dataset({},GroupGraphPattern,DS,GS))
      INSERT QuadPatternINS UsingClause* WHERE GroupGraphPattern: Dataset-UNION(Dataset-DIFF(GS, Dataset({},GroupGraphPattern,DS,GS)), Dataset(QuadPatternINS,GroupGraphPattern,DS,GS))
      LOAD (SILENT)? IRIref: Dataset-UNION(GS, { graph(documentIRI) })
      CLEAR (SILENT)? DEFAULT: {{}} union {(iri_i, G_i) | 1 ≤ i ≤ n}
      CREATE (SILENT)? IRIref: GS union {(iri, {})} if iri not in graphNames(GS); otherwise, OpCreate(GS, iri) = GS
      DROP (SILENT)? IRIref: GS if iri not in graphNames(GS); otherwise, OpDrop(GS, iri_j) = {DG} union {(iri_i, G_i) | i ≠ j and 1 ≤ i ≤ n}
      Source: http://www.w3.org/TR/sparql11-update/#formalModel
  • 19. Insert Ground Triples
      Update: PREFIX dc: <http://purl.org/dc/elements/1.1/> PREFIX ns: <http://example.org/ns#> INSERT DATA { GRAPH <http://example/bookStore> { <http://example/book1> ns:price 42 . <http://example/book1> dc:creator "A.N.Other" . } }
      Data before: # Graph: http://example/bookStore @prefix dc: <http://purl.org/dc/elements/1.1/> . <http://example/book1> dc:title "Fundamentals of Compiler Design" .
      Data after: # Graph: http://example/bookStore @prefix dc: <http://purl.org/dc/elements/1.1/> . @prefix ns: <http://example.org/ns#> . <http://example/book1> dc:title "Fundamentals of Compiler Design" . <http://example/book1> ns:price 42 . <http://example/book1> dc:creator "A.N.Other" .
      Source: http://www.w3.org/TR/sparql11-update/
  • 20. Delete-Insert
      Update: PREFIX foaf: <http://xmlns.com/foaf/0.1/> WITH <http://example/addresses> DELETE { ?person foaf:givenName 'Bill' } INSERT { ?person foaf:givenName 'William' } WHERE { ?person foaf:givenName 'Bill' }
      Data before: # Graph: http://example/addresses @prefix foaf: <http://xmlns.com/foaf/0.1/> . <http://example/president25> foaf:givenName "Bill" . <http://example/president25> foaf:familyName "McKinley" . <http://example/president27> foaf:givenName "Bill" . <http://example/president27> foaf:familyName "Taft" . <http://example/president42> foaf:givenName "Bill" . <http://example/president42> foaf:familyName "Clinton" .
      Data after: # Graph: http://example/addresses @prefix foaf: <http://xmlns.com/foaf/0.1/> . <http://example/president25> foaf:givenName "William" . <http://example/president25> foaf:familyName "McKinley" . <http://example/president27> foaf:givenName "William" . <http://example/president27> foaf:familyName "Taft" . <http://example/president42> foaf:givenName "William" . <http://example/president42> foaf:familyName "Clinton" .
  • 21. Enabling Intentions Preservation in LLD: The observed effect of SPARQL update queries at generation time should be preserved at re-execution time… If a SPARQL query inserts a triple at generation time, this triple will be added at all sites, whatever the state of the store at re-execution time… Intentions can be impossible to preserve (nondeterministic operations) or impossible without more order constraints (re-execution time evaluation). Examples: PREFIX foaf: <http://xmlns.com/foaf/0.1/> WITH <http://example/addresses> DELETE { ?person foaf:givenName 'Bill' } INSERT { ?person foaf:givenName 'William' } WHERE { ?person foaf:givenName 'Bill' } and PREFIX dc: <http://purl.org/dc/elements/1.1/> INSERT DATA { _:book1 dc:title "A new book" ; dc:creator "A.N.Other" . }
  • 22. A CRDT for SPARQL Update: Rewrite SPARQL Update operations to another type that does commute, while preserving the original semantics. Graph update operations work over RDF graphs, and can be expressed in terms of the basic INSERT and DELETE of ground triples. RDF graphs are sets of RDF triples, and CRDTs for sets with insert and delete already exist…
  • 23. SU-Set: A SPARQL-Update CRDT. Same id for all triples inserted together, uniqueness is preserved → less expensive in communication. Need to send (triple, id) when deleting → more expensive in communication. Better for LLD, as knowledge grows…
  • 25. Implementation: SU-Set is embedded into Corese, a reference implementation of RDF stores and SPARQL Update by the Wimmics team. SPARQL Update is rewritten into SU-Set operations, executed and logged. “Follow your change” is implemented with anti-entropy over logs and version vectors.
  • 26. How much do we need to pay for eventual consistency? Time overhead: adding an id to each element is linear; selection and lookup are not affected by many pairs with the same triple. Round and number-of-messages overhead: convergence after one round, one message per operation → optimal.
  • 27. Validation: Space Overhead. 32 bytes per triple, for 1 billion triples = 32 GB → 1 iPod. The id is a pair (UUID1, UUID2), two UUIDs of 16 bytes each: a counter (monotonically increasing) and a site identifier (useful for authoring too). Semantic stores already use an internal id → reuse it. Extra pairs produced by concurrent insertions could cause problems... A version vector for each site: max size = number of participants (~300-800).
  • 28. Dynamicity: DBPedia Live. Framework to register changes in DBPedia in near real time. Generates one file with inserted triples and one with deleted triples approximately every 10 seconds. No pattern operations logged → no overhead here. Many more insertions than deletions → good for SU-Set. Many triples inserted per operation → even better for SU-Set.
  • 29. SU-Set Communication overhead on DBPedia Live. 7 days of streaming, no concurrent insertions. Id size [1]: 0,096 Kb. Avg triple size [1]: 0,155 Kb.
      Operation | # of Triples | Size (MB) without ids | Size (MB) 1 id per triple | Size (MB) 1 id per operation
      21957 Inserts | 21762190 | 3294,08 | 5334,29 | 3296,6
      21957 Deletes | 1755888 | 265,78 | 164,61 | 430,4
      Overhead | | | 54,47% | 4,68%
      [1] Measured with ‘wc -c’ over DBPedia Live log files
  • 30. Conclusion: Live Linked Data makes linked data “writable” and allows a new query paradigm. SU-Set is a CRDT for RDF graphs updated with SPARQL 1.1 Update that ensures eventual consistency (convergence + intentions) on Live Linked Data at a reasonable cost.
  • 31. Future Work: Finish and release LLD-Corese. Write the composed CRDT for RDF datasets. What other benefits does LLD have for: Querying (SyncAndSearch vs Federations, connection with LAV/GUN, with Adaptivity…?) Discovery…? Community Detection…? Provenance…? Traces
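
The communication figures on slide 29 can be re-derived with a little arithmetic. The sketch below is only a plausibility check under stated assumptions: 1 MB is taken as 1024 Kb, the "1 id per triple" insert size is read as the base size plus one id per inserted triple, the 164,61 MB delete figure is read as sending one id per deleted triple, and the "1 id per operation" delete size is read as the triples plus one id each. These readings are inferred from the numbers and are not stated on the slide; all variable names are hypothetical.

```python
# Figures copied from slide 29 (decimal commas written as points).
ID_SIZE_KB = 0.096
INSERTED_TRIPLES = 21_762_190
DELETED_TRIPLES = 1_755_888
OPERATIONS = 21_957
INSERTS_NO_ID_MB = 3294.08
DELETES_NO_ID_MB = 265.78

KB_PER_MB = 1024  # assumption: binary megabytes


def ids_mb(count):
    """Size in MB of `count` identifiers of ID_SIZE_KB each."""
    return count * ID_SIZE_KB / KB_PER_MB


baseline_mb = INSERTS_NO_ID_MB + DELETES_NO_ID_MB

# One id per triple: inserts carry an id per triple, deletes send ids only.
inserts_id_per_triple_mb = INSERTS_NO_ID_MB + ids_mb(INSERTED_TRIPLES)
deletes_ids_only_mb = ids_mb(DELETED_TRIPLES)
overhead_per_triple = (
    inserts_id_per_triple_mb + deletes_ids_only_mb - baseline_mb
) / baseline_mb

# One id per operation: inserts carry a single id, deletes send (triple, id).
inserts_id_per_op_mb = INSERTS_NO_ID_MB + ids_mb(OPERATIONS)
deletes_pairs_mb = DELETES_NO_ID_MB + ids_mb(DELETED_TRIPLES)
overhead_per_op = (
    inserts_id_per_op_mb + deletes_pairs_mb - baseline_mb
) / baseline_mb

print(f"inserts, 1 id per triple: {inserts_id_per_triple_mb:.2f} MB")
print(f"deletes, ids only:        {deletes_ids_only_mb:.2f} MB")
print(f"overhead, 1 id per triple:    {overhead_per_triple:.2%}")
print(f"overhead, 1 id per operation: {overhead_per_op:.2%}")
```

Under these assumptions the two overhead percentages come out close to the values reported on the slide, which suggests that the saving comes almost entirely from inserts: many triples share one id, while each deleted triple still has to travel with its id.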

Editor's notes

  1. Moreover, the Linked Data cloud has been growing steadily; for example, here we see the 2007 picture of the Linked Open Data cloud.
  2. To this big picture in 2011.
  3. The main rule of Linked Data is that your data needs to reference, or “link” to, existing knowledge in other sources; for example, LinkedMDB relates its entity for the movie “The Shining” to DBPedia, Yago or GeoNames.
  4. Let’s see this applied to our previous example: now the connections are streams that I subscribe to, and from which I pull the information I’m interested in (like this…). Notice that this solves the freshness issue, since whenever someone updates their data I can react immediately (example given…). It also eases the performance issue, as I do local processing. As in the other approaches, I can solve the data curation issue by incorporating a source and editing it locally (example given…). This curation also generates an operation stream that I publish and advertise to others; people will start to follow my changes, which also solves the write-enabling issue.
  5. So, our expectation is that other data publishers will also start to open their updates, favoring the creation of mashups by means of the “Follow your change” relation and creating another network on top of the current Linked Data cloud. Of course, these new mashups are not static: they can perform value-adding operations and, if this value is high enough, the original publisher can also start following these changes and, voilà, we have the writing that we want.
  6. Before jumping into that, we need to review what RDF is and what its operations are.
  7. SPARQL operations have very well defined formal semantics: one set of operations to operate on graphs (all here but the last two) and another to operate on datasets.
  8. Read and then… For example, pattern operations are state dependent, so I cannot preserve their intention; same thing with blank nodes, which are like existential variables, and the SPARQL specification leaves it to each store how to name them.
  9. To implement the pattern operations, we chose to calculate the affected triples on the local state, and send those to the other peers.
  10. Read…
  11. The main interest of Linked Data is to perform queries on multiple datasets. To do that, and assuming we have already discovered the relevant datasets, the first way to proceed is to make a local copy of them and run the query on it. For example, I can ask for the largest city of Vicente Fernández's country by asking MusicBrainz for the country and Wikipedia for the largest city; but if the original datasets get modified, I lose freshness.
  12. On the other hand, instead of bringing the data to me, I can go to it, running the query in a distributed fashion. The drawback is that I depend on the reliability and performance of the endpoints: if one is down, my query will fail.
  13. So, how does this look on real Live Linked Data: We already mentioned that DBPedia has started publishing its updates: so, th
  14. So an issue that we can raise here is: what happens when I find a mistake in a dataset that is not mine, which I don’t know how to modify or don’t have the rights to modify? Put in a more general way, how do I make this Linked Data network “writable”?
  15. So, we know from CRDT theory that if all operations commute, the object is conflict-free… but, bas
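
The commutativity requirement from the notes, together with the (triple, id) idea of slide 23, can be illustrated with a small simulation. This is a minimal sketch in the spirit of an observed-remove set, not the Corese implementation: the class `SUSetReplica` and its methods are hypothetical names, ids are random UUIDs rather than (site, counter) pairs, and causal delivery is assumed (a delete is never applied before the insert it observed). It replays one concurrent delete and insert of the same triple in both orders, once on a plain set and once on tagged pairs.

```python
import itertools
import uuid


class SUSetReplica:
    """Toy replica that stores (triple, id) pairs instead of bare triples."""

    def __init__(self):
        self.pairs = set()

    def prepare_insert(self, triples):
        # One fresh id shared by all triples inserted by the same operation.
        op_id = uuid.uuid4().hex
        return ("insert", frozenset((t, op_id) for t in triples))

    def prepare_delete(self, triples):
        # Computed on the local state: only pairs observed here are removed.
        wanted = set(triples)
        return ("delete", frozenset(p for p in self.pairs if p[0] in wanted))

    def apply(self, op):
        kind, pairs = op
        if kind == "insert":
            self.pairs |= pairs
        else:
            self.pairs -= pairs

    def triples(self):
        # A triple is visible while at least one of its pairs remains.
        return {t for t, _ in self.pairs}


T = ("a", "b", "c")

# Both sites start from the same insert of T.
site1, site2 = SUSetReplica(), SUSetReplica()
ins1 = site1.prepare_insert([T])
site1.apply(ins1)
site2.apply(ins1)

# Concurrent operations: site1 deletes T while site2 inserts it again.
del1 = site1.prepare_delete([T])
ins2 = site2.prepare_insert([T])


def replay_su_set(ops):
    replica = SUSetReplica()
    replica.apply(ins1)
    for op in ops:
        replica.apply(op)
    return frozenset(replica.pairs)


def replay_plain_set(kinds):
    state = {T}
    for kind in kinds:
        if kind == "insert":
            state.add(T)
        else:
            state.discard(T)
    return frozenset(state)


su_set_outcomes = {
    replay_su_set(order) for order in itertools.permutations([del1, ins2])
}
plain_outcomes = {
    replay_plain_set(order)
    for order in itertools.permutations(["delete", "insert"])
}

print("distinct final states, plain set:", len(plain_outcomes))
print("distinct final states, tagged pairs:", len(su_set_outcomes))
```

With the plain set the two delivery orders end in different states, while with tagged pairs both orders leave only the pair created by the concurrent insert, so the replicas converge and the triple stays visible.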