1. PlanetData: Consuming Structured Data at Web Scale
Elena Simperl, Barry Norton, Karlsruhe Institute of Technology
1st International Symposium on Data-driven Process Discovery and Analysis
June 30, 2011, Campione d’Italia, Italy
2. PlanetData's Aim and Objectives
• Aim: establish an interdisciplinary, sustainable European community on large-scale data management
◦ Purposeful data exposure
◦ Novel and improved applications
[Figure: overlapping research areas. Labels as extracted: Databases, Data and Semantics, Web Mining]
• Objectives
◦ Addressing challenges through integrated research
◦ Data and technology provisioning through PlanetData Lab
◦ Impact through training, dissemination, standardization
and networking
◦ Openness and flexibility through PlanetData Programs
3. Work Plan Highlights
◦ Methods and techniques to publish, access and manage stream-like data
◦ Quality assessment of interlinked data sets, including best practices for the representation and usage of spatio-temporal information
◦ Provenance and access control framework for Linked (Stream) Data
◦ Data sets and vocabularies, including best practices for publishing and managing self-descriptive data
◦ Linked Services and Processes as an instrument to develop applications
◦ Yearly summer school co-located with the Extended Semantic Web Conference
◦ Semantic Web video journal
◦ PlanetData Programs
8. Linked Data Cloud
Taken together, Linked Data is said to form a ‘cloud’ of shared references and vocabularies (growing on a weekly basis)
9. Linked Data Principles
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up
those names.
3. When someone looks up a URI, provide useful
information, using the standards (RDF,
SPARQL)
4. Include links to other URIs, so that they can
discover more things.
Bring together semantic technologies and the
Web architecture
Applied to other types of data as well: stream-like, multimedia…
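The principles above can be illustrated with a minimal Python sketch. It assumes a tiny in-memory stand-in for the Web: the `WEB` dictionary, the `example.org` URIs and the helper names `look_up` and `discover` are all hypothetical and not part of the original slides. Looking up a URI returns triples, and links to other URIs are followed to discover more things.

```python
from collections import deque

# Hypothetical stand-in for the Web: looking up a URI returns
# (subject, predicate, object) triples. All URIs are illustrative.
WEB = {
    "http://example.org/alice": [
        ("http://example.org/alice", "foaf:name", '"Alice"'),
        ("http://example.org/alice", "foaf:knows", "http://example.org/bob"),
    ],
    "http://example.org/bob": [
        ("http://example.org/bob", "foaf:name", '"Bob"'),
        ("http://example.org/bob", "foaf:based_near", "http://example.org/karlsruhe"),
    ],
    "http://example.org/karlsruhe": [
        ("http://example.org/karlsruhe", "rdfs:label", '"Karlsruhe"'),
    ],
}


def look_up(uri):
    """Principles 2 and 3: a URI can be looked up and yields useful data."""
    return WEB.get(uri, [])


def discover(start_uri, max_hops=2):
    """Principle 4: follow links to other URIs, breadth-first, up to max_hops."""
    seen = {start_uri}
    queue = deque([(start_uri, 0)])
    triples = []
    while queue:
        uri, hops = queue.popleft()
        for s, p, o in look_up(uri):
            triples.append((s, p, o))
            # Objects that are HTTP URIs are links that can be followed.
            if o.startswith("http://") and o not in seen and hops < max_hops:
                seen.add(o)
                queue.append((o, hops + 1))
    return triples


found = discover("http://example.org/alice")
```

Starting from a single URI, the client ends up with data from three documents without knowing about the second and third in advance; a real client would replace `look_up` with an HTTP GET that asks for an RDF representation.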
11. Services Over Linked Data
A problem can be seen in the current Linked Data sphere when it comes to services/APIs/functionalities:
◦ The Linked Data standards are then often not used
◦ The results of service interaction do not contribute to the Linked Data cloud
◦ Developers have to work with heterogeneous representations
[Figure label: RDF]
12. RDF Services at the BBC
This is not a problem of scale, efficiency or speed
◦ RDF-based communication efficiently realised using memcached
◦ Real-time updates to a large (ferocious) audience
04.08.2010
13. Linked Open Services
Aim to promote services over Linked Data, bringing together:
• RESTful services (respecting Web architecture)
◦ Resource-oriented
◦ Manipulated with HTTP verbs: GET, PUT (and PATCH), POST, DELETE
◦ Negotiate representations
• Linked Data
◦ Uniform use of URIs
◦ Use of RDF and SPARQL
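Negotiating representations can be sketched in a few lines of Python. This is a simplified illustration, not a full implementation of HTTP content negotiation: it ignores the specificity rules for media ranges, and the `SUPPORTED` list and the function names are assumptions made for the example.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality value when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


# Representations the hypothetical service can produce, in server preference order.
SUPPORTED = ["text/turtle", "application/rdf+xml", "application/json"]


def negotiate(accept_header, supported=SUPPORTED):
    """Return the media type to serve, or None (which would map to HTTP 406)."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type in supported:
            return media_type
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "text/"
            for candidate in supported:
                if candidate.startswith(prefix):
                    return candidate
    return None


chosen = negotiate("application/rdf+xml;q=0.8, text/turtle")
```

The same resource URI can thus serve Turtle to a Linked Data client and JSON to a conventional Web client, which is how a RESTful service and Linked Data can share one set of URIs.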
14. Linked Services: Principles
Concretely, Linked Open Services come with a
set of guiding principles:
1. Describe services as LOD (Linked Open Data) prosumers,
with input and output descriptions as SPARQL graph
patterns
2. Communicate RDF by RESTful content negotiation
3. Communicate and describe the knowledge
contribution resulting from service interaction,
including implicit knowledge relating input, output and
service provider
Associated with the last principle is an optional
fourth:
4. When wrapping non-LOS services, extend the (lifted,
if non-RDF) message to make explicit the implicit
knowledge, and to use Linked Data vocabularies, using
SPARQL CONSTRUCT queries
http://www.linkedopenservices.org/blog/?page_id=2
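The optional fourth principle can be illustrated with a minimal sketch. The slides prescribe SPARQL CONSTRUCT queries for this step; plain Python is used here only as a stand-in, and the wrapped geocoding service, its JSON reply and the `example.org` URI are hypothetical. The property names come from the W3C Basic Geo vocabulary.

```python
import json

GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"  # W3C Basic Geo vocabulary


def lift(input_uri, json_body):
    """Lift a non-RDF (JSON) reply to triples about the input resource.

    The JSON alone only says "lat" and "lng"; that these values describe
    the resource the client asked about is implicit knowledge, made
    explicit here by using input_uri as the subject.
    """
    reply = json.loads(json_body)
    return [
        (input_uri, GEO + "lat", str(reply["lat"])),
        (input_uri, GEO + "long", str(reply["lng"])),
    ]


def to_ntriples(triples):
    """Serialise (uri, uri, plain literal) triples as N-Triples lines.

    Literal escaping is omitted for brevity.
    """
    return "\n".join('<%s> <%s> "%s" .' % t for t in triples)


# Hypothetical reply from a wrapped, non-LOS geocoding service.
body = '{"lat": 49.01, "lng": 8.4}'
triples = lift("http://example.org/place/karlsruhe", body)
print(to_ntriples(triples))
```

Because the lifted message names the input resource and uses a shared vocabulary, the result of the service interaction can contribute to the Linked Data cloud rather than staying in a service-specific format.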
16. Linked Processes: Principles
To compose Linked Services we do not prescribe a
particular composition style, except that RDF
must be stored and forwarded
Principles:
◦ Decide control flow conditions based on SPARQL
ASK queries
◦ Base iteration on SPARQL SELECT queries
◦ Define dataflow/mediation based on SPARQL
CONSTRUCT queries
In this way compositions, ‘mash-ups’, etc.,
also use the languages/technologies most
familiar to the Linked Data community
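The three roles of SPARQL in a Linked Process can be sketched with a toy triple-pattern matcher in Python. This is a stand-in under stated assumptions, not a SPARQL engine: it handles only basic graph patterns over an in-memory set of triples, and the `ex:` resources and the `ex:needsResponse` property are hypothetical.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern equals triple, or return None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def select(graph, patterns):
    """SELECT stand-in: all bindings that satisfy every triple pattern."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            for extended in [match_pattern(pattern, triple, binding)]
            if extended is not None
        ]
    return bindings


def ask(graph, patterns):
    """ASK stand-in: does at least one binding exist?"""
    return bool(select(graph, patterns))


def construct(template, bindings):
    """CONSTRUCT stand-in: instantiate the template once per binding."""
    return {
        tuple(b.get(term, term) for term in pattern)
        for b in bindings
        for pattern in template
    }


graph = {
    ("ex:post1", "sioc:content", "Terrible service!"),
    ("ex:post2", "sioc:content", "Great service!"),
    ("ex:e1", "heo:hasManifestationInMedia", "ex:post1"),
    ("ex:e1", "heo:hasCategory", "heo:anger"),
}
angry = [("?e", "heo:hasManifestationInMedia", "?p"),
         ("?e", "heo:hasCategory", "heo:anger")]

flagged = set()
if ask(graph, angry):                    # control flow condition
    for row in select(graph, angry):     # iteration
        # dataflow/mediation: build the RDF forwarded to the next step
        flagged |= construct([("?p", "ex:needsResponse", "true")], [row])
```

The process logic itself stays in ordinary control structures; only the conditions, the iteration source and the data mediation are expressed as graph patterns, which is the point of the three principles.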
17. LOP Media Monitoring Process
A Social Media Manager is required to monitor
(micro)blogging sites and respond to negative comments:
10.08.2011
18. Composition Service 1
A service may monitor the ‘Twittersphere’ for tweets with a
given tag
Harvest
Input: {?t a sioc_t:Tag; rdfs:label ?l}
Output: {?p a sioc_t:MicroblogPost;
sioc:topic ?t;
sioc:has_creator ?m;
sioc:content ?c .
OPTIONAL {?p sioc:addressed_to ?a}}
19. Composition Service 2
A sentiment analysis service may annotate (micro)blog posts
according to, e.g., the Human Emotion Ontology
AnalyseSentiment
Input: {?p a sioc:Post; sioc:content ?c}
Output: {?e a heo:Emotion;
heo:hasManifestationInMedia ?p;
heo:hasCategory ?cat}
20. Composition Service 3
A human service selects among possible combinations of
these and optionally raises a response
ManageMicroblog
Input: {?p a sioc_t:MicroblogPost;
sioc:has_creator ?m.
?e heo:hasManifestationInMedia ?p.
{{?e heo:hasCategory heo:anger} UNION
{?e heo:hasCategory heo:disgust}}}
Output: {OPTIONAL {?r a sioc_t:MicroblogPost;
sioc:addressed_to ?m}}
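How the three services fit together can be sketched in Python. Everything here is a stub under stated assumptions: the canned `POSTS`, the keyword lookup standing in for sentiment analysis, the `ex:` identifiers and the `acme` tag are hypothetical, and only the input side of ManageMicroblog is modelled (the human decision and the optional response are left out).

```python
# Canned, hypothetical posts standing in for a live microblog feed.
POSTS = [
    ("ex:p1", "ex:alice", "I hate the new update #acme"),
    ("ex:p2", "ex:bob", "Loving the new update #acme"),
    ("ex:p3", "ex:carol", "The new update is gross #acme"),
]

# Toy keyword lookup standing in for a real sentiment analyser.
KEYWORDS = {"hate": "heo:anger", "gross": "heo:disgust"}


def harvest(tag):
    """Service 1 stub: emit triples for posts that mention the tag."""
    triples = set()
    for post, creator, content in POSTS:
        if "#" + tag in content:
            triples |= {
                (post, "rdf:type", "sioc_t:MicroblogPost"),
                (post, "sioc:topic", "ex:tag/" + tag),
                (post, "sioc:has_creator", creator),
                (post, "sioc:content", content),
            }
    return triples


def analyse_sentiment(graph):
    """Service 2 stub: add an heo:Emotion node for posts with a known keyword."""
    added = set()
    for s, p, o in graph:
        if p != "sioc:content":
            continue
        for word, category in KEYWORDS.items():
            if word in o.lower():
                emotion = s + "/emotion"
                added |= {
                    (emotion, "rdf:type", "heo:Emotion"),
                    (emotion, "heo:hasManifestationInMedia", s),
                    (emotion, "heo:hasCategory", category),
                }
    return added


def needs_response(graph):
    """Service 3 input: (post, creator) pairs whose emotion is anger UNION disgust."""
    negative = {s for s, p, o in graph
                if p == "heo:hasCategory" and o in ("heo:anger", "heo:disgust")}
    posts = {o for s, p, o in graph
             if p == "heo:hasManifestationInMedia" and s in negative}
    return {(s, o) for s, p, o in graph
            if p == "sioc:has_creator" and s in posts}


graph = harvest("acme")
graph |= analyse_sentiment(graph)  # RDF is stored and forwarded between services
to_answer = needs_response(graph)
```

Each service only adds triples to the shared graph, so the output of one step satisfies the input pattern of the next without any service-specific glue format.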
22. Join PlanetData
http://www.planet-data.eu
Associate partners have
◦ Access to open training infrastructure
◦ Early access to ongoing PD results through participation in PlanetData meetings
◦ Opportunity to shape the results and topics of the PD Programs through contribution of requirements and use cases
PlanetData Programs call in 2012