The slides of my talk at WU Vienna on 18/2/2016. I discuss the problem of unifying existing solutions for processing semantic streams, with a particular focus on those that perform continuous query answering over RDF streams.
On Unified Stream Reasoning - The RDF Stream Processing realm
1. Daniele Dell’Aglio
On Unified Stream Reasoning
The RDF Stream Processing realm
Daniele Dell’Aglio
WU Vienna, 18/02/2016
2. Daniele Dell’Aglio
Problem setting
Real-time integration of huge volumes of dynamic data from
heterogeneous sources
– Traffic Prediction
– Social media analytics
– Personalised services
3. Daniele Dell’Aglio
Stream Reasoning
Stream Reasoning (SR): inference over streams of data
– Stream and Event Processing: real-time processing of
highly dynamic data
• Aggregations, filters
• Complex event detection
– Reasoning
• Access and integration of heterogeneous data
• Make hidden information explicit
5. Daniele Dell’Aglio
The initial problem (1)
Where are Alice and Bob,
when they are together?
Let’s consider a tumbling
window W(ω=β=5)
Let’s execute the
experiment 4 times
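Execution   1st answer   2nd answer
1           :hall [6]    :kitchen [11]
2           :hall [5]    :kitchen [10]
3           :hall [6]    :kitchen [11]
4           - [7]        - [12]
[Figure: stream S on a time axis t, with items S1 to S4 at t = 1, 3, 6, 9; the window width and slide are marked on the axis]
S1 (t = 1): {:alice :isIn :hall}
S2 (t = 3): {:bob :isIn :hall}
S3 (t = 6): {:alice :isIn :kitchen}
S4 (t = 9): {:bob :isIn :kitchen}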
Which is the correct answer?
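A minimal sketch of why the executions can disagree, assuming the four stream items arrive at t = 1, 3, 6, 9 in the order listed on the slide and that the engine reports whenever a window closes. The only thing that changes between runs is `t0`, a hypothetical name for the instant at which the first window opens. `STREAM`, `together` and `run` are illustrative names, not part of any RSP engine.

```python
from collections import defaultdict

# (timestamp, person, room); the pairing of items and timestamps is an assumption
STREAM = [(1, "alice", "hall"), (3, "bob", "hall"),
          (6, "alice", "kitchen"), (9, "bob", "kitchen")]


def together(window):
    """Rooms in which both Alice and Bob appear within the window content."""
    people_by_room = defaultdict(set)
    for _, person, room in window:
        people_by_room[room].add(person)
    return sorted(room for room, people in people_by_room.items()
                  if {"alice", "bob"} <= people)


def run(t0, width=5, n_windows=2):
    """Evaluate the query at the close of each tumbling window [o, c)."""
    answers = []
    for k in range(n_windows):
        o, c = t0 + k * width, t0 + (k + 1) * width
        content = [item for item in STREAM if o <= item[0] < c]
        answers.append((c, together(content)))
    return answers


for t0 in (1, 0, 2):
    print(t0, run(t0))
```

Under these assumptions, `t0 = 1` gives the answers of executions 1 and 3, `t0 = 0` those of execution 2, and `t0 = 2` the empty answers of execution 4.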
6. Daniele Dell’Aglio
The initial problem (2)
Which system behaves in the correct way?
System 1
Execution   1st answer   2nd answer
1           :hall [6]    :kitchen [11]
2           :hall [5]    :kitchen [10]
3           :hall [6]    :kitchen [11]
4           - [7]        - [12]
System 2
Execution   1st answer   2nd answer
1           :hall [3]    :kitchen [9]
2           No answers
3           :hall [3]    :kitchen [9]
4           No answers
[Figure: stream S on a time axis t, with items S1 to S4 at t = 1, 3, 6, 9]
{:alice :isIn :hall} {:bob :isIn :hall} {:alice :isIn :kitchen} {:bob :isIn :kitchen}
7. Daniele Dell’Aglio
Problem
How to unify current Stream Reasoning techniques?
Why do we need it?
• Comparison and contrast
• Interoperability
• Study RDF Stream Processing related problems
• Standard RSP query language
8. Daniele Dell’Aglio
Contribution – RSEP-QL
A comprehensive model that formally defines the semantics of
RDF Stream Processing engines
[Figure: overview diagram with the boxes Streams, Ontology, Background data, Entailment Regimes, RSP-QL, RSEP-QL and Applications]
Annotations in the figure:
• BGP evaluation over streams
• BGP evaluation over background data (BKG)
• Event pattern detection operators
• Model to express continuous queries
• The entailment regimes require an ontology and provide more answers w.r.t. both RSP-QL and RSEP-QL
• Not part of today's talk!
9. Daniele Dell’Aglio
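From SPARQL…
Q = (E, DS, QF)
[Figure: SPARQL processing model. The query Q enters through the Query Interface; the Evaluator takes E and evaluates it over DS, the RDF graphs in the Data layer; the Result Formatter applies QF and returns Ans(Q).]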
10. Daniele Dell’Aglio
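…to RSEP-QL
Q = (E, DS, QF)
Q = (E, SDS, QF)
Q = (E, SDS, ET, QF)
Q = (SE, SDS, ET, QF)
[Figure: the SPARQL processing model extended step by step. The Data layer holds RDF streams as well as RDF graphs, so SDS replaces DS; a Continuous Evaluator driven by ET is added; SE replaces E. The Query Interface, the Evaluator and the Result Formatter (QF, returning Ans(Q)) remain.]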
17. Daniele Dell’Aglio
From SPARQL dataset to RSEP-QL Streaming Dataset
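[Figure: from the SPARQL dataset to the RSP-QL dataset]
SPARQL dataset: RDF graphs G, H
Instantaneous graph: G(t1), the RDF graph at time t1
Time-varying graph: G: T → R, with T ⊆ ℕ and R = {RDF graphs}
RSP-QL dataset: a stream S with items S1 to S12, accessed through a window 𝕎(S)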
18. Daniele Dell’Aglio
Evaluation
The SPARQL evaluation function is defined as
$\llbracket P \rrbracket_{DS(G)}$
The RSEP-QL evaluation function extends the SPARQL one by
introducing the evaluation time instant $t$
$\llbracket P \rrbracket^{t}_{SDS(A)}$
SPARQL operators extend directly to the new evaluation
function
Example: JOIN
$\llbracket JOIN(P_1, P_2) \rrbracket^{t}_{SDS(A)} = \llbracket P_1 \rrbracket^{t}_{SDS(A)} \bowtie \llbracket P_2 \rrbracket^{t}_{SDS(A)}$
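A minimal sketch of the JOIN example, assuming solution mappings are represented as Python dicts and the evaluation of each sub-pattern is supplied as a function of the dataset and the time instant. The names `compatible`, `join`, `eval_join`, `p1` and `p2` are illustrative; the point is only that both sub-patterns are evaluated at the same instant `t` and then joined as in SPARQL.

```python
def compatible(mu1, mu2):
    """Two solution mappings are compatible if they agree on shared variables."""
    return all(mu2[var] == val for var, val in mu1.items() if var in mu2)


def join(omega1, omega2):
    """Join two bags of solution mappings (lists of dicts)."""
    return [{**mu1, **mu2}
            for mu1 in omega1
            for mu2 in omega2
            if compatible(mu1, mu2)]


def eval_join(eval_p1, eval_p2, sds, t):
    """Evaluate JOIN(P1, P2) at time t: evaluate both sides at t, then join."""
    return join(eval_p1(sds, t), eval_p2(sds, t))


# Hypothetical sub-pattern evaluations: rooms where Alice is, rooms where Bob is.
p1 = lambda sds, t: [{"room": "hall"}, {"room": "kitchen"}]
p2 = lambda sds, t: [{"room": "hall"}]

print(eval_join(p1, p2, sds=None, t=6))
```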
19. Daniele Dell’Aglio
Instantaneous evaluation
The main difference is in the BGP evaluation:
$\llbracket BGP \rrbracket^{t}_{SDS(A)} = \llbracket BGP \rrbracket_{SDS(A,t)}$
$SDS(A,t)$ is:
$SDS(G,t) = SDS(G(t))$ if A is a time-varying graph G
$SDS(\mathbb{W}(S),t) = SDS(m(\mathbb{W}(S,t)))$ if A is from a sliding window 𝕎
$SDS(\mathbb{L}(S),t) = SDS(m(\mathbb{L}(S,t)))$ if A is from a landmark window 𝕃
where m denotes a merge function
$m(\mathbb{W}(S,t)) = \bigcup_{(d_i,t_i) \in \mathbb{W}(S,t)} d_i$
– takes as input a window content i.e. a sequence of timestamped
RDF graphs
– produces an RDF graph
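A minimal sketch of how the dataset element A could be resolved at an instant t, assuming RDF graphs are sets of triples, a stream is a list of (graph, timestamp) pairs, and the window simply covers the last `width` time units up to t. That window definition is a simplification chosen for illustration, and `window_content`, `merge` and `resolve` are hypothetical names.

```python
def window_content(stream, t, width):
    """Timestamped graphs whose timestamp falls in (t - width, t]."""
    return [(d, ts) for d, ts in stream if t - width < ts <= t]


def merge(content):
    """m: union of all RDF graphs in the window content, giving one RDF graph."""
    graph = set()
    for d, _ in content:
        graph |= d
    return frozenset(graph)


def resolve(a, t):
    """SDS(A, t): the RDF graph that a BGP is matched against at time t."""
    if callable(a):        # A is a time-varying graph G: use G(t)
        return a(t)
    stream, width = a      # A is a window over a stream: use m(W(S, t))
    return merge(window_content(stream, t, width))


S = [(frozenset({(":alice", ":isIn", ":hall")}), 1),
     (frozenset({(":bob", ":isIn", ":hall")}), 3),
     (frozenset({(":alice", ":isIn", ":kitchen")}), 6)]

print(resolve((S, 5), 5))
```

At t = 5 the window holds the first two items, so the BGP is matched against a single graph containing both `:hall` triples.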
20. Daniele Dell’Aglio
Continuous evaluation
For each evaluation time t ∈ ET: $\llbracket SE \rrbracket^{t}_{SDS(A)}$
– The continuous evaluation is a sequence of instantaneous
evaluations
It is not always possible to compute ET a priori
– Can be data dependent
– ET is expressed through a Report Policy
A Report Policy is a set of conditions on one or more window
operators in SDS
– Initially defined in SECRET for Stream Processing engines
21. Daniele Dell’Aglio
Continuous evaluation – Report Policies
Report Policy examples:
– P Periodic: the window reports only at regular intervals
– WC Window Close: the window reports if the active
window closes
– CC Content Change: the window reports if the content
changes.
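A rough sketch of how each policy could determine the evaluation times ET, assuming discrete time and windows of the form [o, c). The function names and parameters (`t0`, `width`, `slide`, `horizon`) are illustrative, and Content Change is simplified to "a new item enters the window", which ignores items leaving it.

```python
def periodic(period, horizon, start=0):
    """P: report every `period` time units after `start`, up to `horizon`."""
    return list(range(start + period, horizon + 1, period))


def window_close(t0, width, slide, horizon):
    """WC: report at the closing instant of each window [o, o + width)."""
    times = []
    c = t0 + width
    while c <= horizon:
        times.append(c)
        c += slide
    return times


def content_change(timestamps, t0, horizon):
    """CC (simplified): report whenever a new item enters the window."""
    return sorted({ts for ts in timestamps if t0 <= ts <= horizon})


arrivals = [1, 3, 6, 9]
print(periodic(4, 12))
print(window_close(t0=1, width=5, slide=5, horizon=12))
print(content_change(arrivals, t0=1, horizon=12))
```

The same stream and window yield three different ET sequences, which is one way different engines can return answers at different instants.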
22. Daniele Dell’Aglio
Event Processing – Basic Event Pattern
Support to Complex Event Processing operators
The minimal element is the Basic Event Pattern:
EVENT 𝑤 𝑃
Intuitively, the Basic Graph Pattern 𝑃 should match against one
stream item of the window identified by 𝑤
BEPs can be combined through complex operators
• SEQ, LAST, EVERY
Example:
EVENT 𝑤1 𝑃1 SEQ EVERY EVENT 𝑤2 𝑃2
23. Daniele Dell’Aglio
Event Processing – Evaluation semantics
Formally, we use a new evaluation function $\llparenthesis \cdot \rrparenthesis^{t}_{(o,c)}$
• $t$ is the evaluation time instant,
• $(o, c)$ is an additional window to identify the portion of the
data on which the event may happen
Event pattern evaluation produces event mappings $\langle \mu, t_1, t_2 \rangle$
• $\mu$ is a solution mapping
• $t_1$ and $t_2$ denote the time interval justifying $\mu$
24. Daniele Dell’Aglio
Event Processing – Evaluation semantics - Examples
The evaluation of EVENT 𝑤1 𝑃1 SEQ EVERY EVENT 𝑤2 𝑃2 is:
[Figure: stream items labelled S1 to S12 on a time axis t from 10 to 16, with the sub-patterns EVENT 𝑤1 𝑃1, SEQ and EVERY EVENT 𝑤2 𝑃2 marked. No event mapping is shown yet.]
26. Daniele Dell’Aglio
Event Processing – Evaluation semantics - Examples
The evaluation of EVENT 𝑤1 𝑃1 SEQ EVERY EVENT 𝑤2 𝑃2 is:
[Figure: stream items labelled S1 to S12 on a time axis t from 10 to 16, with the sub-patterns EVENT 𝑤1 𝑃1, SEQ and EVERY EVENT 𝑤2 𝑃2 marked]
Event mappings shown:
• S1, S10 over the interval 11 to 13
27. Daniele Dell’Aglio
Event Processing – Evaluation semantics - Examples
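The evaluation of EVENT 𝑤1 𝑃1 SEQ EVERY EVENT 𝑤2 𝑃2 is:
[Figure: stream items labelled S1 to S12 on a time axis t from 10 to 16, with the sub-patterns EVENT 𝑤1 𝑃1, SEQ and EVERY EVENT 𝑤2 𝑃2 marked]
Event mappings shown:
• S1, S10 over the interval 11 to 13
• S1, S12 over the interval 11 to 15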
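The two event mappings on this slide can be reproduced under a simplified reading of SEQ, sketched here: an event mapping is a triple (μ, t1, t2), and SEQ pairs a mapping of the first pattern with every later, compatible mapping of the second, spanning from the start of the first to the end of the second. The timestamps of S1, S10 and S12 are read off the figure and are an assumption, as are the names `compatible` and `seq`; the selection behaviour of EVERY and LAST is not modelled beyond keeping all matches.

```python
def compatible(mu1, mu2):
    """Two solution mappings are compatible if they agree on shared variables."""
    return all(mu2[var] == val for var, val in mu1.items() if var in mu2)


def seq(first, second):
    """Pair each event mapping in `first` with every later one in `second`."""
    return [({**mu1, **mu2}, start1, end2)
            for mu1, start1, end1 in first
            for mu2, start2, end2 in second
            if end1 < start2 and compatible(mu1, mu2)]


# Assumed matches: P1 matches S1 at t = 11; P2 matches S10 at 13 and S12 at 15.
first = [({"e1": "S1"}, 11, 11)]
second = [({"e2": "S10"}, 13, 13), ({"e2": "S12"}, 15, 15)]

for mapping in seq(first, second):
    print(mapping)
```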
28. Daniele Dell’Aglio
Event Processing – MATCH graph pattern
Event patterns are enclosed in MATCH graph patterns
• Event mappings exist only in the context of event patterns
• The evaluation of a MATCH graph pattern produces a bag of
solution mappings
$\llbracket MATCH\ E \rrbracket^{t}_{SDS(A)} = \{\mu \mid \langle \mu, t_1, t_2 \rangle \in \llparenthesis E \rrparenthesis^{t}_{(0,t)}\}$
It is possible to combine the MATCH graph pattern with other
SPARQL graph patterns
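A small sketch of the MATCH step, assuming event mappings are (μ, t1, t2) triples and reading the window (0, t) as "the event interval lies between 0 and t". That reading, and the name `match`, are assumptions made for illustration. The result is a bag, so duplicate solution mappings are kept.

```python
def match(event_mappings, t):
    """Drop the intervals and keep the solution mappings, as a bag (list)."""
    return [mu for mu, t1, t2 in event_mappings if 0 <= t1 <= t2 <= t]


events = [({"room": "hall"}, 11, 13), ({"room": "hall"}, 11, 15)]

print(match(events, 14))
print(match(events, 16))
```

Because the output is an ordinary bag of solution mappings, it can be joined with the result of any other SPARQL graph pattern.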
32. Daniele Dell’Aglio
What’s next?
An RSEP-QL query language
• W3C RSP CG ongoing activities
Implementations
• Yet another RSP engine
• Framework to let existing RSP engines interoperate
Streams are getting popular – applications want more and more
sophisticated features
• Different timestamps, out-of-order data
• Inductive reasoning to cope with noise
• Permanent storage of portions of data (raw or inferred)
33. Daniele Dell’Aglio
Conclusions
The dynamics introduced in the continuous query evaluation
process are not yet fully understood
• Not fully captured by existing models
• RSEP-QL captures those dynamics
• All of them? Let's find out!
We need to push implementations and applications on use cases
• To understand which helpful operators are missing
• To find new unexpected behaviours
34. Daniele Dell’Aglio
People I am grateful to...
Emanuele Della Valle
and:
Marco Balduini
Jean-Paul Calbimonte
Oscar Corcho
Minh Dao-Tran
Danh Le Phuoc
Freddy Lecue
35. Daniele Dell’Aglio
... without forgetting you!
Thank you! Questions?
On Unified Stream Reasoning
The RDF Stream Processing realm
Daniele Dell’Aglio
daniele.dellaglio@polimi.it
http://dellaglio.org