4. The Learning Problem
• Given a composite event CE
• Given a set of historical traces
– Positive traces: CE occurs at the end of the trace
– Negative traces: CE does not occur in the trace
• Derive a rule that describes the causal relation
between:
– A pattern of primitive events
– The occurrence of CE
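The inputs to the learning problem can be captured with a minimal data model. A sketch in Python, where the `Event` class and the concrete `Temp`/`Smoke`/`Rain` events are illustrative assumptions, not part of any system's API:

```python
from dataclasses import dataclass

# Hypothetical minimal encoding of the input data: an event has a type,
# attribute/value pairs, and a single timestamp; a trace is a
# time-ordered list of events.
@dataclass(frozen=True)
class Event:
    etype: str     # e.g. "Temp"
    attrs: tuple   # e.g. (("area", "A1"), ("value", 45))
    ts: float      # timestamp in seconds

# A positive trace: the composite event CE occurs at its end.
positive_trace = [
    Event("Temp", (("area", "A1"), ("value", 45)), 0.0),
    Event("Smoke", (("area", "A1"),), 120.0),
]

# A negative trace: CE never occurs in it.
negative_trace = [
    Event("Rain", (("area", "A1"), ("mm", 5)), 10.0),
]
```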
6. Rule Languages
• Oracle CEP
• Microsoft StreamInsight
• Stream
• Cayuga
• IBM WSBE
• Stream Mill
• Aurora
• Borealis
• SASE+
• Padres
• Esper
• TelegraphCQ
• NextCEP
• TESLA
• ETALIS
• TIBCO BusinessEvents
• Progress Apama
7. CEP Operators
Define FIRE:
within 5 min
{ Smoke(area = $a) and Temp(value > 40, area = $a)
  and not Rain(mm > 2, area = $a) }
where { Temp -> Smoke }
Operators: Selection, Combination, Negation, Sequence, Window, Parameter, Aggregates
8. Solution Strategy
• Modular architecture
– Ad-hoc learning components for each operator
– Easy to modify/replace a component
• Possibly with hints from domain experts
– Easy to add new types of operators
One learning component per operator: Selection, Combination, Negation, Sequence, Window, Parameter, Aggregates
9. Learning Algorithm
• Key idea:
– Each operator defines a set of constraints
• E.g., the selection operator defines:
– Which event types must appear
– Which attribute values they must include
– A positive trace satisfies all the constraints in a
rule (for each operator)
– We can learn the constraints in a rule by
intersecting the constraints satisfied in each
positive trace
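The intersection idea can be sketched in a few lines of Python (the dict-based event encoding and the function names are illustrative, not iCEP's API):

```python
# Learning by intersection: map each positive trace to the set of
# constraints it satisfies, then keep only the constraints common
# to all traces.

def constraints_of(trace):
    """All (type, attribute, value) equality constraints satisfied
    by one positive trace."""
    return {(e["type"], k, v) for e in trace for k, v in e["attrs"].items()}

def learn_by_intersection(positive_traces):
    """Keep only the constraints satisfied by every positive trace."""
    return set.intersection(*[constraints_of(t) for t in positive_traces])

traces = [
    [{"type": "Smoke", "attrs": {"area": 1}},
     {"type": "Temp", "attrs": {"area": 1, "value": 45}}],
    [{"type": "Smoke", "attrs": {"area": 1}},
     {"type": "Temp", "attrs": {"area": 1, "value": 50}},
     {"type": "Rain", "attrs": {"area": 2}}],
]
learned = learn_by_intersection(traces)
# ("Smoke", "area", 1) and ("Temp", "area", 1) survive;
# the differing Temp values (45 vs 50) are pruned.
```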
10. Learning Algorithm
• Rule: A and B must occur
[Figure: three example positive traces ending in CE; each contains A and B among other event types (W, X, Y, Z, K)]
11. Learning Algorithm
• Rule: A and B must occur
[Figure: the same traces; intersecting the elements common to all positive traces yields A and B, plus some spurious common elements]
• What we learn can be a superset of the actual
constraints
• Limited impact in practice
12. Machine Learning
• Our initial prototype relied on supervised
machine learning algorithms and tools
• Lessons learned
– Some operators (e.g., parameters) were difficult to
encode
• Need to explicitly allocate one variable for each possible
constraint
• Space explosion
– (Significantly) higher execution time
– Lower precision
• Intersection prevents this!
13. iCEP
• One module for each operator
• Filtering architecture
– Positive traces are “cleaned” at each step
– Pruning “unrequired” elements
– Window (Win) Learner
– Events/Attributes (Ev) Learner
– Constraints (Constr) Learner
– Aggregates (Aggr) Learner
– Parameters (Param) Learner
– Sequences (Seq) Learner
– Negations (Neg) Learner
• Inputs: positive traces for all learners; negative traces for the Negations learner
15. Events and Attributes Learner
• Assumes the size of the evaluation window is
known
• Extracts the set of relevant event types and
attributes
– By intersecting the types and attributes that
appear in all positive traces
• Only selected types and attributes are
considered in the following modules
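A sketch of this learner under the same illustrative encoding as before: intersect the (event type, attribute name) pairs observed in every positive trace, so that later modules only see the surviving pairs.

```python
# Illustrative Events/Attributes learner: keep only the
# (event type, attribute name) pairs present in all positive traces.

def relevant_types_and_attrs(positive_traces):
    per_trace = [
        {(e["type"], a) for e in trace for a in e["attrs"]}
        for trace in positive_traces
    ]
    return set.intersection(*per_trace)

traces = [
    [{"type": "Temp", "attrs": {"area": 1, "value": 45}},
     {"type": "Wind", "attrs": {"speed": 3}}],
    [{"type": "Temp", "attrs": {"area": 2, "value": 50}}],
]
relevant = relevant_types_and_attrs(traces)
# Only Temp's attributes appear in both traces; Wind is pruned.
```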
19. Events and Window Learners
• In the absence of domain knowledge about event types and
window size …
• … Events and Win learners work together iteratively
– Increasing the size of the window
– Computing the set of relevant types at each step
– The process stops when the number of relevant types stabilizes
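The loop above can be sketched as follows (the step size, the window measured back from the trace end, and the stopping rule are assumptions for illustration):

```python
# Joint Events/Window loop: grow the window, recompute the relevant
# types at each step, and stop once the set of types stops changing.

def relevant_types(traces, window):
    """Types that appear, within `window` seconds of the trace end,
    in every positive trace."""
    per_trace = [
        {e["type"] for e in t if t[-1]["ts"] - e["ts"] <= window}
        for t in traces
    ]
    return set.intersection(*per_trace)

def learn_window(traces, step=1.0, max_window=60.0):
    window, prev = step, None
    while window <= max_window:
        cur = relevant_types(traces, window)
        if cur == prev:
            return window - step, cur  # stabilized at the previous size
        prev, window = cur, window + step
    return max_window, prev

traces = [
    [{"type": "X", "ts": 0.0}, {"type": "A", "ts": 9.0},
     {"type": "B", "ts": 10.0}],
    [{"type": "A", "ts": 8.0}, {"type": "B", "ts": 9.0}],
]
win, types = learn_window(traces)
```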
21. Constraints and Aggregates Learners
• Extract constraints on the value of attributes
– Of individual events
– Of aggregations
• E.g., Maximum, Average value
• Users can specify a set of aggregation functions
22. Constraints and Aggregates Learners
1. Equality constraints
– Learn by intersection
• The same value appears in all positive traces
2. Inequality constraints (≠, <, >) for numeric
attributes
– Unknown relations / operators
• Min and Max values appearing in all positive traces
– Known relations / operators (from users)
• Learning algorithm based on Support Vector Machines
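For the unknown-relation case, the min and max values observed across all positive traces bound the attribute, yielding candidate constraints `value >= lo` and `value <= hi`. A sketch (illustrative encoding):

```python
# Inequality learning when the relation/operator is unknown: collect
# the attribute's values across all positive traces and keep the
# observed min and max as candidate bounds.

def learn_bounds(positive_traces, etype, attr):
    values = [
        e["attrs"][attr]
        for trace in positive_traces
        for e in trace
        if e["type"] == etype and attr in e["attrs"]
    ]
    return min(values), max(values)

traces = [
    [{"type": "Temp", "attrs": {"value": 45}}],
    [{"type": "Temp", "attrs": {"value": 52}}],
    [{"type": "Temp", "attrs": {"value": 48}}],
]
lo, hi = learn_bounds(traces, "Temp", "value")  # 45, 52
```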
24. Parameters and Sequences Learners
• Learn by intersection
– Parameter constraints satisfied by all positive
traces
• Both equality and inequality relations
– Ordering constraints satisfied by all positive traces
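The ordering part can be sketched like this (the encoding and the first-occurrence convention are assumptions): an ordering constraint a → b is kept only if a precedes b in every positive trace.

```python
from itertools import permutations

# Illustrative Sequences learner: keep a -> b only if a's first
# occurrence precedes b's first occurrence in all positive traces.

def first_ts(trace, etype):
    return min(e["ts"] for e in trace if e["type"] == etype)

def learn_orderings(positive_traces, types):
    return {
        (a, b)
        for a, b in permutations(types, 2)
        if all(first_ts(t, a) < first_ts(t, b) for t in positive_traces)
    }

traces = [
    [{"type": "A", "ts": 1.0}, {"type": "B", "ts": 2.0}],
    [{"type": "A", "ts": 0.0}, {"type": "B", "ts": 5.0}],
]
orderings = learn_orderings(traces, {"A", "B"})  # {('A', 'B')}
```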
26. Negation Learner
• Only component that looks into negative
traces
– Selects traces that satisfy all the constraints
identified so far
– Extracts common elements in such traces
• These elements may contribute to preventing the
occurrence of the composite event
• They will be negated in the derived rule
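A sketch of this step (illustrative names and encoding): filter the negative traces through the constraints learned so far, then intersect the event types they share.

```python
# Illustrative Negation learner: among the negative traces that satisfy
# every constraint learned so far, the event types they all share are
# candidates for negation in the derived rule.

def learn_negations(negative_traces, satisfies, positive_types):
    candidates = [t for t in negative_traces if satisfies(t)]
    if not candidates:
        return set()
    common = set.intersection(*[{e["type"] for e in t} for t in candidates])
    return common - positive_types

negatives = [
    [{"type": "A"}, {"type": "B"}, {"type": "Rain"}],
    [{"type": "B"}, {"type": "Rain"}, {"type": "A"}],
]
# Assume the rule learned so far requires A and B to occur.
satisfies = lambda t: {"A", "B"} <= {e["type"] for e in t}
negated = learn_negations(negatives, satisfies, {"A", "B"})  # {'Rain'}
```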
29. Synthetic Workload
Number of Event Types: 25
Distribution of Types: Uniform
Number of Attributes per Event: 3
Number of Constraints per Event: 3
Average Window Size: 10 s
Average Distance Between Events: 1 s
Number of Parameter Constraints: 0
Number of Sequence Constraints: 0
Number of Aggregate Constraints: 0
Number of Negation Constraints: 0
Number of Positive Traces: 1000
Results: Recall 0.98, Precision 0.94
33. Presence of Negations
• We learn negation by intersection
– Looking at “common” elements in negative traces
• Multiple negations
– One negated element is sufficient for preventing the
occurrence of CE
– They cannot be detected by intersection
34. Real Data
• Traffic monitoring system for public
transportation
• Rules to detect: delays from multiple bus lines
in a small time window
• Noisy data
– Not only exceptional events (delays) …
– … but also continuous operational information
from each and every bus line
35. Real Data
• Results in terms of precision and recall are
confirmed
• Derived rules are noisy
– Include frequent events present in every trace
• A cleaning step could improve the quality of
rules
36. Conclusions - Lessons Learned
• First approach to automated rule generation
• Large solution space
– Many parameters to consider
• Difficult to encode in traditional machine learning
algorithms
• Modular approach
– Improved performance
– Improved accuracy
– Easier to add/replace single modules
• Integration with hints from domain experts
37. Future Work
• Address open problems
– Multiple negations
– Composite events that could be triggered by
multiple patterns (disjunction)
• Integrate additional operators
– E.g., detection of trends
• Develop techniques for rule cleaning
– Remove “noise”
Real time processing of continuous flows of data
Generated by sources at unpredictable rates
To produce new higher level knowledge in the form of composite events or situations of interests and deliver it to connected sinks
According to a set of deployed rules that specify how to filter, select, and combine incoming data, based on their content and their timing relationships
Focus on fast algorithms. Widely used in High Frequency Trading and partly in Computer Systems Monitoring
Problem of language for expressing rules. Worked on that in the past. Today focus on a new problem: even with a good language, how to write a good rule? Understand and express the correct causal relationships between primitive events and composite ones
Look at the past occurrences of composite events and see if we can extract some useful information
We consider a model in which events are identified by a type, a set of attribute/value pairs, and a single timestamp.
We consider a history of past occurrences of primitive and composite events.
We want to infer the rule that describes the causal relationship between primitive and composite events.
Which are the operators we can use to define our rule?
System from tool vendors: IBM – Oracle – Microsoft – TIBCO (The leading one, together with Progress Apama)
Also a (partially) open source solution: Esper
Research proposals: Stream Mill (UCLA) – Stream (Stanford) – NextCEP (Imperial)
Different languages: real-time analysis (relational operations on a stream) vs. situation detection (from patterns)
We focus mainly on the second type of languages and we identify a (sub)set of common operators to focus on
Each operator defines a set of constraints: for example, the selection operator defines which events must appear in a trace and which attribute values they must include
Parameters and sequences learners work again by intersecting all the constraints appearing in positive traces
We consider equality parameters and ordering constraints
Underconstraining reduces precision
Errors in the detection of a constraint are emphasized when there are only a few constraints.
Increasing the window size also increases the number of events to consider.
This reduces the precision, but the impact is minimal.