WSO2Con US 2013 - View, Act, and React: Shaping Business Activity with Analytics, BigData Queries, and Complex Event Processing
1. View, Act, and React: Shaping Business Activity with Analytics, BigData Queries, and Complex Event Processing
Srinath Perera
WSO2 Director, Research
2. • In 1942, Asimov wrote a book called Foundation, in which the character Hari Seldon uses mathematical models to predict the future of civilization and then to save it.
• Paul Krugman (the Nobel Laureate in Economics) said his interest in economics began with Foundation.
• We are entering an era of our history where Mr. Asimov might have a point.
Image credit, CC licence, http://ansem315.deviantart.com/art/Asimov-Foundation-395188263
3. Consider a day in your life
• What is the best road to take?
• Would there be any bad weather?
• What is the best way to invest the money?
• Should I take that loan?
• Can I optimize my day?
• Is there a way to do this faster?
• What have others done in similar cases?
• Which product should I buy?
6. Why is it hard?
• Systems built of many computers (1000 nodes to store 1PB with 1TB each)
• That handle lots of data (10Gb => 83 days to copy 1PB)
• Running complex logic (models can be as complex as the system)
• This pushes us to the frontier of Distributed Systems and Databases
http://www.flickr.com/photos/mariachily/5250487136, Licensed CC
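The sizing arithmetic on this slide can be sketched as below. This is a rough illustration, not the speaker's calculation: the helper names are hypothetical, decimal units (1PB = 10^15 bytes) are assumed, and the slide does not state what effective throughput lies behind its 83-day figure. Under these assumptions an ideal 10 Gb/s link would need roughly 9 days, and 83 days corresponds to a sustained rate of a little over 1 Gb/s.

```python
PB = 10**15  # bytes, decimal units (assumption)
TB = 10**12  # bytes, decimal units (assumption)
SECONDS_PER_DAY = 86400


def nodes_needed(total_bytes, bytes_per_node):
    """Number of nodes needed to hold total_bytes (ceiling division)."""
    return -(-total_bytes // bytes_per_node)


def transfer_days(total_bytes, bits_per_second):
    """Days needed to copy total_bytes at a sustained rate in bits/s."""
    seconds = total_bytes * 8 / bits_per_second
    return seconds / SECONDS_PER_DAY


def implied_rate_bps(total_bytes, days):
    """Sustained rate in bits/s implied by copying total_bytes in `days`."""
    return total_bytes * 8 / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(nodes_needed(PB, TB))          # nodes for 1PB at 1TB each
    print(transfer_days(PB, 10e9))       # ideal 10 Gb/s link
    print(implied_rate_bps(PB, 83))      # rate implied by 83 days
```

Either way the point of the slide holds: at this scale, moving the data takes days, so the computation has to go to where the data is stored.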
8. Event Streams
We view the world as event streams
An event stream is a series of events over time
Each stream has a name
Each event has attributes, and each attribute has a type
{
'name':'PlayStream',
'version':'1.0.0',
'payloadData':[
'name':'sid',
'ts':'BIGINT',
'x':'DOUBLE',
...
]
}
We use SQL-like languages (Hive/CEP) to process event streams and create new event streams
select from PlayStream[x>2500 and .. ]
insert into NearGoalStream
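The idea of deriving one event stream from another can be sketched in plain Python. This is only an illustration of the filter above, not how Siddhi or Hive execute it: events are modelled as dictionaries, the function name is hypothetical, and only the `x > 2500` part of the condition is implemented because the rest is elided on the slide.

```python
def near_goal_stream(play_stream, x_threshold=2500):
    """Yield events from play_stream whose x exceeds x_threshold.

    Mirrors the visible part of the filter PlayStream[x>2500 and ..];
    the elided remainder of the condition is intentionally left out.
    """
    for event in play_stream:
        if event["x"] > x_threshold:
            yield event


if __name__ == "__main__":
    # Hypothetical sample events using the attribute names from the slide.
    play_stream = [
        {"sid": "s1", "ts": 1, "x": 1000.0},
        {"sid": "s2", "ts": 2, "x": 2600.0},
        {"sid": "s1", "ts": 3, "x": 3100.0},
    ]
    for event in near_goal_stream(play_stream):
        print(event)
```

Because the function is a generator, the derived stream is produced one event at a time, which is the same shape as a continuous query consuming an unbounded input.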
9. Demo Use Case (DEBS 2013)
• Football game; the players and the ball have sensors (DEBS Challenge 2013). Each event carries sid, ts, x, y, z, v, a
• Use cases: Running analysis, Ball Possession and Shots on Goal, Heatmap of Activity
• Siddhi did 100K+ on each use case
• For this talk, we will look at user activity by region of the field.
13. BAM Hive Query
Find how much time is spent in each cell.
CREATE EXTERNAL TABLE IF NOT EXISTS PlayStream …
select sid,
ceiling((y+33000)*7/10000 + x/10000) as cell,
count(sid)
from PlayStream
GROUP BY sid, ceiling((y+33000)*7/10000 + x/10000);
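The grouping in the Hive query can be mirrored in a few lines of Python to make the cell formula concrete. This is a sketch with hypothetical function names, using dictionaries for events. It counts samples per (sid, cell), exactly as `count(sid)` does; reading that count as time spent assumes the sensors report at a fixed rate, which the slide does not state.

```python
import math
from collections import Counter


def cell_of(x, y):
    """Cell index, using the same expression as the Hive query."""
    return math.ceil((y + 33000) * 7 / 10000 + x / 10000)


def samples_per_cell(events):
    """Count events per (sid, cell), like GROUP BY sid, cell."""
    counts = Counter()
    for e in events:
        counts[(e["sid"], cell_of(e["x"], e["y"]))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample readings.
    events = [
        {"sid": "a", "x": 0, "y": 0},
        {"sid": "a", "x": 100, "y": 0},
        {"sid": "b", "x": 10000, "y": -33000},
    ]
    for (sid, cell), n in sorted(samples_per_cell(events).items()):
        print(sid, cell, n)
```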
15. CEP Query
Calculate the mean location of each player every second
define partition sidPrt by PlayStream.sid, LocBySecStream.sid
from PlayStream#window.timeBatch(1sec)
select sid, avg(x) as xMean, avg(y) as yMean, avg(z) as zMean
insert into LocBySecStream partition by sidPrt
Detect more than 10m run
from every e1 = LocBySecStream ->
e2 = LocBySecStream [e1.yMean + 10000 < yMean
or yMean + 10000 < e1.yMean]
within 2sec select e1.sid
insert into LongAdvStream partition by sidPrt ;
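The two CEP steps can be approximated offline in Python to show what they compute. This is a batch sketch under stated assumptions, not Siddhi's streaming semantics: function names are hypothetical, `ts` is assumed to be in seconds, positions are assumed to be in millimetres (so 10000 is 10m), and a "run" is taken to mean the mean y position changing by more than 10000 between two per-second means of the same player at most 2 seconds apart.

```python
from collections import defaultdict


def mean_location_by_second(events):
    """Average x, y, z per (sid, whole second), like timeBatch(1sec)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for e in events:
        acc = sums[(e["sid"], int(e["ts"]))]  # assumes ts is in seconds
        acc[0] += e["x"]
        acc[1] += e["y"]
        acc[2] += e["z"]
        acc[3] += 1
    return [
        {"sid": sid, "sec": sec,
         "xMean": sx / n, "yMean": sy / n, "zMean": sz / n}
        for (sid, sec), (sx, sy, sz, n) in sorted(sums.items())
    ]


def long_advances(means, min_dy=10000, within=2):
    """Return (sid, sec) of each e1 followed within `within` seconds by an
    e2 of the same player whose yMean differs by more than min_dy."""
    by_sid = defaultdict(list)
    for m in means:
        by_sid[m["sid"]].append(m)
    hits = []
    for sid, rows in by_sid.items():
        rows.sort(key=lambda m: m["sec"])
        for i, e1 in enumerate(rows):
            for e2 in rows[i + 1:]:
                if e2["sec"] - e1["sec"] > within:
                    break  # outside the time bound
                if abs(e2["yMean"] - e1["yMean"]) > min_dy:
                    hits.append((sid, e1["sec"]))
                    break  # first match for this e1 is enough
    return hits
```

Grouping by `sid` before matching plays the role of `partition by sidPrt`: a player's positions are only ever compared with that same player's later positions.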