2. Agenda
• Architecture
• Multi-threading
• I/O
• Further work
3. Purpose
• Users can query provenance
Why query provenance? When the user gets a strange result, he may be interested in its provenance.
[Diagram: stream1–stream4 → stream processing engine → result]
Provenance: the data in the original incoming streams from which the result is generated.
Required: if the provenance of a tuple hasn't been saved to disk, the tuple will not be sent to the user.
4. Architecture
Layered architecture:
SPE layer: stream processing engine
Buffer layer: buffer for provenance
File layer: disk I/O
The layer below provides services to the layer above.
The layer above invokes the interface provided by the layer below.
5. Store metadata of streams
SPE layer component view:
Metadata: stores the metadata of streams, including CQL statements and datatypes
CQL parser: parses the CQL statement and generates the query plan tree
Query plan processor: processes tuples along the query plan tree
Utility: provides common services
8. Query Plan Tree Entity
[Diagram: a query plan tree of select and join operators between the leaf nodes and the root]
The operators are connected by queues:
Storage queue: below each leaf
Common queue: between operators
Transportation queue: above the root
9. Queue class diagram
[Class diagram: the data flowing through a queue carries attributes, e.g. attribute1: integer, attribute2: integer, attribute3: string]
10. Queue Entity
Memory management: a piece of continuous memory is used as the buffer.
In a queue, we don't allocate memory for each tuple; we allocate memory for the queue, and tuples are saved in the buffer of the queue.
Head: the head of the tuples in the buffer
Tail: the tail of the tuples in the buffer
When initialized, head and tail are the beginning address of the buffer.
When a tuple arrives, the tail moves forward by the length of a tuple.
When a tuple leaves, the head moves forward by the length of a tuple.
When there is no space for the new tuple, throw an exception: a load-shedding algorithm is needed.
11. Tuple Entity
If the tuple is in a queue, it uses the buffer of the queue; if not, it creates its own buffer.
Fields:
The beginning address of the buffer
The offset in the buffer
The tuple length
The timestamp of the tuple
The relation schema of the tuple
The map that saves the provenance of the tuple, e.g.
map["s1"] = list{"id1", "id2"}
map["s2"] = list{"id4"}
12. Buffer layer
Façade design pattern, singleton design pattern
[Diagram: SPE layer → BufferControl (buffer layer) → file layer]
The BufferControl class provides the interface of the buffer layer.
The upper layer needn't interact with other classes in the buffer layer.
If we change the implementation of the buffer layer, we needn't change the code of the layer above, as long as we maintain the same interface.
13. Provenance life cycle
When a tuple is pushed into a storage queue, the system calls the insert function of BufferControl, and the provenance is stored in memory.
The tuple is then processed along the query plan tree, and its provenance map may be copied at each operator.
When the tuple reaches the root operator, it is pushed into the transportation queue, and the system calls the toBeStored function to mark which provenance should be stored in the file.
Another thread will call the storing function at some time to save the marked provenance to the file.
Before a tuple is output to the client, the system calls the isStored function to see if its provenance has been saved.
When the provenance is no longer needed, it is deleted from memory.
14. Page
A page is a piece of continuous memory; it may be 4kb, 16kb, …. 56kb.
In this system, pages are used to save two kinds of objects: pages for tuples and pages for bitmaps.
Why use a bitmap? Because we should mark the state (saved or not saved) of each tuple, and with a bitmap we can use just 1 bit per tuple.
Just think about a stream of 10kb/s where each tuple is 8 bytes; then we can save 10*1024/8*0.875 = 1.12kb/s.
0: not saved, 1: saved
15. Architecture for buffer layer
Global buffer: a list<Page*> of unused pages and a list<Page*> of used pages.
Hash table: each stream name is hashed to a slot of the hash table.
Each slot holds a vector; each vector saves the pointers of all pages that buffer the data of one stream (one List<Page*> for the tuples and one for the bitmap).
The pages needed would be allocated from the global buffer.
16. Architecture for buffer layer: insert a tuple in o(1)
Suppose that one page is 100 bytes and a tuple of stream1 is 10 bytes; then a page can store 10 tuples.
Now a tuple of stream1 with id 21 comes.
Firstly, we look up the hash table with the stream name to find the vector for stream1.
Then we just see if the last page of the vector has space for the new tuple.
If yes, the tuple will be inserted into this page.
If no, we allocate a page from the global buffer: a page is moved from the unused list to the used list and added to the vector, and the tuple is inserted into this page.
The same is done for the bitmap.
17. Buffer layer
Sequence diagram for inserting a tuple.
Participants: BufferControl, PageHashTable, PageVector, ProvenanceBuffer, StreamBuffer
ifStreamExist(streamName) → true
getPage(int id)
getInsertablePage(int id)
getMorePage()
getOnePageToUse() → page (returned back along the chain)
push(data)
18. Architecture for buffer layer: find a tuple in o(1)
Suppose that one page is 100 bytes and a tuple of stream1 is 10 bytes, so a page can store 10 tuples, and the first id of the vector is 31.
Now we want to find the tuple of stream1 with id 45.
We just calculate (45-31)/10 = 1; it is the index of the page in the vector.
Then we calculate 45-31-10*1 = 4; it is the offset of the tuple in the page.
As a result, the tuple is in the page with index 1 in the vector, at offset 4 in that page. The tuple is then found.
The same holds for the bitmap.
19. Release the memory
If we don’t release the
root
memory used for saving
provenance ,the memory
would run out quickly
join
We don’t release memory
for one tuple each time ,
join leaf we just release memory
for one page each time.
join leaf
We will look into every
identifier of the provenance in
select leaf the query plan tree. These
identifiers are considered
useful. And others are useless.
Then the page contains no
leaf useful tuples would be
deleted.
20. Architecture for buffer layer: delete tuples in o(nmp)
Suppose that one page is 100 bytes, a tuple of stream1 is 10 bytes, and the first id of the vector is 1.
We first scan along the query plan tree and find that the useful identifiers of stream1 are 13, 14 and 16.
With the bitmap we know which pages contain no useful tuples; if one page contains no useful tuples, we will release it.
For releasing the memory, we flush the page, delete it from the buffer and the vector, update the hash table, and move the page from the used page list to the unused list.
21. User query for provenance
Data flow diagram for three cases:
• When the user doesn't query for provenance
• When the user queries for provenance, and the provenance is in the pages in memory
• When the user queries for provenance, and the provenance is not in the pages in memory
22. Architecture for buffer layer
Query provenance
For example, the user queries for the provenance of a tuple of stream1.
We will first see if the provenance is in the buffer.
If yes, we find the data in the corresponding page and return the provenance.
If not, we must read the page that contains it from the disk; the read strategy is to read one page at a time.
If the buffer used for queries is full, we flush out the least frequently used (LFU) page to get space for the new one.
23. Buffer layer
Abstract factory design pattern
The client code needn't know the implementation details of the tuple, the bitmap and the query.
24. Multi-threading
• Main thread: does most of the work, including receiving data from streams
• Storing thread: saves provenance
• I/O thread: deals with I/O with clients, including registering streams, registering CQLs, and querying provenance
26. Read-write lock
Lock To be
stored(thread 1)
datastructur Provenance hashtable vector buffer globalbuffer page
e Map
type map map vector list List Unsigned
char []
Lock logic Write
~write
27. Read-write lock
Lock trace: is tuple stored (thread 1)
(Same data structures and types as on slide 26.)
Lock logic: read, ~read; read bitmap, ~read; read, ~read
28. Read-write lock
Lock trace: delete (thread 1)
(Same data structures and types as on slide 26.)
Lock logic: read, ~read; read, ~read (tuple); read, ~read; read; write (page released), ~write; ~read; write, ~write
29. Read-write lock
Lock trace: storing (thread 2)
(Same data structures and types as on slide 26.)
Lock logic: write, ~write; read, ~read (tuple); read, ~read; read, ~read; read, ~read; read bitmap, ~read; trywrite, ~write
30. Read-write lock
Lock trace: query (thread 3)
(Same data structures and types as on slide 26.)
Lock logic: read, ~read; read, ~read (tuple); tryread, ~read; read, ~read; read, ~read; write (query initialized), ~write; write, ~write
31. Lock optimization
• We should reduce the cost of lock management while increasing concurrency.
• The lock for the buffer is useless because the threads would make no conflicts on it; we can get rid of it.
• The lock for the global buffer can be changed to a mutex.
• Some unimportant operations can just do trylock and trywrite.
32. Lock performance analysis
For the read-write lock we used:
• allows concurrent access by multiple threads for reading
• restricts access to a single thread for writes
• write-preferring
• the smallest granularity: one page
Performance is lost when we need to do some operations on one page:
• Page for tuples: readers are the storing thread and the query thread; the writer is the main thread
• Page for bitmaps: the reader is the main thread; the writer is the storing thread
• Page for queries: all done in the I/O thread
Conclusion:
• Likely to improve performance, but experiments are needed
34. File layer: write a tuple
• When writing a tuple into the file:
• Get the offset of the tail of the file
• Append the tuple at the tail of the file
• Flush the buffer
• Add the offset and the tuple identifier to the index
• Use a partitioned hash to implement the two-dimensional index
35. I/O
[Diagram: stream1–stream4 and clients (registering streams, registering CQLs, querying provenance) connect to the system]
For the client I/O, we don't use one thread per connection; we implement them all in one thread.
This thread can be blocked when there is nothing to read or write, so we will use I/O multiplexing here.
Receiving stream data is just implemented in the main thread, so it must use non-blocking I/O.
36. What is I/O multiplexing?
• An application needs to handle multiple I/O descriptors at the same time
• I/O on any one descriptor can result in blocking
• With multiplexing, the application can block until any of the registered I/O descriptors becomes able to read, becomes able to write, or raises an exception
37. epoll
• epoll is a scalable I/O event notification mechanism
• It is meant to replace the older POSIX select and poll system calls
select works on fixed fd bitmaps, for example:
File descriptor: fd=0 fd=1 fd=2 fd=3 fd=4
Write:           0    0    0    1    1
Read:            0    0    1    1    0
38. Further work
• Implement the multi-threading design: use a separate thread to save the provenance
• Implement the file layer design: add an index to the provenance saved in the file
• Implement the I/O design