2. Agenda
• Architecture
• Multi-threading
• I/O
• Further work
3. Purpose
• Users can query provenance
Why query provenance? When the user gets a strange result, he may be interested in its provenance.
[Diagram: stream1–stream4 → stream processing engine → result]
Provenance: the data in the original incoming streams from which the result is generated.
Required: if the provenance of a tuple hasn't been saved to disk, the tuple will not be sent to the user.
4. Architecture
Layered architecture:
SPE layer: stream processing engine
Buffer layer: buffer for provenance
File layer: disk I/O
The layer below provides services to the layer above.
The layer above invokes the interface provided by the layer below.
5. Store metadata of streams
SPE layer component view:
Metadata: stores the metadata of streams, including CQL statements and datatypes
CQL parser: parses the CQL statement and generates the query plan tree
Query plan processor: processes tuples along the query plan tree
Utility: provides common services
8. Query Plan Tree Entity
[Diagram: a query plan tree of select and join operators between the leaf nodes and the root]
The operators are connected by queues:
Storage queue: below each leaf
Common queue: between operators
Transportation queue: above the root
9. Queue class diagram
[Class diagram: the data flowing through a queue carries attributes, e.g. attribute1: integer, attribute2: integer, attribute3: string]
10. Queue Entity
Memory management: a piece of continuous memory is used as the buffer.
In a queue, we don't allocate memory for each tuple; we allocate memory for the queue, and tuples are saved in the buffer of the queue.
Head: the head of the tuples in the buffer
Tail: the tail of the tuples in the buffer
When initialized, head and tail are the beginning address of the buffer.
When a tuple arrives, the tail moves forward by the length of a tuple.
When a tuple leaves, the head moves forward by the length of a tuple.
When there is no space for the new tuple, throw an exception: a load-shedding algorithm is needed.
11. Tuple Entity
If the tuple is in a queue, it uses the buffer of the queue; if not, it creates its own buffer.
Fields:
The beginning address of the buffer
The offset in the buffer
The tuple length
The timestamp of the tuple
The relation schema of the tuple
The map that saves the provenance of the tuple, e.g.
map["s1"] = list{"id1", "id2"}
map["s2"] = list{"id4"}
12. Buffer layer
Façade design pattern, singleton design pattern
[Diagram: SPE layer → BufferControl (buffer layer) → file layer]
The BufferControl class provides the interface of the buffer layer.
The upper layer needn't interact with other classes in the buffer layer.
If we change the implementation of the buffer layer, we needn't change the code of the layer above, as long as we maintain the same interface.
13. Provenance life cycle
When a tuple is pushed into a storage queue, the system calls the insert function of BufferControl, and the provenance is stored in memory.
The tuple is then processed along the query plan tree, and its provenance map may be copied at each operator.
When the tuple reaches the root operator, it is pushed into the transportation queue, and the system calls the toBeStored function to mark which provenance should be stored in the file.
Another thread will call the storing function at some time to save the marked provenance to the file.
Before a tuple is output to the client, the system calls the isStored function to see if its provenance has been saved.
When the provenance is no longer needed, it is deleted from memory.
14. Page
A page is a piece of continuous memory; it may be 4kb, 16kb, …. 56kb.
In this system, pages are used to save two kinds of objects: pages for tuples and pages for bitmaps.
Why use a bitmap? Because we should mark the state (saved or not saved) of each tuple, and with a bitmap we can use just 1 bit per tuple.
Just think about a stream of 10kb/s where each tuple is 8 bytes; then we can save 10*1024/8*0.875 = 1.12kb/s.
0: not saved, 1: saved
15. Architecture for buffer layer
Global buffer: a list<Page*> of unused pages and a list<Page*> of used pages.
Hash table: each stream name is hashed to a slot of the hash table.
Each slot holds a vector; each vector saves the pointers of all pages that buffer the data of one stream (one List<Page*> for the tuples and one for the bitmap).
The pages needed would be allocated from the global buffer.
16. Architecture for buffer layer: insert a tuple in o(1)
Suppose that one page is 100 bytes and a tuple of stream1 is 10 bytes; then a page can store 10 tuples.
Now a tuple of stream1 with id 21 comes.
Firstly, we look up the hash table with the stream name to find the vector for stream1.
Then we just see if the last page of the vector has space for the new tuple.
If yes, the tuple will be inserted into this page.
If no, we allocate a page from the global buffer: a page is moved from the unused list to the used list and added to the vector, and the tuple is inserted into this page.
The same is done for the bitmap.
17. Buffer layer
Sequence diagram for inserting a tuple.
Participants: BufferControl, PageHashTable, PageVector, ProvenanceBuffer, StreamBuffer
ifStreamExist(streamName) → true
getPage(int id)
getInsertablePage(int id)
getMorePage()
getOnePageToUse() → page (returned back along the chain)
push(data)
18. Architecture for buffer layer: find a tuple in o(1)
Suppose that one page is 100 bytes and a tuple of stream1 is 10 bytes, so a page can store 10 tuples, and the first id of the vector is 31.
Now we want to find the tuple of stream1 with id 45.
We just calculate (45-31)/10 = 1; it is the index of the page in the vector.
Then we calculate 45-31-10*1 = 4; it is the offset of the tuple in the page.
As a result, the tuple is in the page with index 1 in the vector, at offset 4 in that page. The tuple is then found.
The same holds for the bitmap.
19. Release the memory
If we don’t release the
root
memory used for saving
provenance ,the memory
would run out quickly
join
We don’t release memory
for one tuple each time ,
join leaf we just release memory
for one page each time.
join leaf
We will look into every
identifier of the provenance in
select leaf the query plan tree. These
identifiers are considered
useful. And others are useless.
Then the page contains no
leaf useful tuples would be
deleted.
20. Architecture for buffer layer: delete tuples in o(nmp)
Suppose that one page is 100 bytes, a tuple of stream1 is 10 bytes, and the first id of the vector is 1.
We first scan along the query plan tree and find that the useful identifiers of stream1 are 13, 14 and 16.
With the bitmap we know which pages contain no useful tuples; if one page contains no useful tuples, we will release it.
For releasing the memory, we flush the page, delete it from the buffer and the vector, update the hash table, and move the page from the used page list to the unused list.
21. User query for provenance
Data flow diagram for three cases:
• When the user doesn't query for provenance
• When the user queries for provenance, and the provenance is in the pages in memory
• When the user queries for provenance, and the provenance is not in the pages in memory
22. Architecture for buffer layer
Query provenance
For example, the user queries for the provenance of a tuple of stream1.
We will first see if the provenance is in the buffer.
If yes, we find the data in the corresponding page and return the provenance.
If not, we must read the page that contains it from the disk; the read strategy is to read one page at a time.
If the buffer used for queries is full, we flush out the least frequently used (LFU) page to get space for the new one.
23. Buffer layer
Abstract factory design pattern
The client code needn't know the implementation details of the tuple, the bitmap and the query.
24. Multi-threading
• Main thread: does most of the work, including receiving data from streams
• Storing thread: saves provenance
• I/O thread: deals with I/O with clients, including registering streams, registering CQLs, and querying provenance
26. Read-write lock
Lock To be
stored(thread 1)
datastructur Provenance hashtable vector buffer globalbuffer page
e Map
type map map vector list List Unsigned
char []
Lock logic Write
~write
27. Read-write lock
Lock trace: is tuple stored (thread 1)
(Same data structures and types as on slide 26.)
Lock logic: read, ~read; read bitmap, ~read; read, ~read
28. Read-write lock
Lock trace: delete (thread 1)
(Same data structures and types as on slide 26.)
Lock logic: read, ~read; read, ~read (tuple); read, ~read; read; write (page released), ~write; ~read; write, ~write
29. Read-write lock
Lock trace: storing (thread 2)
(Same data structures and types as on slide 26.)
Lock logic: write, ~write; read, ~read (tuple); read, ~read; read, ~read; read, ~read; read bitmap, ~read; trywrite, ~write
30. Read-write lock
Lock trace: query (thread 3)
(Same data structures and types as on slide 26.)
Lock logic: read, ~read; read, ~read (tuple); tryread, ~read; read, ~read; read, ~read; write (query initialized), ~write; write, ~write
31. Lock optimization
• We should reduce the cost of lock management while increasing concurrency.
• The lock for the buffer is useless because the threads would make no conflicts on it; we can get rid of it.
• The lock for the global buffer can be changed to a mutex.
• Some unimportant operations can just do trylock and trywrite.
32. Lock performance analysis
For the read-write lock we used:
• allows concurrent access by multiple threads for reading
• restricts access to a single thread for writes
• write-preferring
• the smallest granularity: one page
Performance is lost when we need to do some operations on one page:
• Page for tuples: readers are the storing thread and the query thread; the writer is the main thread
• Page for bitmaps: the reader is the main thread; the writer is the storing thread
• Page for queries: all done in the I/O thread
Conclusion:
• Likely to improve performance, but experiments are needed
34. File layer: write a tuple
• When writing a tuple into the file:
• Get the offset of the tail of the file
• Append the tuple at the tail of the file
• Flush the buffer
• Add the offset and the tuple identifier to the index
• Use a partitioned hash to implement the two-dimensional index
35. I/O
[Diagram: stream1–stream4 and clients (registering streams, registering CQLs, querying provenance) connect to the system]
For the client I/O, we don't use one thread per connection; we implement them all in one thread.
This thread can be blocked when there is nothing to read or write, so we will use I/O multiplexing here.
Receiving stream data is just implemented in the main thread, so it must use non-blocking I/O.
36. What is I/O multiplexing?
• An application needs to handle multiple I/O descriptors at the same time
• I/O on any one descriptor can result in blocking
• With multiplexing, the application can block until any of the registered I/O descriptors becomes able to read, becomes able to write, or raises an exception
37. epoll
• epoll is a scalable I/O event notification mechanism
• It is meant to replace the older POSIX select and poll system calls
select works on fixed fd bitmaps, for example:
File descriptor: fd=0 fd=1 fd=2 fd=3 fd=4
Write:           0    0    0    1    1
Read:            0    0    1    1    0
38. Further work
• Implement the multi-threading design: use a separate thread to save the provenance
• Implement the file layer design: add an index to the provenance saved in the file
• Implement the I/O design