Efficient top-k query processing on distributed column family databases
Rui Vieira, MSc ITEC
14/08/13
Ranking (top-k) queries
We use top-k queries every day:
● Search engines (top 100 pages for certain words)
● Analytics applications (most visited pages per day)
(Examples shown as screenshots: a text search and a page-views-per-time-period view.)
Ranking (top-k) queries
Definition
Find the k objects with the highest aggregated score under a function f
(f is usually a summation over attributes)
Example:
Find the top 10 students with highest
grades over all modules.
Student    Module 1   Module 2   ...   Module n
John       39%        89%        ...   82%
Emma       48%        88%        ...   78%
Brian      50%        70%        ...   90%
Steve      75%        65%        ...   85%
Anna       50%        60%        ...   83%
Peter      59%        59%        ...   81%
Paul       80%        50%        ...   70%
Mary       89%        49%        ...   59%
Richard    91%        31%        ...   51%
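With a summation scoring function, for example, John's aggregated score over the three modules shown is 39 + 89 + 82 = 210 and Richard's is 91 + 31 + 51 = 173; the top-10 ranking is taken over these aggregated scores.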
Motivation: real-time distributed top-k queries
Why real-time top-k queries?
• To be integrated in a larger real-time analytics platform
● “User” real-time = a few hundred milliseconds to one second
• Implement solutions that make efficient use of:
• memory, bandwidth and computation
• Can handle massive amounts of data
Use case:
We are logging page views on a website. Can we find the top 10 most
visited pages in the last 7 days? What about the last 10 months? All in under 1 second?
Top-k queries: simplistic solution
“Naive” method
• Fetch all objects and scores from all sources
• Aggregate them in memory
• Sort all aggregations
• Select top-k highest scoring
Solutions that provide ranking-query answers (but not in real time):
(Diagram: each peer holds score-ordered pairs such as <O1, 1000>, <O89, 900>, ..., <O99, 1>; the query coordinator merges all data from peers 1..n, aggregates the scores, sorts all the aggregates and selects the k highest.)
Not feasible:
• For large amounts of data
• Possibly doesn't fit in RAM
• Execution time most likely not real-time
• Not efficient: low-scoring objects processed
• Due to all of the above: not scalable
Top-k queries: Batch solutions
Batch operations (Hadoop / Map-Reduce)
Pros
• Proven solution to (some) top-k scenarios
• Excellent for “report” style use cases
Cons
• Still has to process all the information
• Not real-time
Our requirements
● Work with “Peers” which are distributed logically (rows)
as well as physically (nodes)
● Nodes in the cluster support only a (very) limited set of operations
● Low latency (fixed number of round-trips)
● Offer considerable savings of bandwidth and execution time
● Possible to adapt to data access patterns and models in Cassandra
Algorithms
Algorithms: related work
Threshold family of algorithms pioneered by Fagin et al.
Objective: determine a threshold below which an object cannot be
a top-k object
Initial Threshold Algorithms (TA) however:
• Not designed with distributed data sources in mind
• Performance highly dependent on data shape (skewness, correlation ...)
• Unbounded round-trips to data source → unbounded latency
• TA keeps performing random accesses until it reaches a
stopping point
Algorithms: Related Work
Three algorithms were selected:
• Three-Phase Uniform Threshold (TPUT)
• Distributed fixed round-trip exact algorithm
• Hybrid Threshold
• Distributed fixed round-trip exact algorithm
• KLEE
• Distributed fixed round-trip approximate algorithm
• However these algorithms were developed for P2P networks
• As far as we know, they had never previously been implemented on top of
distributed column-family databases
Algorithms: TPUT
Phase 1:
● Request the top-k list from each of the m peers (peer1 ... peerm)
● Calculate a partial sum for each object seen and select the k-th highest as min-k

Phase 2:
● Request from every peer all objects with score ≥ min-k / m
● Re-calculate the partial sums and select the k-th highest as the threshold
● For every object, compute a worst-score (partial sum with missing scores = 0) and a best-score (partial sum with missing scores = min-k / m)
● Any object whose best-score exceeds the k-th highest worst-score (the threshold) is a candidate

Phase 3:
● Request the candidates from every peer
● Compute the final partial sums; the k highest are the top-k
Algorithms: Hybrid Threshold
Phase 1
Same as in TPUT, i.e. the objective is to determine the first threshold: T = min-k / m

Phase 2
● Send to each peer the candidates seen so far and T
● Each peer determines its lowest-scoring candidate S_lowest and returns all candidates with score ≥ T_i = max(S_lowest, T)

Phase 3
● Re-calculate the partial sums and select the k-th score as τ2
● For any peer where T_i < τ2 / m, fetch all objects with score > τ2 / m
● Re-calculate the partial sums and select the k-th score as τ3
● Candidates = objects with partial sum > τ3
Algorithms: KLEE
• TPUT variant
• Trade-off between accuracy and bandwidth
• Relies on summary data (statistical meta-data)
to better estimate min-k without going “deep” on index lists
Fundamental data structures for meta-data:
• Histograms
• Bloom filters
Algorithms: KLEE (Histograms)
● Equi-width cells
● Configurable number of cells
● Each cell n stores:
● Highest score in n (ub)
● Lowest score in n (lb)
● Average score for n (avg)
● Number of objects in n (freq)
Example:
Cell #10 (covers scores from 900-1000):
● ub = 989
● lb = 901
● avg = 937.4
● freq = 200
Algorithms: KLEE (Bloom filters)
(Diagram: an m-bit array with bits set at positions h1(O), h2(O), ..., hn(O) for every inserted object O; for a probe object P, if any of h1(P), ..., hn(P) points at a 0 bit, then P ∉ S.)
● Bit set with objects hashed into positions
● Allows for very fast membership queries
● Space-efficient data structure
● However, it is not invertible → the member objects cannot be recovered from the Bloom filter alone
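The slides do not show the filter implementation used; as an assumption-level illustration of the membership-query behaviour described above, Guava's BloomFilter (the project already relies on Guava elsewhere) could be used like this:

import java.nio.charset.StandardCharsets;
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

// Illustrative only (not the project's code): Guava's BloomFilter as a fast membership test
public class BloomFilterDemo {
    public static void main(String[] args) {
        final BloomFilter<CharSequence> filter =
                BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 100_000, 0.01);
        filter.put("O1");                               // object ids are hashed into the bit set
        filter.put("O89");
        System.out.println(filter.mightContain("O1"));  // true means "probably in the set" (false positives possible)
        System.out.println(filter.mightContain("O99")); // false means "definitely not in the set"
    }
}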
Algorithms: KLEE
Consists of 4 or (optionally) 3 steps
1 - Exploration Step
Approximate a min-k threshold based on statistical meta-data
2 - Optimisation Step
Decide whether to execute step 3 or go directly to step 4
3 - Candidate Filtering
Filter high-scoring candidates
4 - Candidate Retrieval
Fetch candidates from peers
Algorithms: KLEE (Phase 1)
● Fetch the top-k objects, the c “top” histogram cells + Bloom filters, and the freq and avg of the c “low” cells from each peer (peer1 ... peerm)
● For each object seen so far that was not in a given peer's top-k: if the object is in that peer's Bloom filter, use the corresponding cell's avg value as its estimated score at that peer; otherwise, use the weighted avg of the low cells
● Compute the partial sums and select the k-th score as min-k
● Objects with score > min-k / m become the candidates
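A minimal sketch of the per-peer estimation rule above, assuming hypothetical HistogramCell and filter accessors (these names are illustrative, not the project's API):

import java.util.List;
import com.google.common.hash.BloomFilter;

// Hypothetical sketch of KLEE phase-1 score estimation for one object at one peer.
class ScoreEstimator {
    interface HistogramCell {
        BloomFilter<CharSequence> filter();   // Bloom filter of the object ids in this cell
        double avg();                         // average score of the cell
    }

    // exactScore is non-null when the object appeared in this peer's top-k list
    static double estimate(String objectId, Double exactScore,
                           List<HistogramCell> topCells, double lowCellsWeightedAvg) {
        if (exactScore != null) {
            return exactScore;                // true score already known from the top-k fetch
        }
        for (HistogramCell cell : topCells) {
            if (cell.filter().mightContain(objectId)) {
                return cell.avg();            // object probably falls in this "top" cell
            }
        }
        return lowCellsWeightedAvg;           // otherwise assume it sits in the "low" cells
    }
}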
Algorithms: KLEE (Phase 3)
● Request a bit set with all objects scoring higher than min-k / m
● Perform a statistical pruning, leaving only the most “common” objects
(Note: this step was not implemented due to the computational
limitations of Cassandra nodes)
Algorithms: KLEE (Phase 4)
● Request all the candidates from the peers
● Perform a partial sum with the true scores of objects
● Select the k highest as our top-k
Cassandra
Cassandra (architecture overview)
● Fully decentralised column-family store
● High (almost linear) scalability
● No single point of failure (no “master” or “slave” nodes)
● Automatic replication
● Clients can read and write to any node in cluster
● Cassandra takes over duties of partitioning and replicating automatically
Cassandra (architecture overview)
● Automatic partitioning of data (random partitioning is the most commonly used)
● Rows are distributed across nodes by the hash of the partition key (the 1st primary-key component)
"2013-08-14"
id = O 1
score = 7919
column
table foo
nodeA
nodeB
nodeC
nodeD
hashing
(MD5) on key
... id = O n
score = 9109
id = O 1
score = 1219
... id = O n
score = 109
id = O 1
score = 59
... id = O n
score = 91
id = O 1
score = 7919
... id = On
score = 9109
id = O 1
score = 1219
... id = On
score = 109
id = O 1
score = 59
... id = On
score = 91
"2013-08-15"
"2013-08-16"
"2013-08-14"
"2013-08-15"
"2013-08-16"
Cassandra (data model)
● Columns are ordered upon insertion (ordered by the primary keys)
● Columns in the same row are physically co-located
● Range searches are fast: score < 10000
(simply a linear seek on disk)
"2013-08-16"
id = O 1
score = 7919
id = O 2
score = 7901
column
Comparator is id (ascending)
"2013-08-16" id = O 1
score = 7919
id = O 2
score = 7901
column
Comparator is score (ascending)
table_forward
table_reverse
Cassandra (CQL)
Data manipulation language for Cassandra is CQL
● Similar in syntax to SQL
INSERT INTO table (foo, bar) VALUES (42, 'Meaning')
SELECT foo, bar FROM table WHERE foo = 42
Limitations
● No joins, unions or sub-selects
● No aggregation functions (min, max, etc...)
● Inequality searches are bound to the primary-key declaration order (next slide)
Cassandra (CQL)
Consider the following table
CREATE TABLE visits(
date timestamp,
user_id bigint,
hits bigint,
PRIMARY KEY (date, user_id))
Although the following would be valid SQL queries, they are not valid CQL:
SELECT * FROM visits WHERE hits > 1000
SELECT * FROM visits WHERE user_id > 900 AND hits = 0
Inequality queries are restricted to primary-key columns and return
contiguous columns, for example:
SELECT * FROM visits WHERE date = 1368438171000 AND user_id > 1000
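The queries the algorithms rely on do fit these restrictions: restrict to a single peer row via the partition key, then read a clustering-column range in its stored order. A sketch, assuming a reverse table clustered by score like the table_reverse layout shown on the data-model slide above (table and column names are illustrative):

-- Per-peer top-k: one partition, read in clustering (score) order, stop after k rows
SELECT id, score FROM table_reverse
WHERE peer = '1998-06-01'
ORDER BY score DESC
LIMIT 10

-- Per-peer range scan (e.g. TPUT phase 2 / getAbove): all objects above a threshold
SELECT id, score FROM table_reverse
WHERE peer = '1998-06-01' AND score >= 500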
Implementation
Implementation (overview)
(Diagram: inside the JVM, the query coordinator runs the KLEE, HT and TPUT implementations against a Peer interface (peer1 ... peern); each peer issues asynchronous calls through the driver to the Cassandra nodes A-D and receives the results via callbacks.)
Implementation: challenges
Implement forward and reverse tables to allow lookup by score and id
● Space is cheap
● Space is even cheaper as Cassandra uses in-built data compression
● Space is even cheaper as denormalised data usually compresses better
than normalised data.
● Advantage of scores columns being pre-ordered at the row level
"2013-08-16"
id = O 1
score = 7919
id = O 2
score = 7901
column
Comparator is id (ascending)
"2013-08-16" id = O 1
score = 7919
id = O 2
score = 7901
column
Comparator is score (ascending)
table_forward
table_reverse
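A minimal CQL sketch of what such a forward/reverse pair can look like (the slides do not show the exact schemas, so names and types here are assumptions):

CREATE TABLE table_forward (
peer text,       -- row key, e.g. a date such as '2013-08-16'
id text,         -- object identifier (clustering column: columns ordered by id)
score bigint,
PRIMARY KEY (peer, id)
)

CREATE TABLE table_reverse (
peer text,
score bigint,    -- clustering column: columns are stored ordered by score
id text,
PRIMARY KEY (peer, score, id)
) WITH CLUSTERING ORDER BY (score DESC, id ASC)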
Implementation: challenges
Map algorithmic steps to CQL logic
Decompose tasks
● Single step in the original algorithm
(a node can execute arbitrary code)
● Multiple steps in this implementation
(we can only communicate with a node via CQL)
(Diagram: in the original algorithm, the query coordinator sends T to peer_i, the peer determines its local lowest-scoring candidate S_lowest and returns the list of candidates with score > max(T, S_lowest) in a single step. In this implementation the query coordinator first fetches the candidates from peer_i, determines S_lowest itself, computes T_i = max(T, S_lowest), and then issues a second request to fetch the objects with score > T_i.)
Implementation: TPUT (phase 1)
• The Query Coordinator (QC) asks for the top-k list from each peer 1..m by invoking the Peer async methods
• The QC stores the set of all distinct objects received in a concurrency-safe collection
• The QC calculates a partial sum for each object using a thread-safe Accumulator data structure:
S_psum(O) = S'_peer1(O) + … + S'_peerm(O), where
S'_i(O) = S_i(O) if O has been returned by peer i, and 0 otherwise

Let's assume the partial sums are:
[O89, 1590], [O73, 1590], [O1, 1000], [O21, 990], [O12, 880], [O51, 780], [O801, 680]

Calculate the first threshold: T = τ1 / m, where τ1 is the k-th highest partial sum (min-k)

(Diagram: the QC issues “fetch top-k” against the inverse-table row of each peer; e.g. peer 1 holds the score-ordered entries 1000 → O1, 900 → O89, 800 → O73, ..., 1 → O99, with similar rows for peer 2 and peer n.)
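The Accumulator itself is not shown in the slides; a minimal sketch of a thread-safe structure with the two operations used here (add and getKthValue) could look like the following, with details such as the key type being assumptions:

import java.util.Comparator;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of a thread-safe partial-sum accumulator (illustrative, not the project's class).
class Accumulator {
    private final ConcurrentHashMap<String, Long> sums = new ConcurrentHashMap<>();

    // Add one (object, score) pair reported by a peer to that object's running partial sum.
    void add(String objectId, long score) {
        sums.merge(objectId, score, Long::sum);
    }

    // The k-th highest partial sum seen so far (min-k / τ1 / τ2 in the algorithms).
    long getKthValue(int k) {
        return sums.values().stream()
                .sorted(Comparator.reverseOrder())
                .skip(Math.max(0, k - 1))
                .findFirst()
                .orElse(0L);
    }
}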
Implementation: TPUT (phase 2)
The QC issues a request for all objects with a score > T
from the inverse table (peer.getAbove(T)).
With the received objects, it recalculates the
partial sums (for each Pair → accumulator.add(pair))
and designates the k-th partial sum as
τ2 = accumulator.getKthValue(k)
(Diagram: the query coordinator issues “fetch score > T” against the inverse-table row of each peer, which returns its score-ordered entries as in the previous diagram.)
Implementation: TPUT (phase 3)
● Fetch the final candidates from the
forward table.
● Call async Peer methods
● Aggregate scores and nominate k highest
scoring as the top-k
(Diagram: the query coordinator fetches the final candidates by id from the forward-table row of each peer, e.g. peer 1 returns O1 → 1000, O89 → 900, ..., O99 → 1, and the exact scores are then aggregated.)
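Putting the three phases together, here is a deliberately simplified, self-contained sketch of the TPUT logic over in-memory maps (one map of object → local score per peer). It only illustrates the control flow described in the three implementation steps above; the real implementation issues the equivalent requests against Cassandra rows through the asynchronous Peer interface, and all names here are illustrative:

import java.util.*;
import java.util.stream.Collectors;

// Self-contained sketch of the three TPUT phases over in-memory "peers".
class TputSketch {

    static List<Map.Entry<String, Long>> topK(List<Map<String, Long>> peers, int k) {
        final int m = peers.size();

        // Phase 1: top-k list from every peer; partial sums treat missing scores as 0
        final Map<String, Long> partial = new HashMap<>();
        for (Map<String, Long> peer : peers) {
            peer.entrySet().stream()
                    .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                    .limit(k)
                    .forEach(e -> partial.merge(e.getKey(), e.getValue(), Long::sum));
        }
        final long t = kth(partial.values(), k) / m;          // T = min-k / m

        // Phase 2: every object with local score >= T; recompute the partial sums
        partial.clear();
        for (Map<String, Long> peer : peers) {
            peer.forEach((o, s) -> { if (s >= t) partial.merge(o, s, Long::sum); });
        }
        final long tau2 = kth(partial.values(), k);

        // Candidates: best-score (unreported scores bounded by T) above the threshold
        final List<String> candidates = partial.keySet().stream()
                .filter(o -> bestScore(o, peers, t) > tau2)
                .collect(Collectors.toList());

        // Phase 3: exact sums for the candidates; the k highest are the top-k
        return candidates.stream()
                .collect(Collectors.toMap(o -> o,
                        o -> peers.stream().mapToLong(p -> p.getOrDefault(o, 0L)).sum()))
                .entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    private static long bestScore(String o, List<Map<String, Long>> peers, long t) {
        long sum = 0;
        for (Map<String, Long> peer : peers) {
            long s = peer.getOrDefault(o, 0L);
            sum += (s >= t) ? s : t;          // scores below T were not reported: bound them by T
        }
        return sum;
    }

    private static long kth(Collection<Long> values, int k) {
        return values.stream().sorted(Comparator.reverseOrder())
                .skip(Math.max(0, k - 1)).findFirst().orElse(0L);
    }
}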
Implementation: challenges
Sequential vs. Random lookups
All algorithms at some point require random access.
Random access is much slower than sequential access.
(Diagram: a sequential lookup walks the score-ordered inverse-table row of a peer in storage order, while “random” lookups jump between individual entries of the forward table.)

Lookup       # objects   Time (ms)   95% CI (ms)
Sequential   240         1.70        0.27
Random       240         115.16      1.32
Sample size n = 100
Implementation: KLEE challenges
Sequential vs. Random lookups
As a consequence of the expensive random lookups, a modified KLEE3 variant was
implemented.
KLEE3-M:
In the final phase, instead of filtering candidates with score < min-k / m,
do a range scan per peer for all objects with score ≥ min-k / m.
Trade-off: more data transfer in exchange for a lower execution time.
Implementation: KLEE challenges
Mapping data structures to Cassandra's data model

CREATE TABLE table_metadata(
peer text,
cell int,
lb double,
ub double,
freq bigint,
avg double,
binmax double,
binmin double,
filter blob,
PRIMARY KEY (peer, cell)
) WITH CLUSTERING ORDER BY (cell DESC)

Serialised filter = 0x0000000600000002020100f0084263884418154205141c11
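Reading the metadata back is then an ordinary per-peer clustering-range read; a hedged sketch of the kind of statement behind getFullHistAsync (the exact statement is not shown in the slides, and c = 10 is an arbitrary example value):

SELECT cell, lb, ub, freq, avg, filter
FROM table_metadata
WHERE peer = '1998-06-01'
ORDER BY cell DESC
LIMIT 10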
Implementation: KLEE challenges
Mapping data structures to Cassandra's data model
(Diagram: for each peer, the Histogram Creator fetches the entire row, determines the maximum score and creates n equi-width bins; each object is partitioned into its bin and added to that bin's Bloom filter (e.g. cell 0: freq = 0, avg = 0; cell 1: freq = 2, avg = 4590.2; ...; cell n: freq = 10234, avg = 1.02), and the Bloom filters are serialised and saved with the row.)
Flexible:
● Configurable number of bins
● Configurable maximum false positive ratio for filters
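A compact sketch of the binning step the diagram describes, with one Guava Bloom filter per cell (the bin count, expected insertions and false-positive ratio are arbitrary example values, not the project's settings):

import java.nio.charset.StandardCharsets;
import java.util.List;
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

// Illustrative sketch: equi-width histogram cells with a Bloom filter per cell.
class HistogramSketch {
    static final int BINS = 100;

    static class Cell {
        long freq;
        double sum;
        final BloomFilter<CharSequence> filter =
                BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 100_000, 0.01);
        double avg() { return freq == 0 ? 0.0 : sum / freq; }
    }

    static Cell[] build(List<String> ids, List<Long> scores) {
        long max = Math.max(1L, scores.stream().mapToLong(Long::longValue).max().orElse(1L));
        double width = (double) max / BINS;                  // equi-width bins over [0, max]
        Cell[] cells = new Cell[BINS];
        for (int i = 0; i < BINS; i++) cells[i] = new Cell();

        for (int i = 0; i < ids.size(); i++) {
            long score = scores.get(i);
            int bin = (int) Math.min(BINS - 1, (long) (score / width));
            cells[bin].freq++;
            cells[bin].sum += score;
            cells[bin].filter.put(ids.get(i));               // object id hashed into the cell's filter
        }
        return cells;                                        // freq/avg/filter per cell, ready to serialise
    }
}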
Implementation: KLEE
(Diagram: for each peer row, the query coordinator issues getFullHistAsync and getPartialHistAsync against the metadata table, which return the cells with their freq, avg and filter values, and getTopKAsync against the inverse table, which returns score-ordered pairs such as 1000 → O1, 900 → O12, 800 → O7, 700 → O18, ..., 1 → O145. From the histogram and Bloom-filter results it estimates min-k, keeps the objects scoring above min-k as candidates, then issues getObjectsAsync against the forward table and aggregates the returned scores.)
Implementation: KLEE challenges
Simple API for Histogram/Bloom tables creation:

final HistogramCreator hc =
    new CassandraHistogramCreator(tableDefinition);
// Optionally a max false positive ratio can be defined
hc.createHistogramTableSchema();
hc.createHistogramTable("1998-05-01", … , "1998-07-26");
Implementation: KLEE challenges
● Fast generation
● Feasible for “on-the-fly” jobs
● Roughly linear, with an execution time of 56 ms per peer for 100,000 elements
Implementation: asynchronous communication
● The driver used allows for asynchronous communication
● Extensive use of ListenableFuture
● Allows for highly concurrent access with a smaller thread pool
● Allows asynchronous transformations (e.g. ResultSet to POJO)
public ListenableFuture<ResultList> getAboveAsync(final long value) {
    // Asynchronously execute the bound CQL range query
    final ResultSetFuture above = session.executeAsync(statement.bind(value));
    // Transform each returned Row into an (object, score) Pair off the calling thread
    final Function<ResultSet, ResultList> transformResults = new Function<ResultSet, ResultList>() {
        @Override
        public ResultList apply(ResultSet rs) {
            final ResultList resultList = new ResultList();
            final List<Row> rows = rs.all();
            for (final Row row : rows) {
                resultList.add(
                    Pair.create(row.getBytes(object.getName()), row.getLong(score.getName()))
                );
            }
            return resultList;
        }
    };
    return Futures.transform(above, transformResults, executor);
}
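As a usage illustration only (not code from the project), the per-peer futures can then be combined with Guava's Futures.allAsList and consumed once every peer has answered; the peers collection and threshold value are assumed to be in scope:

final List<ListenableFuture<ResultList>> futures = new ArrayList<>();
for (final Peer peer : peers) {
    futures.add(peer.getAboveAsync(threshold));
}
// Blocks until every per-peer query has completed (or fails if any of them failed)
final List<ResultList> perPeerResults = Futures.allAsList(futures).get();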
Implementation: API
{
"wc98_ids": {
"name": "wc98_ids",
"inverse": "wc98_ids_inverse",
"metadata": "wc98_ids_metadata",
"score": {
"name": "visits",
"type": "bigint"
},
"id": {
"name": "id",
"type": "text"
},
"peer": {
"name": "date",
"type": "text"
}
}
}
JSON declaration of tables and columns
final QueryCoordinator coordinator =
    QueryCoordinator.create(KLEE.class, tableDefinition);
coordinator.setKeys("1998-05-01", … , "1998-07-26");
final List<Pair> topK = coordinator.getTopK(10);
Datasets
Test data
Datasets: Synthetic (Zipf)
Used in literature as a good approximation of “real-world” data
Datasets: 1998 World Cup Data
● Data in Common Log Format (CLF) from the 1998 World Cup web servers
● IP addresses replaced by unique anonymous id
● Widely used in the literature as “real-world” test data
● Around 1.4 billion entries (approximately 2 million unique visitors)
● Range from 1st of May to 26th of July 1998
● Highly skewed data
Results
Results: varying k
Results: varying number of peers
Results: Datasets (1998 World Cup Data)
Algorithm          Data (KB)   Execution time (ms)   95% CI (ms)   Precision (%)
KLEE3              80          319.95                ±8.58         100
KLEE3-M            1271        84.75                 ±6.5          100
Hybrid Threshold   14,306      1921.9                ±65.28        100
TPUT               44          141.5                 ±7.36         100
Naive (baseline)   43,572      8514.6                ±61.38        100

Data for 18 peers = daily rows from 1st June 1998 to 18th June 1998
Sample size n = 20
Query: "Give me the top 20 visitors from 1st June to 18th June"
Implementation: Pre-aggregation
Mix and match keys for aggregation results

(Diagram: daily rows such as "2013-08-01" and "2013-08-02" hold per-visitor counts, e.g. 192.0.43.10 → 98 and 192.0.43.11 → 234, while monthly rows such as "2013-08" and "2013-09" hold pre-aggregated counts, e.g. 192.0.43.10 → 5398 and 192.0.43.11 → 23234.)

coordinator.setKeys("1998-05",
                    "1998-06",
                    "1998-07-01",
                    "1998-07-02");
final List<Pair> topK =
    coordinator.getTopK(10);

The top-k results are the same, but computed over 4 peers instead of 63 peers.
Results: Pre-aggregation
Algorithm   Data transfer (KB)               Execution time (ms)
            full     aggregated   savings    full     aggregated   savings
KLEE        20756    633          97%        2412.2   44.3         98%
HT          14404    5894         59%        4842.6   818.6        83%
TPUT        2215     61           97%        1657.1   162.2        90%
Conclusions
Conclusions
• TPUT and HT are well suited for real-time top-k queries with
minimal structural changes in the infrastructure.
• Savings of 98% (TPUT) and 77% (HT) in execution time with no
loss of precision
• Savings of 99.9% (TPUT) and 67% (HT) in data transfer also with no
loss of precision
• KLEE3 requires additional changes to infrastructure, but:
• Efficient to create
• Can discard the final fetch phase for approximate results, with a configurable
trade-off between precision and data transfer / execution time
• Savings of 99% in execution time and 97% in data transfer
Conclusions
• Scalability can be addressed with good planning of data models
together with pre-aggregation
• KLEE3 is more resilient to low object correlation (the case in real-world data)
• TPUT and KLEE3 are resilient to large variations in k, which could
have further practical implications
Future work
Implementing KLEE4
● Intravert1 is an application server built on top of a Cassandra node
● Based on the vert.x application framework
● Communication is done either in a RESTful way or directly with Java client
● Allows passing code (in several JVM languages such as Groovy, Clojure, etc)
which is executed at the “server side”
● Acting as middleware, it is possible to implement processing
(such as the candidate hash set) remotely and return it to our client
● TPUT and HT already implemented using Intravert
● KLEE4 in progress
1- https://github.com/zznate/intravert-ug
Acknowledgements
Jonathan Halliday (Red Hat)
For technical expertise, supervision and support
Questions ?

More Related Content

What's hot

Machine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning GroupMachine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning GroupTill Rohrmann
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLFlink Forward
 
FlinkML: Large Scale Machine Learning with Apache Flink
FlinkML: Large Scale Machine Learning with Apache FlinkFlinkML: Large Scale Machine Learning with Apache Flink
FlinkML: Large Scale Machine Learning with Apache FlinkTheodoros Vasiloudis
 
Deep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
Deep Dive Into Catalyst: Apache Spark 2.0'S OptimizerDeep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
Deep Dive Into Catalyst: Apache Spark 2.0'S OptimizerSpark Summit
 
Spark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer AgarwalSpark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer AgarwalSpark Summit
 
Apache Flink Deep Dive
Apache Flink Deep DiveApache Flink Deep Dive
Apache Flink Deep DiveVasia Kalavri
 
Accumulo Summit 2015: Rya: Optimizations to Support Real Time Graph Queries o...
Accumulo Summit 2015: Rya: Optimizations to Support Real Time Graph Queries o...Accumulo Summit 2015: Rya: Optimizations to Support Real Time Graph Queries o...
Accumulo Summit 2015: Rya: Optimizations to Support Real Time Graph Queries o...Accumulo Summit
 
Distributed GLM with H2O - Atlanta Meetup
Distributed GLM with H2O - Atlanta MeetupDistributed GLM with H2O - Atlanta Meetup
Distributed GLM with H2O - Atlanta MeetupSri Ambati
 
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Databricks
 
Stream data mining & CluStream framework
Stream data mining & CluStream frameworkStream data mining & CluStream framework
Stream data mining & CluStream frameworkYueshen Xu
 
Dictionary based Annotation at Scale with Spark, SolrTextTagger and OpenNLP
Dictionary based Annotation at Scale with Spark, SolrTextTagger and OpenNLPDictionary based Annotation at Scale with Spark, SolrTextTagger and OpenNLP
Dictionary based Annotation at Scale with Spark, SolrTextTagger and OpenNLPSujit Pal
 
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...Spark Summit
 
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Databricks
 
Optimizing Terascale Machine Learning Pipelines with Keystone ML
Optimizing Terascale Machine Learning Pipelines with Keystone MLOptimizing Terascale Machine Learning Pipelines with Keystone ML
Optimizing Terascale Machine Learning Pipelines with Keystone MLSpark Summit
 
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...Spark Summit
 
Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models (DB T...
Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models (DB T...Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models (DB T...
Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models (DB T...Spark Summit
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsAntonio Severien
 
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...MLconf
 
Machine learning at Scale with Apache Spark
Machine learning at Scale with Apache SparkMachine learning at Scale with Apache Spark
Machine learning at Scale with Apache SparkMartin Zapletal
 

What's hot (20)

Machine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning GroupMachine Learning with Apache Flink at Stockholm Machine Learning Group
Machine Learning with Apache Flink at Stockholm Machine Learning Group
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
 
FlinkML: Large Scale Machine Learning with Apache Flink
FlinkML: Large Scale Machine Learning with Apache FlinkFlinkML: Large Scale Machine Learning with Apache Flink
FlinkML: Large Scale Machine Learning with Apache Flink
 
Deep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
Deep Dive Into Catalyst: Apache Spark 2.0'S OptimizerDeep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
Deep Dive Into Catalyst: Apache Spark 2.0'S Optimizer
 
Spark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer AgarwalSpark Summit EU talk by Sameer Agarwal
Spark Summit EU talk by Sameer Agarwal
 
Apache Flink Deep Dive
Apache Flink Deep DiveApache Flink Deep Dive
Apache Flink Deep Dive
 
Accumulo Summit 2015: Rya: Optimizations to Support Real Time Graph Queries o...
Accumulo Summit 2015: Rya: Optimizations to Support Real Time Graph Queries o...Accumulo Summit 2015: Rya: Optimizations to Support Real Time Graph Queries o...
Accumulo Summit 2015: Rya: Optimizations to Support Real Time Graph Queries o...
 
Distributed GLM with H2O - Atlanta Meetup
Distributed GLM with H2O - Atlanta MeetupDistributed GLM with H2O - Atlanta Meetup
Distributed GLM with H2O - Atlanta Meetup
 
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
Strata NYC 2015: Sketching Big Data with Spark: randomized algorithms for lar...
 
Stream data mining & CluStream framework
Stream data mining & CluStream frameworkStream data mining & CluStream framework
Stream data mining & CluStream framework
 
Dictionary based Annotation at Scale with Spark, SolrTextTagger and OpenNLP
Dictionary based Annotation at Scale with Spark, SolrTextTagger and OpenNLPDictionary based Annotation at Scale with Spark, SolrTextTagger and OpenNLP
Dictionary based Annotation at Scale with Spark, SolrTextTagger and OpenNLP
 
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
A Scalable Hierarchical Clustering Algorithm Using Spark: Spark Summit East t...
 
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
 
Optimizing Terascale Machine Learning Pipelines with Keystone ML
Optimizing Terascale Machine Learning Pipelines with Keystone MLOptimizing Terascale Machine Learning Pipelines with Keystone ML
Optimizing Terascale Machine Learning Pipelines with Keystone ML
 
18 Data Streams
18 Data Streams18 Data Streams
18 Data Streams
 
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
 
Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models (DB T...
Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models (DB T...Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models (DB T...
Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models (DB T...
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
 
Machine learning at Scale with Apache Spark
Machine learning at Scale with Apache SparkMachine learning at Scale with Apache Spark
Machine learning at Scale with Apache Spark
 

Similar to Efficient top-k queries processing in column-family distributed databases

Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Josef Hardi
 
ensembles_emptytemplate_v2
ensembles_emptytemplate_v2ensembles_emptytemplate_v2
ensembles_emptytemplate_v2Shrayes Ramesh
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...HostedbyConfluent
 
Wait-free data structures on embedded multi-core systems
Wait-free data structures on embedded multi-core systemsWait-free data structures on embedded multi-core systems
Wait-free data structures on embedded multi-core systemsMenlo Systems GmbH
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging EnvironmentsPaul Groth
 
ntcir14centre-overview
ntcir14centre-overviewntcir14centre-overview
ntcir14centre-overviewTetsuya Sakai
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Codemotion
 
RAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme ScalesRAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme ScalesIan Foster
 
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn SchedulerCloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn SchedulerDatabricks
 
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Saliya Ekanayake
 
Update on Trinity System Procurement and Plans
Update on Trinity System Procurement and PlansUpdate on Trinity System Procurement and Plans
Update on Trinity System Procurement and Plansinside-BigData.com
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2Mohit Garg
 
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화NAVER Engineering
 
Deep learning with Keras
Deep learning with KerasDeep learning with Keras
Deep learning with KerasQuantUniversity
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCAapo Kyrölä
 

Similar to Efficient top-k queries processing in column-family distributed databases (20)

Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!
 
NTCIR15WWW3overview
NTCIR15WWW3overviewNTCIR15WWW3overview
NTCIR15WWW3overview
 
ensembles_emptytemplate_v2
ensembles_emptytemplate_v2ensembles_emptytemplate_v2
ensembles_emptytemplate_v2
 
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
Automating Speed: A Proven Approach to Preventing Performance Regressions in ...
 
Wait-free data structures on embedded multi-core systems
Wait-free data structures on embedded multi-core systemsWait-free data structures on embedded multi-core systems
Wait-free data structures on embedded multi-core systems
 
Provenance for Data Munging Environments
Provenance for Data Munging EnvironmentsProvenance for Data Munging Environments
Provenance for Data Munging Environments
 
ntcir14centre-overview
ntcir14centre-overviewntcir14centre-overview
ntcir14centre-overview
 
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
Managing your Black Friday Logs - Antonio Bonuccelli - Codemotion Rome 2018
 
RAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme ScalesRAMSES: Robust Analytic Models for Science at Extreme Scales
RAMSES: Robust Analytic Models for Science at Extreme Scales
 
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn SchedulerCloud-Native Apache Spark Scheduling with YuniKorn Scheduler
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
 
2017 nov reflow sbtb
2017 nov reflow sbtb2017 nov reflow sbtb
2017 nov reflow sbtb
 
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...Java Thread and Process Performance for Parallel Machine Learning on Multicor...
Java Thread and Process Performance for Parallel Machine Learning on Multicor...
 
Update on Trinity System Procurement and Plans
Update on Trinity System Procurement and PlansUpdate on Trinity System Procurement and Plans
Update on Trinity System Procurement and Plans
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
Intro_2.ppt
Intro_2.pptIntro_2.ppt
Intro_2.ppt
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
 
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
대용량 데이터 분석을 위한 병렬 Clustering 알고리즘 최적화
 
Deep learning with Keras
Deep learning with KerasDeep learning with Keras
Deep learning with Keras
 
Large-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PCLarge-scale Recommendation Systems on Just a PC
Large-scale Recommendation Systems on Just a PC
 

Recently uploaded

The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 

Recently uploaded (20)

DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 

Efficient top-k queries processing in column-family distributed databases

  • 1. 14/08/13 Rui Vieira, MSc ITEC 1 Efficient top-k query processing on distributed column family databases Efficient top-k query processing on distributed column family databases
  • 2. 14/08/13 Rui Vieira, MSc ITEC 2 Efficient top-k query processing on distributed column family databases Ranking (top-k) queriesRanking (top-k) queries We use top-k queries everydayWe use top-k queries everyday ● Search engines (top 100 pages for certain words) ● Analytics applications (most visited pages per day) Text search: Time periods:
  • 3. 14/08/13 Rui Vieira, MSc ITEC 3 Efficient top-k query processing on distributed column family databases Ranking (top-k) queriesRanking (top-k) queries DefinitionDefinition Find all k objects with the highest aggregated score over function f (f is usually a summation function over attributes) Example: Find the top 10 students with highest grades over all modules. ... Module n ... Module 2 John, 89% Emma, 88% Brian, 70% Steve, 65% Anna, 60% Peter, 59% Paul, 50% Mary, 49% Richard, 31% ... Module 1 ... John, 39% Emma, 48% Brian, 50% Steve, 75% Anna, 50% Peter, 59% Paul, 80% Mary, 89% Richard, 91% John, 82% Emma, 78% Brian, 90% Steve, 85% Anna, 83% Peter, 81% Paul, 70% Mary, 59% Richard, 51%
  • 4. 14/08/13 Rui Vieira, MSc ITEC 4 Efficient top-k query processing on distributed column family databases Motivation: real-time distributed top-k queriesMotivation: real-time distributed top-k queries Why real-time top-k queries? • To be integrated in a larger real-time analytics platform ● “User” real-time = hundred milliseconds ~ one second • Implement solutions make efficient use of: • Memory, Bandwidth and Computations • Can handle massive amounts of data Use case: We logging page views in a website. Can we find the top 10 most visited in the last 7 days? What about 10 months? All under 1 second?
  • 5. 14/08/13 Rui Vieira, MSc ITEC 5 Efficient top-k query processing on distributed column family databases Top-k queries: simplistic solutionTop-k queries: simplistic solution “Naive” method • Fetch all objects and scores from all sources • Aggregate them in memory • Sort all aggregations • Select top-k highest scoring Solutions to provide ranking queries answers (but not real-time): <O 1 , 1000> <O 89 , 900> <O 99 , 1> ...peer 1 Query Coordinator peer 2 ... peer n merge all data aggregate scores sort all aggregated select k highest Not feasible: • For large amounts of data • Possibly doesn't fit in RAM • Execution time most likely not real-time • Not efficient: low-scoring objects processed • Due to all of the above: not scalable
  • 6. 14/08/13 Rui Vieira, MSc ITEC 6 Efficient top-k query processing on distributed column family databases Top-k queries: Batch solutionsTop-k queries: Batch solutions Batch operations (Hadoop / Map-Reduce) Pros • Proven solution to (some) top-k scenarios • Excellent for “report” style use cases Cons • Still has to process all the information • Not real-time
  • 7. 14/08/13 Rui Vieira, MSc ITEC 7 Efficient top-k query processing on distributed column family databases Our requirements ● Work with “Peers” which are distributed logically (rows) as well as physically (nodes) ● Nodes in the cluster have (very) limited instructions ● Low latency (fixed number of round-trips) ● Offer considerable savings of bandwidth and execution time ● Possible to adapt to data access patterns and models in Cassandra
  • 8. 14/08/13 Rui Vieira, MSc ITEC 8 Efficient top-k query processing on distributed column family databases Algorithms
  • 9. 14/08/13 Rui Vieira, MSc ITEC 9 Efficient top-k query processing on distributed column family databases Algorithms: related Work Threshold family of algorithms pioneered by Faggins et al. Objective: determine a threshold below which an object cannot be a top-k object Initial Threshold Algorithms (TA) however: • Not designed with distributed data sources in mind • Performance highly dependent on data shape (skewness, correlation ...) • Unbounded round-trips to data source → unbounded latency • TA keeps performing random accesses until it reaches a stopping point
  • 10. 14/08/13 Rui Vieira, MSc ITEC 10 Efficient top-k query processing on distributed column family databases Algorithms: Related Work Three algorithms were selected: • Three-Phase Uniform Threshold (TPUT) • Distributed fixed round-trip exact algorithm • Hybrid Threshold • Distributed fixed round-trip exact algorithm • KLEE • Distributed fixed round-trip approximate algorithm • However these algorithms were developed for P2P networks • As far as we know, they have never been implemented with distributed column-family databases previously
  • 11. 14/08/13 Rui Vieira, MSc ITEC 11 Efficient top-k query processing on distributed column family databases Algorithms: TPUT Request top-k From each peer peer1 peer2 peer3 peer4 peerm calculate a Partial sum select kth score As min-k Request all objects with score⩾ mink m re-calculate a Partial sum select kth score as threshold Request all objects with score > threshold Partial sum (missing scores = 0) Partial sum (missing scores = min-k/m) worst-score best-score Best-score > worst-score = candidate Request candidates peer1 peer2 peer3 peer4 peerm peer1 peer2 peer3 peer4 peerm peer1 peer2 peer3 peer4 peerm Final partial sum K highest are top-k
  • 12. 14/08/13 Rui Vieira, MSc ITEC 12 Efficient top-k query processing on distributed column family databases Algorithms: Hybrid Threshold Phase 1: same as in TPUT, i.e. the objective is to determine the first threshold T = min-k / m. Phase 2: send the candidates seen so far and T to each peer (peer1 … peerm); each peer determines its lowest-scoring candidate Slowest and returns all objects with score ≥ Ti = max(Slowest, T). Phase 3: re-calculate the partial sums and select the kth as τ2; for every peer where Ti < τ2 / m, fetch the objects with score > τ2 / m, re-calculate the partial sums and select the kth as τ3; the final candidates are the objects with partial sum > τ3.
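As a small illustration of the phase-2 step, the per-peer threshold could be computed as below. The map, set and method names are hypothetical, and in this project the coordinator has to emulate this step over CQL rather than run it on the node (see the implementation challenges later).

import java.util.Map;
import java.util.Objects;
import java.util.Set;

class HybridThresholdSketch {
    // Phase 2, from a single peer's point of view: Ti = max(Slowest, T),
    // where Slowest is this peer's lowest score among the current candidates.
    static double localThreshold(Map<String, Long> peerScores,
                                 Set<String> candidates, double t) {
        final double sLowest = candidates.stream()
                .map(peerScores::get)
                .filter(Objects::nonNull)
                .mapToLong(Long::longValue)
                .min().orElse(0L);
        return Math.max(sLowest, t);
    }
}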
  • 13. 14/08/13 Rui Vieira, MSc ITEC 13 Efficient top-k query processing on distributed column family databases Algorithms: KLEE • TPUT variant • Trade-off between accuracy and bandwidth • Relies on summary data (statistical meta-data) to better estimate min-k without going “deep” on index lists Fundamental data structures for meta-data: • Histograms • Bloom filters
  • 14. 14/08/13 Rui Vieira, MSc ITEC 14 Efficient top-k query processing on distributed column family databases Algorithms: KLEE (Histograms) ● Equi-width cells ● Configurable number of cells ● Each cell n stores: ● Highest score in n (ub) ● Lowest score in n (lb) ● Average score for n (avg) ● Number of objects in n (freq) Example: Cell #10 (covers scores from 900-1000): ● ub = 989 ● lb = 901 ● avg = 937.4 ● freq = 200
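A minimal sketch of the per-cell statistics, assuming equi-width cells over [0, maxScore); the class and field names mirror the slide (ub, lb, avg, freq) but are not the project's actual types.

// Illustrative only: one equi-width histogram cell.
final class HistogramCell {
    double ub = Double.NEGATIVE_INFINITY; // highest score seen in this cell
    double lb = Double.POSITIVE_INFINITY; // lowest score seen in this cell
    double sum = 0;                       // running sum, so avg = sum / freq
    long freq = 0;                        // number of objects in this cell

    void add(double score) {
        ub = Math.max(ub, score);
        lb = Math.min(lb, score);
        sum += score;
        freq++;
    }

    double avg() {
        return freq == 0 ? 0 : sum / freq;
    }

    // Index of the equi-width cell a score falls into.
    static int cellFor(double score, int cells, double maxScore) {
        return Math.min(cells - 1, (int) (score / maxScore * cells));
    }
}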
  • 15. 14/08/13 Rui Vieira, MSc ITEC 15 Efficient top-k query processing on distributed column family databases Algorithms: KLEE (Bloom filters) [Diagram: a bit array of m positions; hash functions h1(O) … hn(O) set the bits for an inserted object O, while one of h1(P) … hn(P) lands on a 0 bit, therefore P ∉ S] ● Bit set with objects hashed into positions ● Allows for very fast membership queries ● Space-efficient data structure ● However, lossy (not invertible) → cannot recover the objects from the Bloom filter alone
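A short sketch of the membership test using Guava's BloomFilter (the project serialises filters into a Cassandra blob column; the expected-insertions and false-positive values here are arbitrary examples):

import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

public class BloomSketch {
    public static void main(String[] args) {
        // One filter per histogram cell: the expected insertions and the
        // maximum false-positive ratio are both configurable.
        BloomFilter<String> cellFilter = BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8), 10_000, 0.01);
        cellFilter.put("O_42");
        System.out.println(cellFilter.mightContain("O_42")); // true (no false negatives)
        System.out.println(cellFilter.mightContain("O_99")); // false means certainly not in the cell
    }
}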
  • 16. 14/08/13 Rui Vieira, MSc ITEC 16 Efficient top-k query processing on distributed column family databases Algorithms: KLEE Consists of 4 or (optionally) 3 steps 1 - Exploration Step: approximate a min-k threshold based on statistical meta-data 2 - Optimisation Step: decide whether to execute step 3 or go directly to step 4 3 - Candidate Filtering: filter high-scoring candidates 4 - Candidate Retrieval: fetch candidates from peers
  • 17. 14/08/13 Rui Vieira, MSc ITEC 17 Efficient top-k query processing on distributed column family databases Algorithms: KLEE (Phase 1) From each peer (peer1 … peerm): fetch the top-k objects, the c “top” histogram cells together with their Bloom filters, and the freq and avg of the “low” cells. For each object seen so far: if it was in a peer's top-k, use its true score; otherwise, if the object is found in a cell's Bloom filter, use the corresponding avg value; if not, use the weighted avg of the low cells. Compute the partial sums, select the kth as min-k; objects with estimated score > min-k / m become candidates.
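A sketch of the per-peer score estimate used in phase 1, reusing the hypothetical HistogramCell and Bloom filter sketches above; the method name and the way the weighted average of the low cells is passed in are assumptions, not the actual implementation.

import com.google.common.hash.BloomFilter;
import java.util.List;

class KleeEstimateSketch {
    // exactScore is non-null only if this peer reported the object in its top-k.
    static double estimate(String objectId, Long exactScore,
                           List<HistogramCell> topCells,
                           List<BloomFilter<String>> topFilters,
                           double lowCellsWeightedAvg) {
        if (exactScore != null) {
            return exactScore;                      // true score is known
        }
        for (int i = 0; i < topCells.size(); i++) {
            if (topFilters.get(i).mightContain(objectId)) {
                return topCells.get(i).avg();       // use that cell's average score
            }
        }
        return lowCellsWeightedAvg;                 // frequency-weighted avg of the "low" cells
    }
}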
  • 18. 14/08/13 Rui Vieira, MSc ITEC 18 Efficient top-k query processing on distributed column family databases Algorithms: KLEE (Phase 3) ● Request a bit set with all objects scoring higher than min-k / m ● Perform a statistical pruning, leaving only the most “common” objects (Note: this step was not implemented due to the computational limitations of Cassandra nodes)
  • 19. 14/08/13 Rui Vieira, MSc ITEC 19 Efficient top-k query processing on distributed column family databases Algorithms: KLEE (Phase 4) ● Request all the candidates from the peers ● Perform a partial sum with the true scores of objects ● Select the k highest as our top-k
  • 20. 14/08/13 Rui Vieira, MSc ITEC 20 Efficient top-k query processing on distributed column family databases Cassandra
  • 21. 14/08/13 Rui Vieira, MSc ITEC 21 Efficient top-k query processing on distributed column family databases Cassandra (architecture overview) ● Fully decentralised column-family store ● High (almost linear) scalability ● No single point of failure (no “master” or “slave” nodes) ● Automatic replication ● Clients can read and write to any node in the cluster ● Cassandra takes over the duties of partitioning and replicating automatically
  • 22. 14/08/13 Rui Vieira, MSc ITEC 22 Efficient top-k query processing on distributed column family databases Cassandra (architecture overview) ● Automatic partitioning of data (random partitioning is commonly used) ● Rows are distributed across nodes by the hash of the partition key (the first primary key component) [Diagram: rows of table foo keyed by "2013-08-14", "2013-08-15", "2013-08-16", each holding (id, score) columns, assigned to nodes A-D by MD5 hashing of the row key]
  • 23. 14/08/13 Rui Vieira, MSc ITEC 23 Efficient top-k query processing on distributed column family databases Cassandra (data model) ● Columns are ordered upon insertion (ordered by the declared primary keys) ● Columns in the same row are physically co-located ● Range searches are fast, e.g. score < 10000 (simply a linear seek on disk) [Diagram: table_forward stores a row's (id, score) columns with id as the comparator (ascending); table_reverse stores the same data with score as the comparator (ascending)]
  • 24. 14/08/13 Rui Vieira, MSc ITEC 24 Efficient top-k query processing on distributed column family databases Cassandra (CQL) The data manipulation language for Cassandra is CQL ● Similar in syntax to SQL INSERT INTO table (foo, bar) VALUES (42, 'Meaning') SELECT foo, bar FROM table WHERE foo = 42 Limitations ● No joins, unions or sub-selects ● No aggregation functions (min, max, etc.) ● Inequality searches are bound to the primary key declaration order (next slide)
  • 25. 14/08/13 Rui Vieira, MSc ITEC 25 Efficient top-k query processing on distributed column family databases Cassandra (CQL) Consider the following table: CREATE TABLE visits( date timestamp, user_id bigint, hits bigint, PRIMARY KEY (date, user_id)) Although the following queries would be valid SQL, they are not valid CQL: SELECT * FROM visits WHERE hits > 1000 SELECT * FROM visits WHERE user_id > 900 AND hits = 0 Inequality queries are restricted to primary keys and return contiguous columns, such as: SELECT * FROM visits WHERE date = 1368438171000 AND user_id > 1000
  • 26. 14/08/13 Rui Vieira, MSc ITEC 26 Efficient top-k query processing on distributed column family databases Implementation
  • 27. 14/08/13 Rui Vieira, MSc ITEC 27 Efficient top-k query processing on distributed column family databases Implementation (overview) [Architecture diagram: the TPUT, Hybrid Threshold and KLEE query coordinators run inside a single JVM and talk to logical peers (peer1 … peern) through a Peer interface; the driver issues asynchronous calls to the Cassandra nodes (nodeA-nodeD) and receives results via callbacks]
  • 28. 14/08/13 Rui Vieira, MSc ITEC 28 Efficient top-k query processing on distributed column family databases Implementation: challenges Implement forward and reverse tables to allow lookup by score and by id ● Space is cheap ● Space is even cheaper as Cassandra uses built-in data compression ● Space is even cheaper as denormalised data usually compresses better than normalised data ● Advantage: score columns are pre-ordered at the row level [Diagram: table_forward with id as the comparator (ascending), table_reverse with score as the comparator (ascending), both holding the same (id, score) data]
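For illustration, the two tables could be declared as below, executed through the DataStax Java driver's session; the column names follow the wc98 example used later in the API slide, but the exact schema is an assumption.

// Forward table: look up an object's score by (peer, id).
session.execute(
    "CREATE TABLE wc98_ids (" +
    "  date text, id text, visits bigint," +
    "  PRIMARY KEY (date, id))");

// Inverse table: columns clustered by score, so range queries such as
// "visits >= ?" within a row are sequential reads.
session.execute(
    "CREATE TABLE wc98_ids_inverse (" +
    "  date text, visits bigint, id text," +
    "  PRIMARY KEY (date, visits, id)" +
    ") WITH CLUSTERING ORDER BY (visits DESC, id ASC)");

With this layout a query such as SELECT id, visits FROM wc98_ids_inverse WHERE date = '1998-06-01' AND visits >= 1000 stays within one partition and reads contiguous columns.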
  • 29. 14/08/13 Rui Vieira, MSc ITEC 29 Efficient top-k query processing on distributed column family databases Implementation: challenges Map algorithmic steps to CQL logic: decompose tasks ● Single step in the algorithm (a node can execute arbitrary code): the Query Coordinator sends T; the peer determines its local lowest-scoring candidate Slowest and returns the candidates with score > max(T, Slowest) in one round. ● Multiple steps in this implementation (we can only communicate with a node via CQL): the Query Coordinator first fetches the peer's local lowest-scoring candidate Slowest, computes Ti = max(T, Slowest) itself, and then fetches the objects with score > Ti from the peer.
  • 30. 14/08/13 Rui Vieira, MSc ITEC 30 Efficient top-k query processing on distributed column family databases Implementation: TPUT (phase 1) • The Query Coordinator (QC) asks for the top-k list from each peer 1..m by invoking the Peer async methods • QC stores the set of all distinct objects received in a concurrency-safe collection • QC calculates a partial sum for each object using a thread-safe Accumulator data structure: Spsum(O) = S'peer1(O) + … + S'peerm(O), where S'i(O) = Si(O) if O has been returned by peer i, and 0 otherwise • Let's assume the partial sums are: [O89, 1590], [O73, 1590], [O1, 1000], [O21, 990], [O12, 880], [O51, 780], [O801, 680] • Calculate the first threshold T = τ1 / m, where τ1 is the kth highest partial sum [Diagram: the QC fetches the top-k from each peer's inverse table (peer 1 … peer n), each row ordered by descending score]
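A minimal sketch (Java 8) of what such a thread-safe accumulator could look like; the project's real Accumulator class is only assumed to behave like this.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class AccumulatorSketch {
    private final ConcurrentHashMap<String, LongAdder> sums = new ConcurrentHashMap<>();

    // Called concurrently from the async callback of each peer.
    public void add(String objectId, long score) {
        sums.computeIfAbsent(objectId, k -> new LongAdder()).add(score);
    }

    // k-th highest partial sum seen so far (tau1 in phase 1, tau2 in phase 2).
    public long getKthValue(int k) {
        final List<Long> values = new ArrayList<>();
        sums.values().forEach(adder -> values.add(adder.sum()));
        values.sort(Collections.reverseOrder());
        return values.get(Math.min(k, values.size()) - 1);
    }
}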
  • 31. 14/08/13 Rui Vieira, MSc ITEC 31 Efficient top-k query processing on distributed column family databases Implementation: TPUT (phase 2) QC issues a request for all objects with a score > T from the inverse table (peer.getAbove(T)). With the received objects, it recalculates the partial sums (for each Pair → accumulator.add(pair)) and designates the kth partial sum as τ2 = accumulator.getKthValue(k). [Diagram: the QC fetches all entries with score > T from each peer's inverse table]
  • 32. 14/08/13 Rui Vieira, MSc ITEC 32 Efficient top-k query processing on distributed column family databases Implementation: TPUT (phase 3) ● Fetch the final candidates from the forward table ● Call the async Peer methods ● Aggregate the scores and nominate the k highest scoring as the top-k [Diagram: the QC fetches the final candidates' exact scores from each peer's forward table]
  • 33. 14/08/13 Rui Vieira, MSc ITEC 33 Efficient top-k query processing on distributed column family databases Implementation: challenges Sequential vs. random lookups: all algorithms at some point require random access, and random access is much slower than sequential access. [Diagram: sequential scan over a score-ordered row vs "random" lookups of individual objects by id]
Lookup | # objects | Time (ms) | 95% CI (ms)
Sequential | 240 | 1.70 | 0.27
Random | 240 | 115.16 | 1.32
Sample size n = 100
  • 34. 14/08/13 Rui Vieira, MSc ITEC 34 Efficient top-k query processing on distributed column family databases Implementation: KLEE challenges Sequential vs. random lookups: as a consequence of expensive random lookups, a modified KLEE3 variant was implemented, KLEE3-M. In the final phase, instead of filtering out candidates with score < min-k / m and fetching the rest individually, do a range scan per peer for all objects with score ≥ min-k / m. Trade-off: more data transfer for less execution time.
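The difference between the two variants in the final phase, using the hypothetical Peer interface from the TPUT sketch above:

import java.util.Map;
import java.util.Set;

class Klee3MSketch {
    // KLEE3: fetch each candidate's exact score by id, i.e. many random lookups.
    static Map<String, Long> fetchKlee3(Peer peer, Set<String> candidates) {
        return peer.getScores(candidates);
    }

    // KLEE3-M: one sequential range scan per peer for score >= min-k / m;
    // more data is transferred, but the expensive random lookups are avoided.
    static Map<String, Long> fetchKlee3M(Peer peer, double minK, int m) {
        return peer.getAbove(minK / m);
    }
}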
  • 35. 14/08/13 Rui Vieira, MSc ITEC 35 Efficient top-k query processing on distributed column family databases Implementation: KLEE challenges Mapping data structures to Cassandra's data model:
CREATE TABLE table_metadata(
  peer text,
  cell int,
  lb double,
  ub double,
  freq bigint,
  avg double,
  binmax double,
  binmin double,
  filter blob,
  PRIMARY KEY (peer, cell)
) WITH CLUSTERING ORDER BY (cell DESC)
Serialised filter = 0x0000000600000002020100f0084263884418154205141c11
  • 36. 14/08/13 Rui Vieira, MSc ITEC 36 Efficient top-k query processing on distributed column family databases Implementation: KLEE challenges Mapping data structures to Cassandra's data model: the Histogram Creator fetches the entire row from peeri, determines the maximum score and creates n equi-width bins, partitions the objects into the bins (tracking freq and avg per cell, e.g. cell=5: freq=986, avg=56.7) and adds each object to that bin's Bloom filter (filter0 … filtern), then serialises the Bloom filters and saves the metadata row. Flexible: ● Configurable number of bins ● Configurable maximum false-positive ratio for the filters
  • 37. 14/08/13 Rui Vieira, MSc ITEC 37 Efficient top-k query processing on distributed column family databases Implementation: KLEE [Sequence diagram: the Query Coordinator calls getFullHistAsync / getPartialHistAsync on each peer's metadata table (cells with freq, avg and filter) and getTopKAsync on the inverse table, estimates min-k from the histogram and Bloom-filter results, then calls getObjectsAsync on the forward table for the objects above min-k and aggregates the results]
  • 38. 14/08/13 Rui Vieira, MSc ITEC 38 Efficient top-k query processing on distributed column family databases Implementation: KLEE challenges Simple API for Histogram/Bloom table creation:
final HistogramCreator hc = new CassandraHistogramCreator(tableDefinition);
// Optionally a max false positive ratio can be defined
hc.createHistogramTableSchema();
hc.createHistogramTable("1998-05-01", … , "1998-07-26");
  • 39. 14/08/13 Rui Vieira, MSc ITEC 39 Efficient top-k query processing on distributed column family databases Implementation: KLEE challenges Fast generation ● Feasible for “on-the-fly” jobs ● Roughly linear, with an execution time of about 56 ms per peer for 100,000 elements
  • 40. 14/08/13 Rui Vieira, MSc ITEC 40 Efficient top-k query processing on distributed column family databases Implementation: asynchronous communication ● The driver used allows for asynchronous communication ● Extensive use of ListenableFuture ● Allows highly concurrent access with a smaller thread pool ● Allows asynchronous transformations (e.g. ResultSet to POJO)
public ListenableFuture<ResultList> getAboveAsync(final long value) {
    final ResultSetFuture above = session.executeAsync(statement.bind(value));
    final Function<ResultSet, ResultList> transformResults = new Function<ResultSet, ResultList>() {
        @Override
        public ResultList apply(ResultSet rs) {
            final ResultList resultList = new ResultList();
            final List<Row> rows = rs.all();
            for (final Row row : rows) {
                resultList.add(Pair.create(row.getBytes(object.getName()),
                                           row.getLong(score.getName())));
            }
            return resultList;
        }
    };
    return Futures.transform(above, transformResults, executor);
}
  • 41. 14/08/13 Rui Vieira, MSc ITEC 41 Efficient top-k query processing on distributed column family databases Implementation: API JSON declaration of tables and columns:
{
  "wc98_ids": {
    "name": "wc98_ids",
    "inverse": "wc98_ids_inverse",
    "metadata": "wc98_ids_metadata",
    "score": { "name": "visits", "type": "bigint" },
    "id": { "name": "id", "type": "text" },
    "peer": { "name": "date", "type": "text" }
  }
}
final QueryCoordinator coordinator = QueryCoordinator.create(KLEE.class, tableDefinition);
coordinator.setKeys("1998-05-01", … , "1998-07-26");
final List<Pair> topK = coordinator.getTopK(10);
  • 42. 14/08/13 Rui Vieira, MSc ITEC 42 Efficient top-k query processing on distributed column family databases Datasets Test data
  • 43. 14/08/13 Rui Vieira, MSc ITEC 43 Efficient top-k query processing on distributed column family databases Datasets: Synthetic (Zipf) Used in literature as a good approximation of “real-world” data
  • 44. 14/08/13 Rui Vieira, MSc ITEC 44 Efficient top-k query processing on distributed column family databases Datasets: 1998 World Cup Data ● Data in Common Log Format (CLF) from the 1998 World Cup web servers ● IP addresses replaced by unique anonymous id ● Widely used in the literature as “real-world” test data ● Around 1.4 billion entries (approximately 2 million unique visitors) ● Range from 1st of May to 26th of July 1998 ● Highly skewed data
  • 45. 14/08/13 Rui Vieira, MSc ITEC 45 Efficient top-k query processing on distributed column family databases Results
  • 46. 14/08/13 Rui Vieira, MSc ITEC 46 Efficient top-k query processing on distributed column family databases Results: varying k
  • 47. 14/08/13 Rui Vieira, MSc ITEC 47 Efficient top-k query processing on distributed column family databases Results: varying number of peers
  • 48. 14/08/13 Rui Vieira, MSc ITEC 48 Efficient top-k query processing on distributed column family databases Results: Datasets (1998 World Cup Data) Query: "Give me the top 20 visitors from 1st June to 18th June" (data for 18 peers = daily rows from 1st June 1998 to 18th June 1998; sample size n = 20)
Algorithm | Data (KB) | Execution time (ms) | 95% CI (ms) | Precision (%)
KLEE3 | 80 | 319.95 | ±8.58 | 100
KLEE3-M | 1,271 | 84.75 | ±6.5 | 100
Hybrid Threshold | 14,306 | 1,921.9 | ±65.28 | 100
TPUT | 44 | 141.5 | ±7.36 | 100
Naive (baseline) | 43,572 | 8,514.6 | ±61.38 | 100
  • 49. 14/08/13 Rui Vieira, MSc ITEC 49 Efficient top-k query processing on distributed column family databases Implementation: Pre-aggregation Mix and match keys for aggregation results: alongside the daily rows (e.g. "2013-08-01", "2013-08-02" with per-visitor hits), keep pre-aggregated monthly rows (e.g. "2013-08", "2013-09") holding the summed hits per visitor.
coordinator.setKeys("1998-05", "1998-06", "1998-07-01", "1998-07-02");
final List<Pair> topK = coordinator.getTopK(10);
The top-k results are the same, but computed over 4 peers instead of 63 peers.
  • 50. 14/08/13 Rui Vieira, MSc ITEC 50 Efficient top-k query processing on distributed column family databases Results: Pre-aggregation
Algorithm | Data transfer (KB): full / aggregated / savings | Execution time (ms): full / aggregated / savings
KLEE | 20,756 / 633 / 97% | 2,412.2 / 44.3 / 98%
HT | 14,404 / 5,894 / 59% | 4,842.6 / 818.6 / 83%
TPUT | 2,215 / 61 / 97% | 1,657.1 / 162.2 / 90%
  • 51. 14/08/13 Rui Vieira, MSc ITEC 51 Efficient top-k query processing on distributed column family databases Conclusions
  • 52. 14/08/13 Rui Vieira, MSc ITEC 52 Efficient top-k query processing on distributed column family databases Conclusions • TPUT and HT are well suited for real-time top-k queries with minimal structural changes to the infrastructure • Savings of 98% (TPUT) and 77% (HT) in execution time with no loss of precision • Savings of 99.9% (TPUT) and 67% (HT) in data transfer, also with no loss of precision • KLEE3 requires additional changes to the infrastructure, but: • Its meta-data is efficient to create • The final fetch phase can be discarded for approximate results, with a configurable trade-off between precision and data transfer / execution time • Savings of 99% in execution time and 97% in data transfer
  • 53. 14/08/13 Rui Vieira, MSc ITEC 53 Efficient top-k query processing on distributed column family databases Conclusions • Scalability can be addressed with good planning of data models together with pre-aggregation • KLEE3 is more resilient to low object correlation (the case in real-world data) • TPUT and KLEE3 are resilient to high variations of k, which could have further practical implications
  • 54. 14/08/13 Rui Vieira, MSc ITEC 54 Efficient top-k query processing on distributed column family databases Future work Implementing KLEE4 ● Intravert¹ is an application server built on top of a Cassandra node ● Based on the vert.x application framework ● Communication is done either in a RESTful way or directly with a Java client ● Allows passing code (in several JVM languages such as Groovy, Clojure, etc.) which is executed on the “server side” ● Acting as middleware, it makes it possible to implement processing (such as the candidate hash set) remotely and return the result to our client ● TPUT and HT are already implemented using Intravert ● KLEE4 is in progress ¹ https://github.com/zznate/intravert-ug
  • 55. 14/08/13 Rui Vieira, MSc ITEC 55 Efficient top-k query processing on distributed column family databases Acknowledgements Jonathan Halliday (Red Hat) For technical expertise, supervision and support
  • 56. 14/08/13 Rui Vieira, MSc ITEC 56 Efficient top-k query processing on distributed column family databases Questions ?