Kudu: A Columnar Store for Fast Analytics on Fast Data

1
©
Cloudera,
Inc.
All
rights
reserved.

David
Alves
on
behalf
of
the
Kudu
team

Kudu:
Resolving
Transac@onal

and
Analy@c
Trade-‐oﬀs
in

Hadoop

1

2
©
Cloudera,
Inc.
All
rights
reserved.

Kudu

Storage
for
Fast
Analy@cs
on
Fast
Data

•  New
upda@ng
column
store
for

Hadoop

•  Apache-‐licensed
open
source

•  Beta
now
available

Columnar
Store

Kudu

3
©
Cloudera,
Inc.
All
rights
reserved.

Mo@va@on
and
Goals

Why
build
Kudu?

3

4
©
Cloudera,
Inc.
All
rights
reserved.

Mo@va@ng
Ques@ons

•  Are
there
user
problems
that
can
we
can’t
address
because
of
gaps
in
Hadoop

ecosystem
storage
technologies?

•  Are
we
posi@oned
to
take
advantage
of
advancements
in
the
hardware

landscape?

5
©
Cloudera,
Inc.
All
rights
reserved.

Current
Storage
Landscape
in
Hadoop

HDFS
excels
at:

•  Efficiently
scanning
large
amounts

of
data

•  Accumula@ng
data
with
high

throughput

HBase
excels
at:

•  Efficiently
finding
and
wri@ng

individual
rows

•  Making
data
mutable

Gaps
exist
when
these
proper@es

are
needed
simultaneously

6
©
Cloudera,
Inc.
All
rights
reserved.

Changing
Hardware
landscape

•  Spinning
disk
-‐>
solid
state
storage

• NAND
ﬂash:
Up
to
450k
read
250k
write
iops,
about
2GB/sec
read
and
1.5GB/
sec
write
throughput,
at
a
price
of
less
than
$3/GB
and
dropping

• 3D
XPoint
memory
(1000x
faster
than
NAND,
cheaper
than
RAM)

•  RAM
is
cheaper
and
more
abundant:

• 64-‐>128-‐>256GB
over
last
few
years

•  Takeaway
1:
The
next
bo?leneck
is
CPU,
and
current
storage
systems
weren’t

designed
with
CPU
eﬃciency
in
mind.

•  Takeaway
2:
Column
stores
are
feasible
for
random
access

7
©
Cloudera,
Inc.
All
rights
reserved.

•  High
throughput
for
big
scans
(columnar

storage
and
replica@on)

Goal:
Within
2x
of
Parquet

•  Low-‐latency
for
short
accesses
(primary
key

indexes
and
quorum
replica@on)

Goal:
1ms
read/write
on
SSD

•  Database-‐like
seman@cs
(ini@ally
single-‐row

ACID)

•  RelaHonal
data
model

•  SQL
query

•  “NoSQL”
style
scan/insert/update
(Java
client)

Kudu
Design
Goals

8
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
Design
Goals

how effectively primary key filters can be pushed down to Kudu.
What do I use Kudu for?
We talked about how Kudu is made for SQL, allows fast scans, and allows fast mutability at
scale. With that in context, let’s look at the variety of use cases done in Hadoop today and see
where Kudu fits in.

If we look at Kudu in the above figure, we will see that many of the traditional SQL use cases

9
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
Usage

•  Table
has
a
SQL-‐like
schema

• Finite
number
of
columns
(unlike
HBase/Cassandra)

• Types:
BOOL,
INT8,
INT16,
INT32,
INT64,
FLOAT,
DOUBLE,
STRING,
BINARY,

TIMESTAMP

• Some
subset
of
columns
makes
up
a
possibly-‐composite
primary
key

• Fast
ALTER
TABLE

•  Java
and
C++
“NoSQL”
style
APIs

• Insert(),
Update(),
Delete(),
Scan()

•  Integra@ons
with
MapReduce,
Spark,
and
Impala

• more
to
come!

9

10
©
Cloudera,
Inc.
All
rights
reserved.

Use
cases
and
architectures

11
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
Use
Cases

Kudu
is
best
for
use
cases
requiring
a
simultaneous
combinaHon
of

sequenHal
and
random
reads
and
writes

● Time
Series

○  Examples:
Stream
market
data;
fraud
detec@on
&
preven@on;
risk
monitoring

○  Workload:
Insert,
updates,
scans,
lookups

● Machine
Data
AnalyHcs

○  Examples:
Network
threat
detec@on

○  Workload:
Inserts,
scans,
lookups

● Online
ReporHng

○  Examples:
ODS

○  Workload:
Inserts,
updates,
scans,
lookups

12
©
Cloudera,
Inc.
All
rights
reserved.

Real-‐Time
Analy@cs
in
Hadoop
Today

Fraud
Detec@on
in
the
Real
World
=
Storage
Complexity

ConsideraHons:

●  How
do
I
handle
failure

during
this
process?

●  How
oten
do
I
reorganize

data
streaming
in
into
a

format
appropriate
for

repor@ng?

●  When
repor@ng,
how
do
I
see

data
that
has
not
yet
been

reorganized?

●  How
do
I
ensure
that

important
jobs
aren’t

interrupted
by
maintenance?

New
Par@@on

Most
Recent
Par@@on

Historic
Data

HBase

Parquet

File

Have
we

accumulated

enough
data?

Reorganize

HBase
file

into
Parquet

•  Wait
for
running
opera@ons
to
complete

•  Define
new
Impala
par@@on
referencing

the
newly
wriwen
Parquet
file

Incoming
Data

(Messaging

System)

Repor@ng

Request

Impala
on
HDFS

13
©
Cloudera,
Inc.
All
rights
reserved.

Real-‐Time
Analy@cs
in
Hadoop
with
Kudu

Improvements:

●  One
system
to
operate

●  No
cron
jobs
or
background

processes

●  Handle
late
arrivals
or
data

correcHons
with
ease

●  New
data
available

immediately
for
analyHcs
or

operaHons

Historical
and
Real-‐@me

Data

Incoming
Data

(Messaging

System)

Repor@ng

Request

Storage
in
Kudu

14
©
Cloudera,
Inc.
All
rights
reserved.

How
it
works

Replica@on
and
distribu@on

14

15
©
Cloudera,
Inc.
All
rights
reserved.

Tables
and
Tablets

•  Table
is
horizontally
parHHoned
into
tablets

• Range
or
hash
par@@oning

• PRIMARY KEY (host, metric, timestamp) DISTRIBUTE BY
HASH(timestamp) INTO 100 BUCKETS
•  Each
tablet
has
N
replicas
(3
or
5),
with
RaX
consensus

• Allow
read
from
any
replica,
plus
leader-‐driven
writes
with
low
MTTR

•  Tablet
servers
host
tablets

• Store
data
on
local
disks
(no
HDFS)

15

16
©
Cloudera,
Inc.
All
rights
reserved.

Metadata

•  Replicated
master*

• Acts
as
a
tablet
directory
(“META”
table)

• Acts
as
a
catalog
(table
schemas,
etc)

• Acts
as
a
load
balancer
(tracks
TS
liveness,
re-‐replicates
under-‐replicated

tablets)

•  Caches
all
metadata
in
RAM
for
high
performance

• 80-‐node
load
test,
GetTableLoca@ons
RPC
perf:

•  99th
percen@le:
68us,

99.99th
percen@le:
657us

•  <2%
peak
CPU
usage

•  Client
conﬁgured
with
master
addresses

• Asks
master
for
tablet
loca@ons
as
needed
and
caches
them

16

17
©
Cloudera,
Inc.
All
rights
reserved.

18
©
Cloudera,
Inc.
All
rights
reserved.

Rat
consensus

18

TS
A

Tablet
1

(LEADER)

Client

TS
B

Tablet
1

(FOLLOWER)

TS
C

Tablet
1

(FOLLOWER)

WAL

WAL
WAL

2b.
Leader
writes
local
WAL

1a.
Client-‐>Leader:
Write()
RPC

2a.
Leader-‐>Followers:

UpdateConsensus()
RPC

3.
Follower:
write
WAL

4.
Follower-‐>Leader:
success

3.
Follower:
write
WAL

5.
Leader
has
achieved
majority

6.
Leader-‐>Client:
Success!

19
©
Cloudera,
Inc.
All
rights
reserved.

Fault
tolerance

•  Transient
FOLLOWER
failure:

• Leader
can
s@ll
achieve
majority

• Restart
follower
TS
within
5
min
and
it
will
rejoin
transparently

•  Transient
LEADER
failure:

• Followers
expect
to
hear
a
heartbeat
from
their
leader
every
1.5
seconds

• 3
missed
heartbeats:
leader
elec@on!

•  New
LEADER
is
elected
from
remaining
nodes
within
a
few
seconds

• Restart
within
5
min
and
it
rejoins
as
a
FOLLOWER

•  N
replicas
handle
(N-‐1)/2
failures

19

20
©
Cloudera,
Inc.
All
rights
reserved.

Fault
tolerance
(2)

•  Permanent
failure:

• Leader
no@ces
that
a
follower
has
been
dead
for
5
minutes

• Evicts
that
follower

• Master
selects
a
new
replica

• Leader
copies
the
data
over
to
the
new
one,
which
joins
as
a
new
FOLLOWER

20

21
©
Cloudera,
Inc.
All
rights
reserved.

How
it
works

Storage
engine
internals

21

22
©
Cloudera,
Inc.
All
rights
reserved.

Tablet
design

•  Inserts
buﬀered
in
an
in-‐memory
store
(like
HBase’s
memstore)

•  Flushed
to
disk

• Columnar
layout,
similar
to
Apache
Parquet

•  Updates
use
MVCC
(updates
tagged
with
@mestamp,
not
in-‐place)

• Allow
“SELECT
AS
OF
<@mestamp>”
queries
and
consistent
cross-‐tablet
scans

•  Near-‐op@mal
read
path
for
“current
@me”
scans

• No
per
row
branches,
fast
vectorized
decoding
and
predicate
evalua@on

•  Performance
worsens
based
on
number
of
recent
updates

22

23
©
Cloudera,
Inc.
All
rights
reserved.

LSM
vs
Kudu

•  LSM
–
Log
Structured
Merge
(Cassandra,
HBase,
etc)

• Inserts
and
updates
all
go
to
an
in-‐memory
map
(MemStore)
and
later
flush
to

on-‐disk
files
(HFile/SSTable)

• Reads
perform
an
on-‐the-‐fly
merge
of
all
on-‐disk
HFiles

•  Kudu

• Shares
some
traits
(memstores,
compac@ons)

• More
complex.

• Slower
writes
in
exchange
for
faster
reads
(especially
scans)

23

24
©
Cloudera,
Inc.
All
rights
reserved.

LSM
Insert
Path

24

MemStore

INSERT

Row=r1
col=c1
val=“blah”

Row=r1
col=c2
val=“1”

HFile
1

Row=r1
col=c1
val=“blah”

Row=r1
col=c2
val=“1”

ﬂush

25
©
Cloudera,
Inc.
All
rights
reserved.

LSM
Insert
Path

25

MemStore

INSERT

Row=r1
col=c1
val=“blah”

Row=r1
col=c2
val=“2”

HFile
2

Row=r2
col=c1
val=“blah2”

Row=r2
col=c2
val=“2”

ﬂush

HFile
1

Row=r1
col=c1
val=“blah”

Row=r1
col=c2
val=“1”

26
©
Cloudera,
Inc.
All
rights
reserved.

LSM
Update
path

26

MemStore

UPDATE

HFile
1

Row=r1
col=c1
val=“blah”

Row=r1
col=c2
val=“2”

HFile
2

Row=r2
col=c1
val=“v2”

Row=r2
col=c2
val=“5”

Row=r2
col=c1
val=“newval”

Note:
all
updates
are
“fully

decoupled”
from
reads.
Random-‐
write
workload
is
transformed
to

fully
sequen@al!

27
©
Cloudera,
Inc.
All
rights
reserved.

LSM
Read
path

27

MemStore

HFile
1

Row=r1
col=c1
val=“blah”

Row=r1
col=c2
val=“2”

HFile
2

Row=r2
col=c1
val=“v2”

Row=r2
col=c2
val=“5”

Row=r2
col=c1
val=“newval”

Merge
based
on
string
row

keys

R1:
c1=blah
c2=2

R2:
c1=newval
c2=5

….

CPU
intensive!

Must
always
read

rowkeys

Any
given
row
may
exist
across

mul@ple
HFiles:
must
always

merge!

The
more
HFiles
to
merge,
the

slower
it
reads

28
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
storage
–
Inserts
and
Flushes

28

MemRowSet

INSERT
(“todd”,

“$1000”,”engineer”)

name
pay
role

DiskRowSet
1

ﬂush

29
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
storage
–
Inserts
and
Flushes

29

MemRowSet

name
pay
role

DiskRowSet
1

name
pay
role

DiskRowSet
2

INSERT
(“doug”,
“$1B”,
“Hadoop
man”)

ﬂush

30
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
storage
-‐
Updates

30

MemRowSet

name
pay
role

DiskRowSet
1

name
pay
role

DiskRowSet
2

Delta
MS

Delta
MS

Each
DiskRowSet
has
its
own

DeltaMemStore
to

accumulate
updates

base
data

base
data

31
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
storage
-‐
Updates

31

MemRowSet

name
pay
role

DiskRowSet
1

name
pay
role

DiskRowSet
2

Delta
MS

Delta
MS

UPDATE
set
pay=“$1M”

WHERE
name=“todd”

Is
the
row
in
DiskRowSet
2?

(check
bloom
filters)

Is
the
row
in
DiskRowSet
1?

(check
bloom
filters)

Bloom
says:
no!

Bloom
says:
maybe!

Search
key
column
to
find

offset:
rowid
=
150

150:
col
1=$1M

base
data

32
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
storage
–
Read
path

32

MemRowSet

name
pay
role

DiskRowSet
1

name
pay
role

DiskRowSet
2

Delta
MS

Delta
MS

150:
pay=$1M

Read
rows
in
DiskRowSet
2

Then,
read
rows
in

DiskRowSet
1

Any
row
is
only
in
exactly
one

DiskRowSet–
no
need
to
merge
cross-‐
DRS!

Updates
are
merged
based
on
ordinal

oﬀset
within
DRS:
array
indexing,
no

string
compares

base
data

base
data

33
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
storage
–
Delta
ﬂushes

33

MemRowSet

name
pay
role

DiskRowSet
1

name
pay
role

DiskRowSet
2

Delta
MS

Delta
MS

0:
pay=foo
REDO
DeltaFile

Flush

A
REDO
delta
indicates
how
to

transform
between
the
‘base

data’
(columnar)
and
a
later
version

base
data

base
data

34
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
storage
–
Major
delta
compac@on

34

name
pay
role

DiskRowSet(pre-‐compac@on)

Delta
MS

REDO
DeltaFile
REDO
DeltaFile
REDO
DeltaFile

Many
deltas
accumulate:
lots
of
delta
applica@on

work
on
reads

name
pay
role

DiskRowSet(post-‐compac@on)

Delta
MS

Unmerged
REDO

deltas
UNDO
deltas

If
a
column
has
few
updates,
doesn’t
need
to
be
re-‐
wriwen:
those
deltas
maintained
in
new
DeltaFile

Merge
updates
for
columns
with
high
update

percentage

base
data

35
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
storage
–
RowSet
Compac@ons

35

DRS
1
(32MB)

[PK=alice],

[PK=joe],

[PK=linda],

[PK=zach]

DRS
2
(32MB)

[PK=bob],

[PK=jon],

[PK=mary]

[PK=zeke]

DRS
3
(32MB)

[PK=carl],

[PK=julie],

[PK=omar]

[PK=zoe]

DRS
4
(32MB)
DRS
5
(32MB)
DRS
6
(32MB)

[alice,
bob,
carl,

joe]

[jon,
julie,
linda,

mary]

[omar,
zach,

zeke,
zoe]

Reorganize
rows
to
avoid
rowsets

with
overlapping
key
ranges

36
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
storage
–
Compac@on
policy

•  Solves
an
op@miza@on
problem
(knapsack
problem)

•  Minimize
“height”
of
rowsets
for
the
average
key
lookup

• Bound
on
number
of
seeks
for
write
or
random-‐read

•  Restrict
total
IO
of
any
compac@on
to
a
budget
(128MB)

• No
long
compacHons,
ever

• No
“minor”
vs
“major”
disHncHon

• Always
be
compac@ng
or
ﬂushing

• Low
IO
priority
maintenance
threads

36

37
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
trade-‐oﬀs

•  Random
updates
will
be
slower

• HBase
model
allows
random
updates
without
incurring
a
disk
seek

• Kudu
requires
a
key
lookup
before
update,
bloom
lookup
before
insert

•  Single-‐row
reads
may
be
slower

• Columnar
design
is
op@mized
for
scans

• Future:
may
introduce
“column
groups”
for
applica@ons
where
single-‐row

access
is
more
important

• Especially
slow
at
reading
a
row
that
has
had
many
recent
updates
(e.g
YCSB

“zipﬁan”)

37

38
©
Cloudera,
Inc.
All
rights
reserved.

Benchmarks

38

39
©
Cloudera,
Inc.
All
rights
reserved.

TPC-‐H
(Analy@cs
benchmark)

•  75TS
+
1
master
cluster

• 12
(spinning)
disk
each,
enough
RAM
to
ﬁt
dataset

• Using
Kudu
0.5.0,
Impala
2.2
with
Kudu
support,
CDH
5.4

• TPC-‐H
Scale
Factor
100
(100GB)

•  Example
query:

•  SELECT n_name, sum(l_extendedprice * (1 - l_discount)) as revenue FROM customer,
orders, lineitem, supplier, nation, region WHERE c_custkey = o_custkey AND
l_orderkey = o_orderkey AND l_suppkey = s_suppkey AND c_nationkey = s_nationkey
AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA'
AND o_orderdate >= date '1994-01-01' AND o_orderdate < '1995-01-01’ GROUP BY
n_name ORDER BY revenue desc;
39

40
©
Cloudera,
Inc.
All
rights
reserved.

-‐
Kudu
outperforms
Parquet
by
31%
(geometric
mean)
for
RAM-‐resident
data

-‐
Parquet
likely
to
outperform
Kudu
for
HDD-‐resident
(larger
IO
requests)

41
©
Cloudera,
Inc.
All
rights
reserved.

What
about
Apache
Phoenix?

•  10
node
cluster
(9
worker,
1
master)

•  HBase
1.0,
Phoenix
4.3

•  TPC-‐H
LINEITEM
table
only
(6B
rows)

41

2152

219

76

131

0.04

1918

13.2

1.7

0.7

0.15

155

9.3

1.4
1.5
1.37

0.01

0.1

1

10

100

1000

10000

Load
TPCH
Q1
COUNT(*)

COUNT(*)

WHERE…

single-‐row

lookup

Time
(sec)

Phoenix

Kudu

Parquet

44
©
Cloudera,
Inc.
All
rights
reserved.

About
Xiaomi

Mobile
Internet
Company
Founded
in
2010
Smartphones SoXware
E-‐commerce
MIUI
Cloud
Services
App
Store/Game
Payment/Finance
…
Smart
Home
Smart
Devices

45
©
Cloudera,
Inc.
All
rights
reserved.

Big
Data
AnalyHcs
Pipeline

Before
Kudu
•  Long
pipeline

high
latency(1
hour
~
1
day),
data
conversion
pains

•  No
ordering

Log
arrival(storage)
order
not
exactly
logical
order

e.g.
read
2-‐3
days
of
log
for
data
in
1
day

46
©
Cloudera,
Inc.
All
rights
reserved.

Big
Data
Analysis
Pipeline

Simpliﬁed
With
Kudu
•  ETL
Pipeline(0~10s
latency)

Apps
that
need
to
prevent
backpressure
or
require
ETL

•  Direct
Pipeline(no
latency)

Apps
that
don’t
require
ETL
and
no
backpressure
issues

OLAP
scan

Side
table
lookup

Result
store

47
©
Cloudera,
Inc.
All
rights
reserved.

Use
Case
1

Mobile
service
monitoring
and
tracing
tool
Requirements

u  High
write
throughput

>5
Billion
records/day
and
growing

u  Query
latest
data
and
quick
response

Iden@fy
and
resolve
issues
quickly

u  Can
search
for
individual
records

Easy
for
troubleshoo@ng

Gather
important
RPC
tracing
events
from

mobile
app
and
backend
service.

Service
monitoring
&
troubleshoo@ng
tool.

48
©
Cloudera,
Inc.
All
rights
reserved.

Use
Case
1:
Benchmark

Environment

u  71
Node
cluster

u  Hardware

CPU:
E5-‐2620
2.1GHz
*
24
core

Memory:
64GB

Network:
1Gb

Disk:
12
HDD

u  Sotware

Hadoop2.6/Impala
2.1/Kudu

Data

u  1
day
of
server
side
tracing
data

~2.6
Billion
rows

~270
bytes/row

17
columns,
5
key
columns

49
©
Cloudera,
Inc.
All
rights
reserved.

Use
Case
1:
Benchmark
Results

1.4

2.0

2.3

3.1

1.3

0.9

1.3

2.8

4.0

5.7

7.5

16.7

Q1
Q2
Q3
Q4
Q5
Q6

kudu

parquet

Total
Time(s)
Throughput(Total)
Throughput(per
node)

Kudu
961.1
2.8M
record/s
39.5k
record/s

Parquet
114.6
23.5M
record/s
331k
records/s

Bulk
load
using
impala
(INSERT
INTO):

Query
latency:

*
HDFS
parquet
ﬁle
replica@on
=
3
,
kudu
table
replica@on
=
3

*
Each
query
run
5
@mes
then
take
average

50
©
Cloudera,
Inc.
All
rights
reserved.

Use
Case
1:
Result
Analysis

u  Lazy
materializa@on

Ideal
for
search
style
query

Q6
returns
only
a
few
records
(of
a
single
user)
with
all
columns

u  Scan
range
pruning
using
primary
index

Predicates
on
primary
key

Q5
only
scans
1
hour
of
data

u  Future
work

Primary
index:
speed-‐up
order
by
and
dis@nct

Hash
Par@@oning:
speed-‐up
count(dis@nct),
no
need
for
global

shuﬄe/merge

51
©
Cloudera,
Inc.
All
rights
reserved.

Use
Case
2

OLAP
PaaS
for
ecosystem
cloud
u  Provide
big
data
service
for
smart
hardware
startups
(Xiaomi’s

ecosystem
members)

u  OLAP
database
with
some
OLTP
features

u  Manage/Ingest/query
your
data
and
serving
results
in
one
place

Backend/Mobile
App/Smart
Device/IoT
…

53
©
Cloudera,
Inc.
All
rights
reserved.

Kudu
is…

• NOT
a
SQL
database

•  “BYO
SQL”

• NOT
a
ﬁlesystem

•  data
must
have
tabular
structure

• NOT
a
replacement
for
HBase
or
HDFS

•  Cloudera
con@nues
to
invest
in
those
systems

•  Many
use
cases
where
they’re
s@ll
more
appropriate

• NOT
an
in-‐memory
database

•  Very
fast
for
memory-‐sized
workloads,
but
can
operate
on
larger
data
too!

53

55
©
Cloudera,
Inc.
All
rights
reserved.

Ge…ng
started
as
a
user

•  hwp://getkudu.io

•  kudu-‐user@googlegroups.com

•  Quickstart
VM

• Easiest
way
to
get
started

• Impala
and
Kudu
in
an
easy-‐to-‐install
VM

•  CSD
and
Parcels

• For
installa@on
on
a
Cloudera
Manager-‐managed
cluster

55

56
©
Cloudera,
Inc.
All
rights
reserved.

Ge…ng
started
as
a
developer

•  hwp://github.com/cloudera/kudu

• All
commits
go
here
ﬁrst

•  Public
gerrit:
hwp://gerrit.cloudera.org

• All
code
reviews
happening
here

•  Public
JIRA:
hwp://issues.cloudera.org

• Includes
bugs
going
back
to
2013.
Come
see
our
dirty
laundry!

•  kudu-‐dev@googlegroups.com

•  Apache
2.0
license
open
source

•  Contribu@ons
are
welcome
and
encouraged!

56

Kudu: A Columnar Store for Fast Analytics on Fast Data

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

En vedette

En vedette (18)

Similaire à Kudu: A Columnar Store for Fast Analytics on Fast Data

Similaire à Kudu: A Columnar Store for Fast Analytics on Fast Data (20)

Plus de Felicia Haggarty

Plus de Felicia Haggarty (6)

Dernier

Dernier (20)

Kudu: A Columnar Store for Fast Analytics on Fast Data