SlideShare a Scribd company logo
1 of 52
Download to read offline
1©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Jean-­‐Daniel	
  Cryans – jdcryans@cloudera.com
@jdcryans
Kudu:	
  Fast	
  Analytics	
  on	
  Fast	
  Data
2©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Why	
  Kudu?
2
3©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Current	
  Storage	
  Landscape	
  in	
  Hadoop	
  Ecosystem
HDFS (GFS)	
  excels	
  at:
• Batch	
  ingest	
  only	
  (eg hourly)
• Efficiently	
  scanning	
  large	
  amounts	
  
of	
  data	
  (analytics)
HBase (BigTable)	
  excels	
  at:
• Efficiently	
  finding	
  and	
  writing	
  
individual	
  rows
• Making	
  data	
  mutable
Gaps	
  exist	
  when	
  these	
  properties	
  
are	
  needed	
  simultaneously
4©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
• High	
  throughput	
  for	
  big	
  scans
Goal: Within	
  2x	
  of	
  Parquet
• Low-­‐latency	
  for	
  short	
  accesses	
  
Goal: 1ms	
  read/write	
  on	
  SSD
• Database-­‐like semantics	
  (initially	
  single-­‐row	
  
ACID)
• Relational	
  data	
  model
• SQL	
  queries	
  are	
  easy
• “NoSQL”	
  style	
  scan/insert/update	
  (Java/C++	
  client)
Kudu	
  Design	
  Goals
5©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Changing	
  Hardware	
  landscape
• Spinning	
  disk	
  -­‐>	
  solid	
  state	
  storage
• NAND	
  flash:	
  Up	
  to	
  450k	
  read	
  250k	
  write	
  iops,	
  about	
  2GB/sec	
  read	
  and	
  
1.5GB/sec	
  write	
  throughput,at	
  a	
  price	
  of	
  less	
  than	
  $3/GB	
  and	
  dropping
• 3D	
  XPoint memory (1000x	
  faster	
  than	
  NAND,	
  cheaper	
  than	
  RAM)
• RAM is	
  cheaper	
  and	
  more	
  abundant:
• 64-­‐>128-­‐>256GB	
  over	
  last	
  few	
  years
• Takeaway:	
  The next	
  bottleneck	
  is	
  CPU,	
  and	
  current	
  storage	
  systems	
  weren’t	
  
designed	
  with	
  CPU	
  efficiency	
  in	
  mind.
6©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
What’s	
  Kudu?
6
7©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Scalable	
  and	
  fast	
  tabular	
  storage
• Scalable
• Tested	
  up	
  to	
  275	
  nodes	
  (~3PB	
  cluster)
• Designed	
  to	
  scale	
  to	
  1000s	
  of	
  nodes,	
  tens	
  of	
  PBs
• Fast
• Millions of	
  read/write	
  operations	
  per	
  second	
  across	
  cluster
• Multiple	
  GB/second	
  read	
  throughput	
  per	
  node
• Tabular
• Store	
  tables	
  like	
  a	
  normal	
  database	
  (can	
  support	
  SQL,	
  Spark,	
  etc)
• Individual	
  record-­‐level	
  access	
  to	
  100+	
  billion	
  row	
  tables	
  (Java/C++/Python	
  APIs)
8©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  Usage
• Table	
  has	
  a	
  SQL-­‐like	
  schema
• Finite	
  number	
  of	
  columns	
  (unlike	
  HBase/Cassandra)
• Types:	
  BOOL,	
  INT8,	
  INT16,	
  INT32,	
  INT64,	
  FLOAT,	
  DOUBLE,	
  STRING,	
  BINARY,	
  
TIMESTAMP
• Some	
  subset	
  of	
  columns	
  makes	
  up	
  a	
  possibly-­‐composite	
  primary	
  key
• Fast	
  ALTER	
  TABLE
• Java	
  and	
  C++	
  “NoSQL”	
  style	
  APIs
• Insert(),	
  Update(),	
  Delete(),	
  Scan()
• Integrations	
  with	
  MapReduce,	
  Spark,	
  and	
  Impala
• Apache	
  Drill	
  work-­‐in-­‐progress
8
9©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Use	
  cases	
  and	
  architectures
10©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  Use	
  Cases
Kudu	
  is	
  best	
  for	
  use	
  cases	
  requiring	
  a	
  simultaneous	
  combination	
  of
sequential	
  and	
  random	
  reads	
  and	
  writes
● Time	
  Series
○ Examples:	
  Stream	
  market	
  data;	
  fraud	
  detection	
  &	
  prevention;	
  risk	
  monitoring
○ Workload:	
  Insert,	
  updates,	
  scans,	
  lookups
● Machine	
  Data	
  Analytics
○ Examples:	
  Network	
  threat	
  detection
○ Workload:	
  Inserts,	
  scans,	
  lookups
● Online	
  Reporting
○ Examples:	
  ODS
○ Workload:	
  Inserts,	
  updates,	
  scans,	
  lookups
11©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Real-­‐Time	
  Analytics	
  in	
  Hadoop	
  Today
Fraud	
  Detection	
  in	
  the	
  Real	
  World	
  =	
  Storage	
  Complexity
Considerations:
● How	
  do	
  I	
  handle	
  failure	
  
during	
   this	
  process?
● How	
  often	
  do	
  I	
  reorganize	
  
data	
  streaming	
  in	
  into	
  a	
  
format	
  appropriate	
  for	
  
reporting?
● When	
  reporting,	
   how	
  do	
  I	
  see	
  
data	
  that	
  has	
  not	
  yet	
  been	
  
reorganized?
● How	
  do	
  I	
  ensure	
  that	
  
important	
  jobs	
  aren’t	
  
interrupted	
  by	
  maintenance?
New	
  Partition
Most	
  Recent	
  Partition
Historic	
  Data
HBase
Parquet	
  
File
Have	
  we	
  
accumulated	
  
enough	
  data?
Reorganize	
  
HBase	
  file	
  
into	
  Parquet
• Wait	
  for	
  running	
  operations	
  to	
  complete	
  
• Define	
  new	
  Impala	
  partition	
  referencing	
  
the	
  newly	
  written	
  Parquet	
  file
Incoming	
  Data	
  
(Messaging	
  
System)
Reporting	
  
Request
Impala	
  on	
  HDFS
12©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Real-­‐Time	
  Analytics	
  in	
  Hadoop	
  with	
  Kudu
Improvements:
● One	
  system to	
  operate
● No	
  cron	
  jobs	
  or	
  background	
  
processes
● Handle	
  late	
  arrivals	
  or	
  data	
  
corrections	
  with	
  ease
● New	
  data	
  available	
  
immediately	
  for	
  analytics	
  or	
  
operations	
  
Historical	
  and	
  Real-­‐time
Data
Incoming	
  Data	
  
(Messaging	
  
System)
Reporting	
  
Request
Storage	
  in	
  Kudu
13©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Xiaomi	
  use	
  case
• World’s	
  4th largest	
  smart-­‐phone	
  maker	
  (most	
  popular	
  in	
  China)
• Gather	
  important	
  RPC	
  tracing	
  events	
  from	
  mobile	
  app	
  and	
  backend	
  service.	
  
• Service	
  monitoring	
  &	
  troubleshooting	
  tool.
u High	
  write	
  throughput
• >5	
  Billion	
  records/day	
  and	
  growing
u Query	
  latest	
  data	
  and	
  quick	
  response
• Identify	
  and	
  resolve	
  issues	
  quickly
u Can	
  search	
  for	
  individual	
  records
• Easy	
  for	
  troubleshooting
14©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Xiaomi	
  Big	
  Data	
  Analytics	
  Pipeline
Before	
  Kudu
• Long	
  pipeline
high	
  latency(1	
  hour	
  ~	
  1	
  day),	
  data	
  conversion	
  pains
• No	
  ordering
Log	
  arrival(storage)	
  order	
  not	
  exactly	
  logical	
  order
e.g.	
  read	
  2-­‐3	
  days	
  of	
  log	
  for	
  data	
  in	
  1	
  day
15©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Xiaomi	
  Big	
  Data	
  Analysis	
  Pipeline
Simplified	
  With	
  Kudu
• ETL	
  Pipeline(0~10s	
  latency)
Apps	
  that	
  need	
  to	
  prevent	
  backpressure	
  or	
  require	
  ETL	
  
• Direct	
  Pipeline(no	
  latency)
Apps	
  that	
  don’t	
  require	
  ETL	
  and	
  no	
  backpressure	
  issues
OLAP	
  scan
Side	
  table	
  lookup
Result	
  store
16©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
How	
  it	
  works
Replication	
  and	
  fault	
  tolerance
16
17©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Tables	
  and	
  Tablets
• Table	
  is	
  horizontally	
  partitioned	
  into	
  tablets
• Range or	
  hash partitioning
• PRIMARY KEY (host, metric, timestamp) DISTRIBUTE BY
HASH(timestamp) INTO 100 BUCKETS
• bucketNumber = hashCode(row[‘timestamp’]) % 100
• Each	
  tablet	
  has	
  N	
  replicas	
  (3	
  or	
  5),	
  with	
  Raft consensus
• Tablet	
  servers	
  host	
  tablets	
  on	
  local	
  disk	
  drives
17
18©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Metadata
• Replicated	
  master
• Acts	
  as	
  a	
  tablet	
  directory
• Acts	
  as	
  a	
  catalog	
  (which	
  tables	
  exist,	
  etc)
• Acts	
  as	
  a	
  load	
  balancer	
  (tracks	
  TS	
  liveness,	
  re-­‐replicates	
  under-­‐replicated	
  
tablets)
• Caches	
  all	
  metadata in	
  RAM	
  for	
  high	
  performance
• Client	
  configured	
  with	
  master	
  addresses
• Asks	
  master	
  for	
  tablet	
  locations	
  as	
  needed	
  and	
  caches	
  them
18
19©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Client
Hey	
  Master!	
  Where	
  is	
  the	
  row	
  for	
  
‘tlipcon’	
  in	
  table	
  “T”?
It’s	
  part	
  of	
  tablet	
  2,	
  which	
  is	
  on	
  servers	
  {Z,Y,X}.	
  
BTW,	
  here’s	
  info	
  on	
  other	
  tablets	
  you	
  might	
  
care	
  about:	
  T1,	
  T2,	
  T3,	
  …
UPDATE	
  tlipcon
SET	
  col=foo
Meta	
  Cache
T1:	
  …
T2:	
  …
T3:	
  …
20©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Raft	
  consensus
20
TS	
  A
Tablet	
  1
(LEADER)
Client
TS	
  B
Tablet	
  1
(FOLLOWER)
TS	
  C
Tablet	
  1
(FOLLOWER)
WAL
WALWAL
2b.	
  Leader	
  writes	
  local	
  WAL
1a.	
  Client-­‐>Leader:	
  Write()	
  RPC
2a.	
  Leader-­‐>Followers:	
  
UpdateConsensus()	
  RPC
3.	
  Follower:	
  write	
  WAL
4.	
  Follower-­‐>Leader:	
  success
3.	
  Follower:	
  write	
  WAL
5.	
  Leader	
  has	
  achieved	
  majority
6.	
  Leader-­‐>Client:	
  Success!
21©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Fault	
  tolerance
• Transient	
  failures:
• Follower:	
  Leader	
  can	
  still	
  achieve	
  majority
• Leader:	
  automatic	
  leader	
  election	
  (~5	
  seconds)
• Restart	
  dead	
  TS	
  within	
  5	
  min	
  and	
  it	
  will	
  rejoin	
  transparently
• Permanent	
  failures
• After	
  5	
  minutes,	
  automatically	
  creates	
  a	
  new	
  follower	
  replica	
  and	
  copies	
  data
• N	
  replicas	
  handle	
  (N-­‐1)/2	
  failures
21
22©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
How	
  it	
  works
Columnar	
  storage
22
23©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Columnar	
  storage
{25059873,	
  
22309487,	
  
23059861,	
  
23010982}
Tweet_id
{newsycbot,	
  
RideImpala,	
  
fastly,	
  
llvmorg}
User_name
{1442865158,	
  
1442828307,	
  
1442865156,	
  
1442865155}
Created_at
{Visual	
  exp…,	
  
Introducing	
  ..,	
  
Missing	
  July…,	
  
LLVM	
  3.7….}
text
24©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Columnar	
  storage
{25059873,	
  
22309487,	
  
23059861,	
  
23010982}
Tweet_id
{newsycbot,	
  
RideImpala,	
  
fastly,	
  
llvmorg}
User_name
{1442865158,	
  
1442828307,	
  
1442865156,	
  
1442865155}
Created_at
{Visual	
  exp…,	
  
Introducing	
  ..,	
  
Missing	
  July…,	
  
LLVM	
  3.7….}
text
SELECT	
  COUNT(*)	
  FROM	
  tweets	
  WHERE	
  user_name =	
  ‘newsycbot’;
Only	
  read	
  1	
  column	
  
1GB 2GB 1GB 200GB
25©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Columnar	
  compression
{1442865158,	
  
1442828307,	
  
1442865156,	
  
1442865155}
Created_at
Created_at Diff(created_at)
1442865158 n/a
1442828307 -­‐36851
1442865156 36849
1442865155 -­‐1
64	
  bits	
  each 17	
  bits	
  each
• Many	
  columns	
  can	
  compress	
  to
a	
  few	
  bits	
  per	
  row!
• Especially:
• Timestamps
• Time	
  series	
  values
• Low-­‐cardinality	
  strings
• Massive	
  space	
  savings	
  and
throughput	
  increase!
26©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Handling	
  inserts	
  and	
  updates
• Inserts	
  go	
  to	
  an	
  in-­‐memory	
  row	
  store	
  (MemRowSet)
• Durable	
  due	
  to	
  write-­‐ahead	
  logging
• Later	
  flush	
  to	
  columnar	
  format	
  on	
  disk
• Updates	
  go	
  to	
  in-­‐memory	
  “delta	
  store”
• Later	
  flush	
  to	
  “delta	
  files”	
  on	
  disk
• Eventually	
  “compact”	
  into	
  the	
  previously-­‐written	
  columnar	
  data	
  files
• Details	
  elided	
  here	
  due	
  to	
  time	
  constraints
• available	
  in	
  other	
  slide	
  decks	
  online,	
  or	
  come	
  find	
  me	
  later	
  to	
  learn	
  more!
27©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Integrations
28©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Spark	
  DataSource integration	
  (WIP)
sqlContext.load("org.kududb.spark",
Map("kudu.table" -> “foo”,
"kudu.master" -> “master.example.com”))
.registerTempTable(“mytable”)
df = sqlContext.sql(
“select col_a, col_b from mytable “ +
“where col_c = 123”)
https://github.com/tmalaska/SparkOnKudu
29©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Impala	
  integration
• CREATE TABLE … DISTRIBUTE BY HASH(col1) INTO 16
BUCKETS AS SELECT … FROM …
• INSERT/UPDATE/DELETE
• Optimizations  like  predicate  pushdown,  scan  parallelism,  more  on  the  
way
• Not  an  Impala  user?  Apache  Drill  integration  in  the  works  by  Dremio
30©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
MapReduce	
  integration
• Most	
  Kudu	
  integration/correctness	
  testing	
  via	
  MapReduce
• Multi-­‐framework	
  cluster	
  (MR	
  +	
  HDFS	
  +	
  Kudu	
  on	
  the	
  same	
  disks)
• KuduTableInputFormat /	
  KuduTableOutputFormat
•Support	
  for	
  pushing	
  predicates,	
  column	
  projections,	
  etc
31©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Performance
31
32©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
TPC-­‐H	
  (Analytics	
  benchmark)
• 75	
  server	
  cluster
• 12	
  (spinning)	
  disk	
  each,	
  enough	
  RAM	
  to	
  fit	
  dataset
• TPC-­‐H	
  Scale	
  Factor	
  100	
  (100GB)
• Example	
  query:
• SELECT n_name, sum(l_extendedprice * (1 - l_discount)) as revenue FROM customer,
orders, lineitem, supplier, nation, region WHERE c_custkey = o_custkey AND
l_orderkey = o_orderkey AND l_suppkey = s_suppkey AND c_nationkey = s_nationkey
AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA'
AND o_orderdate >= date '1994-01-01' AND o_orderdate < '1995-01-01’ GROUP BY
n_name ORDER BY revenue desc;
32
33©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
-­‐ Kudu	
  outperforms	
   Parquet	
  by	
  31%	
  (geometric	
  mean)	
  for	
  RAM-­‐resident	
  data
34©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Versus	
  other	
  NoSQL	
  storage
• Phoenix:	
  SQL	
  layer	
  on	
  HBase
• 10	
  node	
  cluster	
  (9	
  worker,	
  1	
  master)
• TPC-­‐H	
  LINEITEM	
  table	
  only	
  (6B	
  rows)
34
2152
219
76
131
0.04
1918
13.2
1.7
0.7
0.15
155
9.3
1.4 1.5 1.37
0.01
0.1
1
10
100
1000
10000
Load TPCH	
  Q1 COUNT(*)
COUNT(*)
WHERE…
single-­‐row
lookup
Time	
  (sec)
Phoenix
Kudu
Parquet
35©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Xiaomi	
  benchmark
• 6	
  real	
  queries	
  from	
  application	
  trace	
  analysis	
  application
• Q1:	
  SELECT	
  COUNT(*)
• Q2:	
  SELECT	
  hour,	
  COUNT(*)	
  WHERE	
  module	
  =	
  ‘foo’	
  GROUP	
  BY	
  HOUR
• Q3:	
  SELECT	
  hour,	
  COUNT(DISTINCT	
  uid)	
  WHERE	
  module	
  =	
  ‘foo’	
  AND	
  app=‘bar’	
  
GROUP	
  BY	
  HOUR
• Q4:	
  analytics	
  on	
  RPC	
  success	
  rate	
  over	
  all	
  data	
  for	
  one	
  app
• Q5:	
  same	
  as	
  Q4,	
  but	
  filter	
  by	
  time	
  range
• Q6:	
  SELECT	
  *	
  WHERE	
  app	
  =	
  …	
  AND	
  uid =	
  …	
  ORDER	
  BY	
  ts LIMIT	
  30	
  OFFSET	
  30
36©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Xiaomi:	
  Benchmark	
  Results
1.4	
   2.0	
   2.3	
  
3.1	
  
1.3	
   0.9	
  1.3	
  
2.8	
  
4.0	
  
5.7	
  
7.5	
  
16.7	
  
Q1 Q2 Q3 Q4 Q5 Q6
kudu
parquet
Query	
  latency:
*	
  HDFS	
  parquet	
  file	
  replication	
  =	
  3	
  ,	
  kudu	
  table	
  replication	
  =	
  3
*	
  Each	
  query	
  run	
  5	
  times	
  then	
  take	
  average
37©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
What	
  about	
  NoSQL-­‐style	
  random	
  access?	
  (YCSB)
• YCSB 0.5.0-­‐snapshot
• 10	
  node	
  cluster
(9	
  worker,	
  1	
  master)
• 100M	
  row	
  data	
  set
• 10M	
  operations	
  each
workload
37
38©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Getting	
  started
38
39©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Project	
  status
• Open	
  source	
  beta	
  released	
  in	
  September	
  2015
• First	
  update	
  version	
  0.6.0	
  released	
  last	
  month
• Usable	
  for	
  many	
  applications
• Have	
  not	
  experienced	
  data	
  loss,	
  reasonably	
  stable	
  (almost	
  no	
  crashes	
  reported)
• Still	
  requires	
  some	
  expert	
  assistance,	
  and	
  you’ll	
  probably	
  find	
  some	
  bugs
• Officially	
  now	
  known	
  as	
  Apache	
  Kudu(incubating)!
40©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  Community
Your	
  company	
  here!
41©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Getting	
  started	
  as	
  a	
  user
• http://getkudu.io
• kudu-­‐user@googlegroups.com
• http://getkudu-­‐slack.herokuapp.com/
• Quickstart VM
• Easiest	
  way	
  to	
  get	
  started
• Impala	
  and	
  Kudu	
  in	
  an	
  easy-­‐to-­‐install	
  VM
• CSD	
  and	
  Parcels
• For	
  installation	
  on	
  a	
  Cloudera	
  Manager-­‐managed	
  cluster
41
42©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Getting	
  started	
  as	
  a	
  developer
• http://github.com/cloudera/kudu
• All	
  commits	
  go	
  here	
  first
• Public	
  gerrit:	
  http://gerrit.cloudera.org
• All	
  code	
  reviews	
  happening	
  here
• Public	
  JIRA:	
  http://issues.cloudera.org
• Includes	
  bugs	
  going	
  back	
  to	
  2013.	
  Come	
  see	
  our	
  dirty	
  laundry!
• kudu-­‐dev@googlegroups.com
• Apache	
  2.0	
  license	
  open	
  source
• Contributions	
  are	
  welcome	
  and	
  encouraged!
42
43©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
http://getkudu.io/
@apachekudu
44©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
How	
  it	
  works
Write	
  and	
  read	
  paths
44
45©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  storage	
  – Inserts	
  and	
  Flushes
45
MemRowSet
INSERT(“todd”,	
  “$1000”,”engineer”)
name pay role
DiskRowSet 1
flush
46©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  storage	
  – Inserts	
  and	
  Flushes
46
MemRowSet
name pay role
DiskRowSet 1
name pay role
DiskRowSet 2
INSERT(“doug”,	
  “$1B”,	
  “Hadoop	
  man”)
flush
47©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  storage	
  -­‐ Updates
47
MemRowSet
name pay role
DiskRowSet 1
name pay role
DiskRowSet 2
Delta	
  MS
Delta	
  MS
Each	
  DiskRowSet has	
  its	
  own	
  
DeltaMemStore to	
  
accumulate	
  updates
base	
  data
base	
  data
48©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  storage	
  -­‐ Updates
48
MemRowSet
name pay role
DiskRowSet 1
name pay role
DiskRowSet 2
Delta	
  MS
Delta	
  MS
UPDATE	
  set	
  pay=“$1M”	
  
WHERE	
  name=“todd”
Is	
  the	
  row	
  in	
  DiskRowSet 2?
(check	
  bloom	
  filters)
Is	
  the	
  row	
  in	
  DiskRowSet 1?
(check	
  bloom	
  filters)
Bloom	
  says:	
  no!
Bloom	
  says:	
  maybe!
Search	
  key	
  column	
  to	
  find	
  
offset:	
  rowid =	
  150
150:	
  pay=$1M
@	
  time	
  T
base	
  data
49©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  storage	
  – Read	
  path
49
MemRowSet
name pay role
DiskRowSet 1
name pay role
DiskRowSet 2
Delta	
  MS
Delta	
  MS
150:	
  pay=$1M	
  
@	
  time	
  T
Read	
  rows	
  in	
  DiskRowSet 2
Then,	
  read	
  rows	
  in	
  
DiskRowSet 1
Updates	
  are	
  applied	
  based	
  on	
  ordinal	
  
offset	
  within	
  DRS:	
  array	
  indexing	
  =	
  fast
base	
  data
base	
  data
50©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  storage	
  – Delta	
  flushes
50
MemRowSet
name pay role
DiskRowSet 1
name pay role
DiskRowSet 2
Delta	
  MS
Delta	
  MS
150:	
  pay=$1M	
  
@	
  time	
  T
REDO	
  DeltaFile
Flush
A	
  REDO delta	
  indicates	
  how	
  to	
  
transform	
  between	
  the	
  ‘base	
  data’	
  
(columnar)	
  and	
  a	
  later	
  version
base	
  data
base	
  data
51©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  storage	
  – Major	
  delta	
  compaction
51
name pay role
DiskRowSet(pre-­‐compaction)
Delta	
  MS
REDO	
  DeltaFile REDO	
  DeltaFile REDO	
  DeltaFile
Many	
  deltas	
  accumulate:	
  lots	
  of	
  delta	
  application	
  
work	
  on	
  reads
Merge	
  updates	
  for	
  columns	
  with	
  high	
  update	
  
percentage
name pay role
DiskRowSet(post-­‐compaction)
Delta	
  MS
Unmerged	
  REDO	
  
deltasUNDO	
  deltas
base	
  data
If	
  a	
  column	
  has	
  few	
  updates,	
  doesn’t	
  need	
  to	
  be	
  re-­‐
written:	
  those	
  deltas	
  maintained	
  in	
  new	
  DeltaFile
52©	
  Cloudera,	
  Inc.	
  All	
  rights	
  reserved.
Kudu	
  storage	
  – RowSet Compactions
DRS	
  1	
  (32MB)
[PK=alice],	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  [PK=joe],	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  [PK=linda],	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  [PK=zach]
DRS	
  2	
  (32MB)
[PK=bob],	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  [PK=jon],	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  [PK=mary]	
  	
  	
  	
  	
  	
  	
  	
  	
  [PK=zeke]
DRS	
  3 (32MB)
[PK=carl],	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  [PK=julie],	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  [PK=omar]	
  	
  	
  	
  	
  	
  	
  	
  [PK=zoe]
DRS	
  4	
  (32MB) DRS	
  5	
  (32MB) DRS	
  6	
  (32MB)
[alice,	
  bob,	
  carl,	
  
joe]
[jon,	
  julie,	
  linda,	
  
mary]
[omar,	
  zach,	
  zeke,	
  
zoe]
Reorganize	
  rows	
  to	
  avoid	
  rowsets
with	
  overlapping	
  key	
  ranges

More Related Content

What's hot

Kudu: New Hadoop Storage for Fast Analytics on Fast Data
Kudu: New Hadoop Storage for Fast Analytics on Fast DataKudu: New Hadoop Storage for Fast Analytics on Fast Data
Kudu: New Hadoop Storage for Fast Analytics on Fast DataCloudera, Inc.
 
Kudu: Resolving Transactional and Analytic Trade-offs in Hadoop
Kudu: Resolving Transactional and Analytic Trade-offs in HadoopKudu: Resolving Transactional and Analytic Trade-offs in Hadoop
Kudu: Resolving Transactional and Analytic Trade-offs in Hadoopjdcryans
 
Kudu - Fast Analytics on Fast Data
Kudu - Fast Analytics on Fast DataKudu - Fast Analytics on Fast Data
Kudu - Fast Analytics on Fast DataRyan Bosshart
 
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platformcloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data PlatformRakuten Group, Inc.
 
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Data Con LA
 
A brave new world in mutable big data relational storage (Strata NYC 2017)
A brave new world in mutable big data  relational storage (Strata NYC 2017)A brave new world in mutable big data  relational storage (Strata NYC 2017)
A brave new world in mutable big data relational storage (Strata NYC 2017)Todd Lipcon
 
Introducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing MeetupIntroducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing MeetupCaserta
 
High concurrency,
Low latency analytics
using Spark/Kudu
 High concurrency,
Low latency analytics
using Spark/Kudu High concurrency,
Low latency analytics
using Spark/Kudu
High concurrency,
Low latency analytics
using Spark/KuduChris George
 
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...Cloudera, Inc.
 
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and KuduBuilding Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and KuduJeremy Beard
 
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...Data Con LA
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impalamarkgrover
 
Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Cloudera, Inc.
 
Impala 2.0 - The Best Analytic Database for Hadoop
Impala 2.0 - The Best Analytic Database for HadoopImpala 2.0 - The Best Analytic Database for Hadoop
Impala 2.0 - The Best Analytic Database for HadoopCloudera, Inc.
 
How Impala Works
How Impala WorksHow Impala Works
How Impala WorksYue Chen
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoopmarkgrover
 
Introduction to Impala
Introduction to ImpalaIntroduction to Impala
Introduction to Impalamarkgrover
 

What's hot (20)

Kudu: New Hadoop Storage for Fast Analytics on Fast Data
Kudu: New Hadoop Storage for Fast Analytics on Fast DataKudu: New Hadoop Storage for Fast Analytics on Fast Data
Kudu: New Hadoop Storage for Fast Analytics on Fast Data
 
Kudu: Resolving Transactional and Analytic Trade-offs in Hadoop
Kudu: Resolving Transactional and Analytic Trade-offs in HadoopKudu: Resolving Transactional and Analytic Trade-offs in Hadoop
Kudu: Resolving Transactional and Analytic Trade-offs in Hadoop
 
Kudu - Fast Analytics on Fast Data
Kudu - Fast Analytics on Fast DataKudu - Fast Analytics on Fast Data
Kudu - Fast Analytics on Fast Data
 
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platformcloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
 
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
 
A brave new world in mutable big data relational storage (Strata NYC 2017)
A brave new world in mutable big data  relational storage (Strata NYC 2017)A brave new world in mutable big data  relational storage (Strata NYC 2017)
A brave new world in mutable big data relational storage (Strata NYC 2017)
 
Introducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing MeetupIntroducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing Meetup
 
High concurrency,
Low latency analytics
using Spark/Kudu
 High concurrency,
Low latency analytics
using Spark/Kudu High concurrency,
Low latency analytics
using Spark/Kudu
High concurrency,
Low latency analytics
using Spark/Kudu
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
 
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
 
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and KuduBuilding Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
 
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
 
Kudu demo
Kudu demoKudu demo
Kudu demo
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Hive vs. Impala
Hive vs. ImpalaHive vs. Impala
Hive vs. Impala
 
Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013Presentations from the Cloudera Impala meetup on Aug 20 2013
Presentations from the Cloudera Impala meetup on Aug 20 2013
 
Impala 2.0 - The Best Analytic Database for Hadoop
Impala 2.0 - The Best Analytic Database for HadoopImpala 2.0 - The Best Analytic Database for Hadoop
Impala 2.0 - The Best Analytic Database for Hadoop
 
How Impala Works
How Impala WorksHow Impala Works
How Impala Works
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoop
 
Introduction to Impala
Introduction to ImpalaIntroduction to Impala
Introduction to Impala
 

Similar to Kudu: Fast Analytics on Fast Data

DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataHakka Labs
 
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Hadoop / Spark Conference Japan
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016StampedeCon
 
Spark Summit EU talk by Mike Percy
Spark Summit EU talk by Mike PercySpark Summit EU talk by Mike Percy
Spark Summit EU talk by Mike PercySpark Summit
 
Impala tech-talk by Dimitris Tsirogiannis
Impala tech-talk by Dimitris TsirogiannisImpala tech-talk by Dimitris Tsirogiannis
Impala tech-talk by Dimitris TsirogiannisFelicia Haggarty
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Cloudera, Inc.
 
Apache Kudu - Updatable Analytical Storage #rakutentech
Apache Kudu - Updatable Analytical Storage #rakutentechApache Kudu - Updatable Analytical Storage #rakutentech
Apache Kudu - Updatable Analytical Storage #rakutentechCloudera Japan
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impalahuguk
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaSwiss Big Data User Group
 
Strata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptxStrata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptxManish Maheshwari
 
Strata London 2019 Scaling Impala
Strata London 2019 Scaling ImpalaStrata London 2019 Scaling Impala
Strata London 2019 Scaling ImpalaManish Maheshwari
 
Cloudera Operational DB (Apache HBase & Apache Phoenix)
Cloudera Operational DB (Apache HBase & Apache Phoenix)Cloudera Operational DB (Apache HBase & Apache Phoenix)
Cloudera Operational DB (Apache HBase & Apache Phoenix)Timothy Spann
 
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...Cloudera, Inc.
 
End to End Streaming Architectures
End to End Streaming ArchitecturesEnd to End Streaming Architectures
End to End Streaming ArchitecturesCloudera, Inc.
 
A5 oracle exadata-the game changer for online transaction processing data w...
A5   oracle exadata-the game changer for online transaction processing data w...A5   oracle exadata-the game changer for online transaction processing data w...
A5 oracle exadata-the game changer for online transaction processing data w...Dr. Wilfred Lin (Ph.D.)
 
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsDesign Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsAshish Mrig
 
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...HostedbyConfluent
 

Similar to Kudu: Fast Analytics on Fast Data (20)

DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast DataDatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
 
Kudu austin oct 2015.pptx
Kudu austin oct 2015.pptxKudu austin oct 2015.pptx
Kudu austin oct 2015.pptx
 
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
 
SFHUG Kudu Talk
SFHUG Kudu TalkSFHUG Kudu Talk
SFHUG Kudu Talk
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016
 
Spark Summit EU talk by Mike Percy
Spark Summit EU talk by Mike PercySpark Summit EU talk by Mike Percy
Spark Summit EU talk by Mike Percy
 
Impala tech-talk by Dimitris Tsirogiannis
Impala tech-talk by Dimitris TsirogiannisImpala tech-talk by Dimitris Tsirogiannis
Impala tech-talk by Dimitris Tsirogiannis
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
Apache Kudu - Updatable Analytical Storage #rakutentech
Apache Kudu - Updatable Analytical Storage #rakutentechApache Kudu - Updatable Analytical Storage #rakutentech
Apache Kudu - Updatable Analytical Storage #rakutentech
 
What's New in Apache Hive
What's New in Apache HiveWhat's New in Apache Hive
What's New in Apache Hive
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impala
 
Building a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with ImpalaBuilding a Hadoop Data Warehouse with Impala
Building a Hadoop Data Warehouse with Impala
 
Strata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptxStrata London 2019 Scaling Impala.pptx
Strata London 2019 Scaling Impala.pptx
 
Strata London 2019 Scaling Impala
Strata London 2019 Scaling ImpalaStrata London 2019 Scaling Impala
Strata London 2019 Scaling Impala
 
Cloudera Operational DB (Apache HBase & Apache Phoenix)
Cloudera Operational DB (Apache HBase & Apache Phoenix)Cloudera Operational DB (Apache HBase & Apache Phoenix)
Cloudera Operational DB (Apache HBase & Apache Phoenix)
 
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
 
End to End Streaming Architectures
End to End Streaming ArchitecturesEnd to End Streaming Architectures
End to End Streaming Architectures
 
A5 oracle exadata-the game changer for online transaction processing data w...
A5   oracle exadata-the game changer for online transaction processing data w...A5   oracle exadata-the game changer for online transaction processing data w...
A5 oracle exadata-the game changer for online transaction processing data w...
 
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsDesign Choices for Cloud Data Platforms
Design Choices for Cloud Data Platforms
 
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
 

Recently uploaded

Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Christo Ananth
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01KreezheaRecto
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...tanu pandey
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdfKamal Acharya
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXssuser89054b
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . pptDineshKumar4165
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringmulugeta48
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...SUHANI PANDEY
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 

Recently uploaded (20)

Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
 
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
 
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...Bhosari ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready For ...
Bhosari ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready For ...
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
 
NFPA 5000 2024 standard .
NFPA 5000 2024 standard                                  .NFPA 5000 2024 standard                                  .
NFPA 5000 2024 standard .
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 

Kudu: Fast Analytics on Fast Data

  • 1. 1©  Cloudera,  Inc.  All  rights  reserved. Jean-­‐Daniel  Cryans – jdcryans@cloudera.com @jdcryans Kudu:  Fast  Analytics  on  Fast  Data
  • 2. 2©  Cloudera,  Inc.  All  rights  reserved. Why  Kudu? 2
  • 3. 3©  Cloudera,  Inc.  All  rights  reserved. Current  Storage  Landscape  in  Hadoop  Ecosystem HDFS (GFS)  excels  at: • Batch  ingest  only  (eg hourly) • Efficiently  scanning  large  amounts   of  data  (analytics) HBase (BigTable)  excels  at: • Efficiently  finding  and  writing   individual  rows • Making  data  mutable Gaps  exist  when  these  properties   are  needed  simultaneously
  • 4. 4©  Cloudera,  Inc.  All  rights  reserved. • High  throughput  for  big  scans Goal: Within  2x  of  Parquet • Low-­‐latency  for  short  accesses   Goal: 1ms  read/write  on  SSD • Database-­‐like semantics  (initially  single-­‐row   ACID) • Relational  data  model • SQL  queries  are  easy • “NoSQL”  style  scan/insert/update  (Java/C++  client) Kudu  Design  Goals
  • 5. 5©  Cloudera,  Inc.  All  rights  reserved. Changing  Hardware  landscape • Spinning  disk  -­‐>  solid  state  storage • NAND  flash:  Up  to  450k  read  250k  write  iops,  about  2GB/sec  read  and   1.5GB/sec  write  throughput,at  a  price  of  less  than  $3/GB  and  dropping • 3D  XPoint memory (1000x  faster  than  NAND,  cheaper  than  RAM) • RAM is  cheaper  and  more  abundant: • 64-­‐>128-­‐>256GB  over  last  few  years • Takeaway:  The next  bottleneck  is  CPU,  and  current  storage  systems  weren’t   designed  with  CPU  efficiency  in  mind.
  • 6. 6©  Cloudera,  Inc.  All  rights  reserved. What’s  Kudu? 6
  • 7. 7©  Cloudera,  Inc.  All  rights  reserved. Scalable  and  fast  tabular  storage • Scalable • Tested  up  to  275  nodes  (~3PB  cluster) • Designed  to  scale  to  1000s  of  nodes,  tens  of  PBs • Fast • Millions of  read/write  operations  per  second  across  cluster • Multiple  GB/second  read  throughput  per  node • Tabular • Store  tables  like  a  normal  database  (can  support  SQL,  Spark,  etc) • Individual  record-­‐level  access  to  100+  billion  row  tables  (Java/C++/Python  APIs)
  • 8. 8©  Cloudera,  Inc.  All  rights  reserved. Kudu  Usage • Table  has  a  SQL-­‐like  schema • Finite  number  of  columns  (unlike  HBase/Cassandra) • Types:  BOOL,  INT8,  INT16,  INT32,  INT64,  FLOAT,  DOUBLE,  STRING,  BINARY,   TIMESTAMP • Some  subset  of  columns  makes  up  a  possibly-­‐composite  primary  key • Fast  ALTER  TABLE • Java  and  C++  “NoSQL”  style  APIs • Insert(),  Update(),  Delete(),  Scan() • Integrations  with  MapReduce,  Spark,  and  Impala • Apache  Drill  work-­‐in-­‐progress 8
  • 9. 9©  Cloudera,  Inc.  All  rights  reserved. Use  cases  and  architectures
  • 10. 10©  Cloudera,  Inc.  All  rights  reserved. Kudu  Use  Cases Kudu  is  best  for  use  cases  requiring  a  simultaneous  combination  of sequential  and  random  reads  and  writes ● Time  Series ○ Examples:  Stream  market  data;  fraud  detection  &  prevention;  risk  monitoring ○ Workload:  Insert,  updates,  scans,  lookups ● Machine  Data  Analytics ○ Examples:  Network  threat  detection ○ Workload:  Inserts,  scans,  lookups ● Online  Reporting ○ Examples:  ODS ○ Workload:  Inserts,  updates,  scans,  lookups
  • 11. 11©  Cloudera,  Inc.  All  rights  reserved. Real-­‐Time  Analytics  in  Hadoop  Today Fraud  Detection  in  the  Real  World  =  Storage  Complexity Considerations: ● How  do  I  handle  failure   during   this  process? ● How  often  do  I  reorganize   data  streaming  in  into  a   format  appropriate  for   reporting? ● When  reporting,   how  do  I  see   data  that  has  not  yet  been   reorganized? ● How  do  I  ensure  that   important  jobs  aren’t   interrupted  by  maintenance? New  Partition Most  Recent  Partition Historic  Data HBase Parquet   File Have  we   accumulated   enough  data? Reorganize   HBase  file   into  Parquet • Wait  for  running  operations  to  complete   • Define  new  Impala  partition  referencing   the  newly  written  Parquet  file Incoming  Data   (Messaging   System) Reporting   Request Impala  on  HDFS
  • 12. 12©  Cloudera,  Inc.  All  rights  reserved. Real-­‐Time  Analytics  in  Hadoop  with  Kudu Improvements: ● One  system to  operate ● No  cron  jobs  or  background   processes ● Handle  late  arrivals  or  data   corrections  with  ease ● New  data  available   immediately  for  analytics  or   operations   Historical  and  Real-­‐time Data Incoming  Data   (Messaging   System) Reporting   Request Storage  in  Kudu
  • 13. 13©  Cloudera,  Inc.  All  rights  reserved. Xiaomi  use  case • World’s  4th largest  smart-­‐phone  maker  (most  popular  in  China) • Gather  important  RPC  tracing  events  from  mobile  app  and  backend  service.   • Service  monitoring  &  troubleshooting  tool. u High  write  throughput • >5  Billion  records/day  and  growing u Query  latest  data  and  quick  response • Identify  and  resolve  issues  quickly u Can  search  for  individual  records • Easy  for  troubleshooting
  • 14. 14©  Cloudera,  Inc.  All  rights  reserved. Xiaomi  Big  Data  Analytics  Pipeline Before  Kudu • Long  pipeline high  latency(1  hour  ~  1  day),  data  conversion  pains • No  ordering Log  arrival(storage)  order  not  exactly  logical  order e.g.  read  2-­‐3  days  of  log  for  data  in  1  day
  • 15. 15©  Cloudera,  Inc.  All  rights  reserved. Xiaomi  Big  Data  Analysis  Pipeline Simplified  With  Kudu • ETL  Pipeline(0~10s  latency) Apps  that  need  to  prevent  backpressure  or  require  ETL   • Direct  Pipeline(no  latency) Apps  that  don’t  require  ETL  and  no  backpressure  issues OLAP  scan Side  table  lookup Result  store
  • 16. 16©  Cloudera,  Inc.  All  rights  reserved. How  it  works Replication  and  fault  tolerance 16
  • 17. 17©  Cloudera,  Inc.  All  rights  reserved. Tables  and  Tablets • Table  is  horizontally  partitioned  into  tablets • Range or  hash partitioning • PRIMARY KEY (host, metric, timestamp) DISTRIBUTE BY HASH(timestamp) INTO 100 BUCKETS • bucketNumber = hashCode(row[‘timestamp’]) % 100 • Each  tablet  has  N  replicas  (3  or  5),  with  Raft consensus • Tablet  servers  host  tablets  on  local  disk  drives 17
  • 18. 18©  Cloudera,  Inc.  All  rights  reserved. Metadata • Replicated  master • Acts  as  a  tablet  directory • Acts  as  a  catalog  (which  tables  exist,  etc) • Acts  as  a  load  balancer  (tracks  TS  liveness,  re-­‐replicates  under-­‐replicated   tablets) • Caches  all  metadata in  RAM  for  high  performance • Client  configured  with  master  addresses • Asks  master  for  tablet  locations  as  needed  and  caches  them 18
  • 19. 19©  Cloudera,  Inc.  All  rights  reserved. Client Hey  Master!  Where  is  the  row  for   ‘tlipcon’  in  table  “T”? It’s  part  of  tablet  2,  which  is  on  servers  {Z,Y,X}.   BTW,  here’s  info  on  other  tablets  you  might   care  about:  T1,  T2,  T3,  … UPDATE  tlipcon SET  col=foo Meta  Cache T1:  … T2:  … T3:  …
  • 20. 20©  Cloudera,  Inc.  All  rights  reserved. Raft  consensus 20 TS  A Tablet  1 (LEADER) Client TS  B Tablet  1 (FOLLOWER) TS  C Tablet  1 (FOLLOWER) WAL WALWAL 2b.  Leader  writes  local  WAL 1a.  Client-­‐>Leader:  Write()  RPC 2a.  Leader-­‐>Followers:   UpdateConsensus()  RPC 3.  Follower:  write  WAL 4.  Follower-­‐>Leader:  success 3.  Follower:  write  WAL 5.  Leader  has  achieved  majority 6.  Leader-­‐>Client:  Success!
  • 21. 21©  Cloudera,  Inc.  All  rights  reserved. Fault  tolerance • Transient  failures: • Follower:  Leader  can  still  achieve  majority • Leader:  automatic  leader  election  (~5  seconds) • Restart  dead  TS  within  5  min  and  it  will  rejoin  transparently • Permanent  failures • After  5  minutes,  automatically  creates  a  new  follower  replica  and  copies  data • N  replicas  handle  (N-­‐1)/2  failures 21
  • 22. 22©  Cloudera,  Inc.  All  rights  reserved. How  it  works Columnar  storage 22
  • 23. 23©  Cloudera,  Inc.  All  rights  reserved. Columnar  storage {25059873,   22309487,   23059861,   23010982} Tweet_id {newsycbot,   RideImpala,   fastly,   llvmorg} User_name {1442865158,   1442828307,   1442865156,   1442865155} Created_at {Visual  exp…,   Introducing  ..,   Missing  July…,   LLVM  3.7….} text
  • 24. 24©  Cloudera,  Inc.  All  rights  reserved. Columnar  storage {25059873,   22309487,   23059861,   23010982} Tweet_id {newsycbot,   RideImpala,   fastly,   llvmorg} User_name {1442865158,   1442828307,   1442865156,   1442865155} Created_at {Visual  exp…,   Introducing  ..,   Missing  July…,   LLVM  3.7….} text SELECT  COUNT(*)  FROM  tweets  WHERE  user_name =  ‘newsycbot’; Only  read  1  column   1GB 2GB 1GB 200GB
  • 25. 25©  Cloudera,  Inc.  All  rights  reserved. Columnar  compression {1442865158,   1442828307,   1442865156,   1442865155} Created_at Created_at Diff(created_at) 1442865158 n/a 1442828307 -­‐36851 1442865156 36849 1442865155 -­‐1 64  bits  each 17  bits  each • Many  columns  can  compress  to a  few  bits  per  row! • Especially: • Timestamps • Time  series  values • Low-­‐cardinality  strings • Massive  space  savings  and throughput  increase!
  • 26. 26©  Cloudera,  Inc.  All  rights  reserved. Handling  inserts  and  updates • Inserts  go  to  an  in-­‐memory  row  store  (MemRowSet) • Durable  due  to  write-­‐ahead  logging • Later  flush  to  columnar  format  on  disk • Updates  go  to  in-­‐memory  “delta  store” • Later  flush  to  “delta  files”  on  disk • Eventually  “compact”  into  the  previously-­‐written  columnar  data  files • Details  elided  here  due  to  time  constraints • available  in  other  slide  decks  online,  or  come  find  me  later  to  learn  more!
  • 27. 27©  Cloudera,  Inc.  All  rights  reserved. Integrations
  • 28. 28©  Cloudera,  Inc.  All  rights  reserved. Spark  DataSource integration  (WIP) sqlContext.load("org.kududb.spark", Map("kudu.table" -> “foo”, "kudu.master" -> “master.example.com”)) .registerTempTable(“mytable”) df = sqlContext.sql( “select col_a, col_b from mytable “ + “where col_c = 123”) https://github.com/tmalaska/SparkOnKudu
  • 29. 29©  Cloudera,  Inc.  All  rights  reserved. Impala  integration • CREATE TABLE … DISTRIBUTE BY HASH(col1) INTO 16 BUCKETS AS SELECT … FROM … • INSERT/UPDATE/DELETE • Optimizations  like  predicate  pushdown,  scan  parallelism,  more  on  the   way • Not  an  Impala  user?  Apache  Drill  integration  in  the  works  by  Dremio
  • 30. 30©  Cloudera,  Inc.  All  rights  reserved. MapReduce  integration • Most  Kudu  integration/correctness  testing  via  MapReduce • Multi-­‐framework  cluster  (MR  +  HDFS  +  Kudu  on  the  same  disks) • KuduTableInputFormat /  KuduTableOutputFormat •Support  for  pushing  predicates,  column  projections,  etc
  • 31. 31©  Cloudera,  Inc.  All  rights  reserved. Performance 31
  • 32. 32©  Cloudera,  Inc.  All  rights  reserved. TPC-­‐H  (Analytics  benchmark) • 75  server  cluster • 12  (spinning)  disk  each,  enough  RAM  to  fit  dataset • TPC-­‐H  Scale  Factor  100  (100GB) • Example  query: • SELECT n_name, sum(l_extendedprice * (1 - l_discount)) as revenue FROM customer, orders, lineitem, supplier, nation, region WHERE c_custkey = o_custkey AND l_orderkey = o_orderkey AND l_suppkey = s_suppkey AND c_nationkey = s_nationkey AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA' AND o_orderdate >= date '1994-01-01' AND o_orderdate < '1995-01-01’ GROUP BY n_name ORDER BY revenue desc; 32
  • 33. 33©  Cloudera,  Inc.  All  rights  reserved. -­‐ Kudu  outperforms   Parquet  by  31%  (geometric  mean)  for  RAM-­‐resident  data
  • 34. 34©  Cloudera,  Inc.  All  rights  reserved. Versus  other  NoSQL  storage • Phoenix:  SQL  layer  on  HBase • 10  node  cluster  (9  worker,  1  master) • TPC-­‐H  LINEITEM  table  only  (6B  rows) 34 2152 219 76 131 0.04 1918 13.2 1.7 0.7 0.15 155 9.3 1.4 1.5 1.37 0.01 0.1 1 10 100 1000 10000 Load TPCH  Q1 COUNT(*) COUNT(*) WHERE… single-­‐row lookup Time  (sec) Phoenix Kudu Parquet
  • 35. 35©  Cloudera,  Inc.  All  rights  reserved. Xiaomi  benchmark • 6  real  queries  from  application  trace  analysis  application • Q1:  SELECT  COUNT(*) • Q2:  SELECT  hour,  COUNT(*)  WHERE  module  =  ‘foo’  GROUP  BY  HOUR • Q3:  SELECT  hour,  COUNT(DISTINCT  uid)  WHERE  module  =  ‘foo’  AND  app=‘bar’   GROUP  BY  HOUR • Q4:  analytics  on  RPC  success  rate  over  all  data  for  one  app • Q5:  same  as  Q4,  but  filter  by  time  range • Q6:  SELECT  *  WHERE  app  =  …  AND  uid =  …  ORDER  BY  ts LIMIT  30  OFFSET  30
  • 36. 36©  Cloudera,  Inc.  All  rights  reserved. Xiaomi:  Benchmark  Results 1.4   2.0   2.3   3.1   1.3   0.9  1.3   2.8   4.0   5.7   7.5   16.7   Q1 Q2 Q3 Q4 Q5 Q6 kudu parquet Query  latency: *  HDFS  parquet  file  replication  =  3  ,  kudu  table  replication  =  3 *  Each  query  run  5  times  then  take  average
  • 37. 37©  Cloudera,  Inc.  All  rights  reserved. What  about  NoSQL-­‐style  random  access?  (YCSB) • YCSB 0.5.0-­‐snapshot • 10  node  cluster (9  worker,  1  master) • 100M  row  data  set • 10M  operations  each workload 37
  • 38. 38©  Cloudera,  Inc.  All  rights  reserved. Getting  started 38
  • 39. 39©  Cloudera,  Inc.  All  rights  reserved. Project  status • Open  source  beta  released  in  September  2015 • First  update  version  0.6.0  released  last  month • Usable  for  many  applications • Have  not  experienced  data  loss,  reasonably  stable  (almost  no  crashes  reported) • Still  requires  some  expert  assistance,  and  you’ll  probably  find  some  bugs • Officially  now  known  as  Apache  Kudu(incubating)!
  • 40. 40©  Cloudera,  Inc.  All  rights  reserved. Kudu  Community Your  company  here!
  • 41. 41©  Cloudera,  Inc.  All  rights  reserved. Getting  started  as  a  user • http://getkudu.io • kudu-­‐user@googlegroups.com • http://getkudu-­‐slack.herokuapp.com/ • Quickstart VM • Easiest  way  to  get  started • Impala  and  Kudu  in  an  easy-­‐to-­‐install  VM • CSD  and  Parcels • For  installation  on  a  Cloudera  Manager-­‐managed  cluster 41
  • 42. 42©  Cloudera,  Inc.  All  rights  reserved. Getting  started  as  a  developer • http://github.com/cloudera/kudu • All  commits  go  here  first • Public  gerrit:  http://gerrit.cloudera.org • All  code  reviews  happening  here • Public  JIRA:  http://issues.cloudera.org • Includes  bugs  going  back  to  2013.  Come  see  our  dirty  laundry! • kudu-­‐dev@googlegroups.com • Apache  2.0  license  open  source • Contributions  are  welcome  and  encouraged! 42
  • 43. 43©  Cloudera,  Inc.  All  rights  reserved. http://getkudu.io/ @apachekudu
  • 44. 44©  Cloudera,  Inc.  All  rights  reserved. How  it  works Write  and  read  paths 44
  • 45. 45©  Cloudera,  Inc.  All  rights  reserved. Kudu  storage  – Inserts  and  Flushes 45 MemRowSet INSERT(“todd”,  “$1000”,”engineer”) name pay role DiskRowSet 1 flush
  • 46. 46©  Cloudera,  Inc.  All  rights  reserved. Kudu  storage  – Inserts  and  Flushes 46 MemRowSet name pay role DiskRowSet 1 name pay role DiskRowSet 2 INSERT(“doug”,  “$1B”,  “Hadoop  man”) flush
  • 47. 47©  Cloudera,  Inc.  All  rights  reserved. Kudu  storage  -­‐ Updates 47 MemRowSet name pay role DiskRowSet 1 name pay role DiskRowSet 2 Delta  MS Delta  MS Each  DiskRowSet has  its  own   DeltaMemStore to   accumulate  updates base  data base  data
  • 48. 48©  Cloudera,  Inc.  All  rights  reserved. Kudu  storage  -­‐ Updates 48 MemRowSet name pay role DiskRowSet 1 name pay role DiskRowSet 2 Delta  MS Delta  MS UPDATE  set  pay=“$1M”   WHERE  name=“todd” Is  the  row  in  DiskRowSet 2? (check  bloom  filters) Is  the  row  in  DiskRowSet 1? (check  bloom  filters) Bloom  says:  no! Bloom  says:  maybe! Search  key  column  to  find   offset:  rowid =  150 150:  pay=$1M @  time  T base  data
  • 49. 49©  Cloudera,  Inc.  All  rights  reserved. Kudu  storage  – Read  path 49 MemRowSet name pay role DiskRowSet 1 name pay role DiskRowSet 2 Delta  MS Delta  MS 150:  pay=$1M   @  time  T Read  rows  in  DiskRowSet 2 Then,  read  rows  in   DiskRowSet 1 Updates  are  applied  based  on  ordinal   offset  within  DRS:  array  indexing  =  fast base  data base  data
  • 50. 50©  Cloudera,  Inc.  All  rights  reserved. Kudu  storage  – Delta  flushes 50 MemRowSet name pay role DiskRowSet 1 name pay role DiskRowSet 2 Delta  MS Delta  MS 150:  pay=$1M   @  time  T REDO  DeltaFile Flush A  REDO delta  indicates  how  to   transform  between  the  ‘base  data’   (columnar)  and  a  later  version base  data base  data
  • 51. 51©  Cloudera,  Inc.  All  rights  reserved. Kudu  storage  – Major  delta  compaction 51 name pay role DiskRowSet(pre-­‐compaction) Delta  MS REDO  DeltaFile REDO  DeltaFile REDO  DeltaFile Many  deltas  accumulate:  lots  of  delta  application   work  on  reads Merge  updates  for  columns  with  high  update   percentage name pay role DiskRowSet(post-­‐compaction) Delta  MS Unmerged  REDO   deltasUNDO  deltas base  data If  a  column  has  few  updates,  doesn’t  need  to  be  re-­‐ written:  those  deltas  maintained  in  new  DeltaFile
  • 52. 52©  Cloudera,  Inc.  All  rights  reserved. Kudu  storage  – RowSet Compactions DRS  1  (32MB) [PK=alice],                              [PK=joe],                            [PK=linda],                      [PK=zach] DRS  2  (32MB) [PK=bob],                              [PK=jon],                            [PK=mary]                  [PK=zeke] DRS  3 (32MB) [PK=carl],                              [PK=julie],                        [PK=omar]                [PK=zoe] DRS  4  (32MB) DRS  5  (32MB) DRS  6  (32MB) [alice,  bob,  carl,   joe] [jon,  julie,  linda,   mary] [omar,  zach,  zeke,   zoe] Reorganize  rows  to  avoid  rowsets with  overlapping  key  ranges