© Cloudera, Inc. All rights reserved. 1
Apache Kudu*: Fast Analytics on Fast Data
Todd Lipcon (Kudu team lead) – todd@cloudera.com
@tlipcon
Tweet about this talk: @apachekudu or #kudu
* Incubating at the Apache Software Foundation
Apache Kudu
Storage for Fast Analytics on Fast Data
• New updatable column store for Hadoop
• Apache-licensed open source
• Beta now available
[Diagram labels: Columnar Store; Kudu]
Why Kudu?
Current Storage Landscape in Hadoop Ecosystem
HDFS (GFS) excels at:
• Batch ingest only (e.g., hourly)
• Efficiently scanning large amounts of data (analytics)
HBase (BigTable) excels at:
• Efficiently finding and writing individual rows
• Making data mutable
Gaps exist when these properties are needed simultaneously
Kudu Design Goals
• High throughput for big scans
Goal: Within 2x of Parquet
• Low latency for short accesses
Goal: 1ms read/write on SSD
• Database-like semantics (initially single-row ACID)
• Relational data model
• SQL queries are easy
• “NoSQL” style scan/insert/update (Java/C++ client)
Changing Hardware Landscape
• Spinning disk -> solid state storage
• NAND flash: Up to 450k read and 250k write IOPS, about 2GB/sec read and 1.5GB/sec write throughput, at a price of less than $3/GB and dropping
• 3D XPoint memory (1000x faster than NAND, cheaper than RAM)
• RAM is cheaper and more abundant:
• 64->128->256GB over last few years
• Takeaway: The next bottleneck is CPU, and current storage systems weren’t designed with CPU efficiency in mind.
What’s Kudu?
Scalable and Fast Tabular Storage
• Scalable
• Tested up to 275 nodes (~3PB cluster)
• Designed to scale to 1000s of nodes, tens of PBs
• Fast
• Millions of read/write operations per second across cluster
• Multiple GB/second read throughput per node
• Tabular
• SQL-like schema: finite number of typed columns (unlike HBase/Cassandra)
• Fast ALTER TABLE
• “NoSQL” APIs: Java/C++/Python or SQL (Impala/Spark/etc)
Use cases and architectures
Kudu Use Cases
Kudu is best for use cases requiring a simultaneous combination of sequential and random reads and writes
● Time Series
○ Examples: Stream market data; fraud detection & prevention; network monitoring
○ Workload: Inserts, updates, scans, lookups
● Online Reporting
○ Examples: ODS
○ Workload: Inserts, updates, scans, lookups
Real-Time Analytics in Hadoop Today
Fraud Detection in the Real World = Storage Complexity
Considerations:
● How do I handle failure during this process?
● How often do I reorganize incoming streaming data into a format appropriate for reporting?
● When reporting, how do I see data that has not yet been reorganized?
● How do I ensure that important jobs aren’t interrupted by maintenance?
[Diagram labels: Incoming Data (Messaging System); HBase; Parquet File; New Partition; Most Recent Partition; Historic Data; Reporting Request; Impala on HDFS]
Have we accumulated enough data? If so, reorganize HBase file into Parquet:
• Wait for running operations to complete
• Define new Impala partition referencing the newly written Parquet file
Real-Time Analytics in Hadoop with Kudu
Improvements:
● One system to operate
● No cron jobs or background processes
● Handle late arrivals or data corrections with ease
● New data available immediately for analytics or operations
[Diagram labels: Incoming Data (Messaging System); Storage in Kudu; Historical and Real-time Data; Reporting Request]
Xiaomi Use Case
• World’s 4th largest smartphone maker (most popular in China)
• Gathers important RPC tracing events from mobile apps and backend services
• Service monitoring & troubleshooting tool
✓ High write throughput
• >5 billion records/day and growing
✓ Query latest data with quick response
• Identify and resolve issues quickly
✓ Can search for individual records
• Easy for troubleshooting
Xiaomi Big Data Analytics Pipeline
Before Kudu
• Long pipeline
High latency (1 hour to 1 day), data conversion pains
• No ordering
Log arrival (storage) order is not exactly the logical order,
e.g. must read 2-3 days of logs to get the data for 1 day
Xiaomi Big Data Analytics Pipeline
Simplified With Kudu
• ETL pipeline (0~10s latency)
Apps that need to prevent backpressure or require ETL
• Direct pipeline (no latency)
Apps that don’t require ETL and have no backpressure issues
[Diagram labels: OLAP scan; Side table lookup; Result store]
How it Works
Replication and fault tolerance
Tables, Tablets, and Tablet Servers
• Table is horizontally partitioned into tablets
• Range or hash partitioning
• PRIMARY KEY (host, metric, timestamp) DISTRIBUTE BY HASH(timestamp) INTO 100 BUCKETS
• bucketNumber = hashCode(row['timestamp']) % 100
• Each tablet has N replicas (3 or 5), with Raft consensus
• Automatic fault tolerance
• MTTR: ~5 seconds
• Tablet servers host tablets on local disk drives
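The hash partitioning rule on this slide can be sketched in a few lines of Python. This is only an illustration: `zlib.crc32` stands in for whatever hash function Kudu actually uses, and the names `bucket_for` and `NUM_BUCKETS` are made up for the example.

```python
import zlib

NUM_BUCKETS = 100  # mirrors "INTO 100 BUCKETS" on the slide


def bucket_for(timestamp: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Return the bucket (tablet) number for a row's hashed column value."""
    # Assumption: crc32 is a stand-in hash, not Kudu's real hash function.
    digest = zlib.crc32(str(timestamp).encode("utf-8"))
    return digest % num_buckets


# Rows with the slide's primary key shape: (host, metric, timestamp).
rows = [
    ("host1", "cpu", 1442865158),
    ("host1", "cpu", 1442865156),
    ("host2", "mem", 1442828307),
]
for host, metric, ts in rows:
    print(host, metric, ts, "-> bucket", bucket_for(ts))
```

Because the bucket depends only on the hashed column, rows with nearby timestamps tend to land in different tablets, which spreads write load across tablet servers.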
Metadata and the Master
• Replicated master
• Acts as a tablet directory
• Acts as a catalog (which tables exist, etc)
• Acts as a load balancer (tracks tablet server (TS) liveness, re-replicates under-replicated tablets)
• Not a bottleneck
• Super-fast in-memory lookups
Client -> Master: Hey Master! Where is the row for 'tlipcon' in table "T"?
Master -> Client: It’s part of tablet 2, which is on servers {Z,Y,X}. BTW, here’s info on other tablets you might care about: T1, T2, T3, …
Client: UPDATE tlipcon SET col=foo
Client Meta Cache: T1: …, T2: …, T3: …
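The exchange above (ask the master once, then answer later lookups from the client's meta cache) can be sketched as follows. Everything here is a toy model: the `Master` and `Client` classes, the range boundaries `""`, `"m"`, `"u"`, and the server names for T1 and T3 are assumptions made for illustration, not Kudu's client API.

```python
import bisect


class Master:
    """Toy tablet directory: for each table, a sorted list of tablets."""

    def __init__(self, directory):
        # directory: {table: [(start_key, tablet_id, servers), ...]},
        # with each list sorted by start_key.
        self.directory = directory
        self.lookups = 0  # counts round trips to the master

    def locate(self, table):
        self.lookups += 1
        # Return info on all tablets, like the "BTW" in the slide.
        return list(self.directory[table])


class Client:
    def __init__(self, master):
        self.master = master
        self.meta_cache = {}  # table -> cached tablet list

    def find_tablet(self, table, key):
        if table not in self.meta_cache:  # cache miss: ask the master
            self.meta_cache[table] = self.master.locate(table)
        tablets = self.meta_cache[table]
        starts = [start for start, _, _ in tablets]
        # Last tablet whose start key is <= the requested key.
        idx = bisect.bisect_right(starts, key) - 1
        return tablets[idx]


master = Master({"T": [("", "T1", ["A", "B", "C"]),
                       ("m", "T2", ["Z", "Y", "X"]),
                       ("u", "T3", ["D", "E", "F"])]})
client = Client(master)
first = client.find_tablet("T", "tlipcon")   # goes to the master
second = client.find_tablet("T", "adam")     # served from the meta cache
print(first, second, "master lookups:", master.lookups)
```

Only the first call reaches the master, which is one reason the master is not a bottleneck for reads and writes.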
Raft Consensus
[Diagram: Client; TS A hosts Tablet 1 (LEADER); TS B and TS C host Tablet 1 (FOLLOWER); each replica has its own WAL]
1a. Client->Leader: Write() RPC
2a. Leader->Followers: UpdateConsensus() RPC
2b. Leader writes local WAL
3. Follower: write WAL (on each follower)
4. Follower->Leader: success
5. Leader has achieved majority
6. Leader->Client: Success!
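The majority rule in the steps above can be sketched as below. This is a deliberately simplified model under stated assumptions: it runs sequentially, and it leaves out terms, leader election, and log matching, all of which a real Raft implementation needs. The `Replica` class and `replicate_write` function are hypothetical names.

```python
class Replica:
    """One copy of a tablet with its own write-ahead log (WAL)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.wal = []

    def append(self, op):
        """Append an operation to the local WAL; return True on success."""
        if not self.healthy:
            return False
        self.wal.append(op)
        return True


def replicate_write(leader, followers, op):
    """Return True once a majority of replicas have logged the operation."""
    acks = 1 if leader.append(op) else 0       # step 2b: leader's own WAL
    for follower in followers:                 # steps 2a, 3, 4
        if follower.append(op):
            acks += 1
    majority = (1 + len(followers)) // 2 + 1   # step 5: 2 of 3, 3 of 5
    return acks >= majority                    # step 6: reply to the client


leader = Replica("TS A")
followers = [Replica("TS B"), Replica("TS C", healthy=False)]
ok = replicate_write(leader, followers, "UPDATE tlipcon SET col=foo")
print("write acknowledged:", ok)
```

With three replicas the write still succeeds when one follower is down, which is the fault tolerance the slide describes.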
How it Works
Columnar storage
Columnar Storage
Tweet_id: {25059873, 22309487, 23059861, 23010982}
User_name: {newsycbot, RideImpala, fastly, llvmorg}
Created_at: {1442865158, 1442828307, 1442865156, 1442865155}
text: {Visual exp…, Introducing .., Missing July…, LLVM 3.7….}
SELECT COUNT(*) FROM tweets WHERE user_name = 'newsycbot';
Only read 1 column
Column sizes: Tweet_id 1GB, User_name 2GB, Created_at 1GB, text 200GB
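A small sketch of why the query above touches only one column: with data stored column by column, the filter can scan `user_name` alone and never load `text`. The dictionary layout and the helper `count_where_equals` are illustrative assumptions, not Kudu's on-disk format.

```python
# Column-oriented layout: one list per column, values aligned by position.
columns = {
    "tweet_id": [25059873, 22309487, 23059861, 23010982],
    "user_name": ["newsycbot", "RideImpala", "fastly", "llvmorg"],
    "created_at": [1442865158, 1442828307, 1442865156, 1442865155],
    "text": ["Visual exp...", "Introducing ..", "Missing July...", "LLVM 3.7..."],
}


def count_where_equals(table, column, value):
    """COUNT(*) with an equality predicate, reading only the filtered column."""
    touched = {column}  # columns actually read by this query
    matches = sum(1 for v in table[column] if v == value)
    return matches, touched


count, touched = count_where_equals(columns, "user_name", "newsycbot")
print(count, touched)
```

In a row-oriented layout the same query would have to read every row in full, including the large `text` values.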
Columnar Compression
Created_at: {1442865158, 1442828307, 1442865156, 1442865155}

Created_at      Diff(created_at)
1442865158      n/a
1442828307      -36851
1442865156      36849
1442865155      -1
64 bits each    17 bits each

• Many columns can compress to a few bits per row!
• Especially:
• Timestamps
• Time series values
• Low-cardinality strings
• Massive space savings and throughput increase!
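The delta arithmetic in the table above can be checked with a short sketch. It is a generic delta encoder written for illustration, not Kudu's actual encoding; the bit width is computed as magnitude bits plus one sign bit.

```python
created_at = [1442865158, 1442828307, 1442865156, 1442865155]


def delta_encode(values):
    """Keep the first value, then store each value as a diff from its predecessor."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    return values[0], deltas


def delta_decode(first, deltas):
    """Rebuild the original values by accumulating the diffs."""
    out = [first]
    for d in deltas:
        out.append(out[-1] + d)
    return out


def signed_bits_needed(deltas):
    """Bits per delta: enough for the largest magnitude, plus a sign bit."""
    return max(abs(d) for d in deltas).bit_length() + 1


first, deltas = delta_encode(created_at)
print(deltas)                      # [-36851, 36849, -1]
print(signed_bits_needed(deltas))  # 17
assert delta_decode(first, deltas) == created_at
```

The largest diff, 36851, needs 16 bits of magnitude, so each delta fits in 17 bits instead of the 64 bits of the raw timestamp.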
Handling Inserts and Updates
• Inserts go to an in-memory row store (MemRowSet)
• Durable due to write-ahead logging
• Later flush to columnar format on disk
• Updates go to in-memory “delta store”
• Later flush to “delta files” on disk
• Eventually “compact” into the previously-written columnar data files
• Details elided here due to time constraints
• available in other slide decks online, or come to office hours to learn more!
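As a rough illustration of the insert path described on this slide, the sketch below logs each insert, buffers it in an in-memory row store, and flushes to a column-oriented structure once a threshold is reached. The `Tablet` class, the `flush_threshold` parameter, and the in-memory "disk" are all assumptions made for the example.

```python
class Tablet:
    """Toy write path: WAL, in-memory row store, flush to columnar row sets."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self.wal = []             # write-ahead log, for durability
        self.mem_row_set = {}     # primary key -> row dict
        self.disk_row_sets = []   # each entry: {column name: list of values}

    def insert(self, key, row):
        self.wal.append(("INSERT", key, row))  # log first, then apply
        self.mem_row_set[key] = row
        if len(self.mem_row_set) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Rewrite buffered rows as columns, sorted by primary key."""
        keys = sorted(self.mem_row_set)
        names = sorted({c for row in self.mem_row_set.values() for c in row})
        columnar = {"key": keys}
        for name in names:
            columnar[name] = [self.mem_row_set[k].get(name) for k in keys]
        self.disk_row_sets.append(columnar)
        self.mem_row_set = {}


tablet = Tablet(flush_threshold=2)
tablet.insert("todd", {"pay": "$1000", "role": "engineer"})
tablet.insert("doug", {"pay": "$1B", "role": "Hadoop man"})
print(tablet.disk_row_sets)
```

Updates follow a parallel path through a delta store rather than rewriting the flushed columns in place.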
Integrations
Spark DataSource Integration (WIP)
// Register the Kudu table "foo" as a Spark SQL temp table
sqlContext.load("org.kududb.spark",
  Map("kudu.table" -> "foo",
      "kudu.master" -> "master.example.com"))
  .registerTempTable("mytable")
// Query it with ordinary Spark SQL
val df = sqlContext.sql(
  "select col_a, col_b from mytable " +
  "where col_c = 123")
Available in Kudu 0.7.0, but still being improved
Impala Integration
• CREATE TABLE … DISTRIBUTE BY HASH(col1) INTO 16 BUCKETS AS SELECT … FROM …
• INSERT/UPDATE/DELETE
• Optimizations like predicate pushdown, scan parallelism, more on the way
• Not an Impala user? Community working on other integrations (Hive, Drill, Presto, Phoenix)
MapReduce Integration
• Multi-framework cluster (MR + HDFS + Kudu on the same disks)
• KuduTableInputFormat / KuduTableOutputFormat
• Support for pushing predicates, column projections, etc
Performance
TPC-H (Analytics benchmark)
• 75 server cluster
• 12 (spinning) disks each, enough RAM to fit dataset
• TPC-H Scale Factor 100 (100GB)
• Example query:
• SELECT n_name, sum(l_extendedprice * (1 - l_discount)) AS revenue FROM customer,
orders, lineitem, supplier, nation, region WHERE c_custkey = o_custkey AND
l_orderkey = o_orderkey AND l_suppkey = s_suppkey AND c_nationkey = s_nationkey
AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA'
AND o_orderdate >= date '1994-01-01' AND o_orderdate < date '1995-01-01' GROUP BY
n_name ORDER BY revenue DESC;
- Kudu outperforms Parquet by 31% (geometric mean) for RAM-resident data
Versus other NoSQL Storage
• Phoenix: SQL layer on HBase
• 10 node cluster (9 worker, 1 master)
• TPC-H LINEITEM table only (6B rows)
[Bar chart, log scale: Time (sec), lower is better; values as labeled on the chart]
                     Phoenix   Kudu    Parquet
Load                 2152      1918    155
TPCH Q1              219       13.2    9.3
COUNT(*)             76        1.7     1.4
COUNT(*) WHERE…      131       0.7     1.5
single-row lookup    0         0.2     1.4
What about NoSQL-style Random Access? (YCSB)
• YCSB 0.5.0-snapshot
• 10 node cluster (9 worker, 1 master)
• 100M row data set
• 10M operations per workload
Getting started
Project Status
• Open source beta released in September
• Latest release 0.7.1 hot off the presses
• Usable for many applications
• Have not experienced unrecoverable data loss, reasonably stable (almost no
crashes reported). Users testing up to 200 nodes so far.
• Still requires some expert assistance, and you’ll probably find some bugs
• Part of the Apache Software Foundation Incubator
• Community-driven open source process
Apache Kudu Community
Getting Started As a User
• http://getkudu.io
• user@kudu.incubator.apache.org
• http://getkudu-slack.herokuapp.com/
• Quickstart VM
• Easiest way to get started
• Impala and Kudu in an easy-to-install VM
• CSD and Parcels
• For installation on a Cloudera Manager-managed cluster
Getting Started As a Developer
• http://github.com/apache/incubator-kudu
• Code reviews: http://gerrit.cloudera.org
• Public JIRA: http://issues.apache.org/jira/browse/KUDU
• Includes bugs going back to 2013. Come see our dirty laundry!
• Mailing list: dev@kudu.incubator.apache.org
• Apache 2.0 license open source
• Contributions are welcome and encouraged!
http://getkudu.io/
@getkudu
Backup slides
How it works
Write and read paths
Kudu storage – Inserts and Flushes
INSERT("todd", "$1000", "engineer")
[Diagram: MemRowSet -> flush -> DiskRowSet 1 (columns: name, pay, role)]
Kudu storage – Inserts and Flushes
INSERT("doug", "$1B", "Hadoop man")
[Diagram: MemRowSet -> flush -> DiskRowSet 2; DiskRowSet 1 and DiskRowSet 2 each hold columns name, pay, role]
Kudu storage - Updates
[Diagram: MemRowSet; DiskRowSet 1 and DiskRowSet 2, each with base data (columns: name, pay, role) and a Delta MS]
Each DiskRowSet has its own DeltaMemStore to accumulate updates
Kudu storage - Updates
[Diagram: MemRowSet; DiskRowSet 1 and DiskRowSet 2, each with base data (columns: name, pay, role) and a Delta MS]
UPDATE set pay="$1M" WHERE name="todd"
1. Is the row in DiskRowSet 2? (check bloom filters) Bloom says: no!
2. Is the row in DiskRowSet 1? (check bloom filters) Bloom says: maybe!
3. Search key column to find offset: rowid = 150
4. Delta MS records: 150: pay=$1M @ time T
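The update routing above can be sketched with a tiny Bloom filter per DiskRowSet. This is a simplified model: the `BloomFilter` sizing, the SHA-256 based hashing, and the class and function names are assumptions for illustration, and the MemRowSet check is omitted.

```python
import hashlib
from bisect import bisect_left


class BloomFilter:
    """Minimal Bloom filter: may say 'maybe' wrongly, never says 'no' wrongly."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in one Python integer

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class DiskRowSet:
    def __init__(self, name, sorted_keys):
        self.name = name
        self.keys = sorted_keys       # key column, sorted
        self.delta_mem_store = []     # (rowid, column, value, timestamp)
        self.bloom = BloomFilter()
        for key in sorted_keys:
            self.bloom.add(key)

    def find_rowid(self, key):
        if not self.bloom.might_contain(key):
            return None               # "Bloom says: no!" - skip the key search
        i = bisect_left(self.keys, key)   # "maybe": search the key column
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None                   # Bloom false positive


def apply_update(row_sets, key, column, value, timestamp):
    """Record the update in the DeltaMemStore of the row set holding the key."""
    for drs in row_sets:
        rowid = drs.find_rowid(key)
        if rowid is not None:
            drs.delta_mem_store.append((rowid, column, value, timestamp))
            return drs.name, rowid
    return None


drs1 = DiskRowSet("DiskRowSet 1", ["alice", "todd"])
drs2 = DiskRowSet("DiskRowSet 2", ["doug"])
print(apply_update([drs2, drs1], "todd", "pay", "$1M", timestamp=1))
```

The base data is never modified here; the update is stored as a delta keyed by the row's ordinal offset.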
Kudu storage – Read path
[Diagram: MemRowSet; DiskRowSet 1 and DiskRowSet 2, each with base data (columns: name, pay, role) and a Delta MS; one Delta MS holds 150: pay=$1M @ time T]
Read rows in DiskRowSet 2
Then, read rows in DiskRowSet 1
Updates are applied based on ordinal offset within DRS: array indexing = fast
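Applying deltas by ordinal offset, as this slide describes, amounts to array indexing. The sketch below is illustrative only: the tuple layout `(rowid, column, value, timestamp)` and the `as_of` snapshot parameter are assumptions that model the "@ time T" annotation.

```python
def scan_column(base_column, deltas, column, as_of):
    """Return one column's values with deltas up to time `as_of` applied."""
    result = list(base_column)  # copy so the base data stays untouched
    # Apply deltas in timestamp order so later updates win.
    for rowid, col, value, ts in sorted(deltas, key=lambda d: d[3]):
        if col == column and ts <= as_of:
            result[rowid] = value  # ordinal offset = plain list index
    return result


base_pay = ["$500", "$1000"]       # base data, rowids 0 and 1
deltas = [(1, "pay", "$1M", 10)]   # rowid 1 updated at time 10

print(scan_column(base_pay, deltas, "pay", as_of=10))  # sees the update
print(scan_column(base_pay, deltas, "pay", as_of=5))   # snapshot before it
```

No key comparison is needed on the read path, since the delta already names the row by position.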
Kudu storage – Delta flushes
[Diagram: MemRowSet; DiskRowSet 1 and DiskRowSet 2, each with base data (columns: name, pay, role) and a Delta MS; the Delta MS entry 150: pay=$1M @ time T -> Flush -> REDO DeltaFile]
A REDO delta indicates how to transform between the 'base data' (columnar) and a later version
Kudu storage – Major delta compaction
DiskRowSet (pre-compaction): base data (columns: name, pay, role), Delta MS, three REDO DeltaFiles
Many deltas accumulate: lots of delta application work on reads
Merge updates for columns with high update percentage
DiskRowSet (post-compaction): base data (columns: name, pay, role), Delta MS, unmerged REDO deltas, UNDO deltas
If a column has few updates, it doesn’t need to be rewritten: those deltas are maintained in a new DeltaFile
Kudu storage – RowSet Compactions
Before compaction:
DRS 1 (32MB): [PK=alice], [PK=joe], [PK=linda], [PK=zach]
DRS 2 (32MB): [PK=bob], [PK=jon], [PK=mary], [PK=zeke]
DRS 3 (32MB): [PK=carl], [PK=julie], [PK=omar], [PK=zoe]
After compaction:
DRS 4 (32MB): [alice, bob, carl, joe]
DRS 5 (32MB): [jon, julie, linda, mary]
DRS 6 (32MB): [omar, zach, zeke, zoe]
Reorganize rows to avoid rowsets with overlapping key ranges
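The reorganization shown here is a merge of sorted inputs followed by re-chunking. A minimal sketch, assuming each rowset is a sorted list of primary keys and using a fixed row count per output rowset in place of the 32MB size target:

```python
import heapq


def compact(row_sets, rows_per_set):
    """Merge sorted rowsets and split the result into non-overlapping rowsets."""
    merged = list(heapq.merge(*row_sets))  # k-way merge of sorted inputs
    return [merged[i:i + rows_per_set]
            for i in range(0, len(merged), rows_per_set)]


drs1 = ["alice", "joe", "linda", "zach"]
drs2 = ["bob", "jon", "mary", "zeke"]
drs3 = ["carl", "julie", "omar", "zoe"]

for row_set in compact([drs1, drs2, drs3], rows_per_set=4):
    print(row_set)
```

Before compaction a lookup for any key may have to consult all three rowsets; afterwards each key falls in the range of exactly one.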

 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 

Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data by Todd Lipcon, Software Engineer, Cloudera / Kudu Founder

  • 1. © Cloudera, Inc. All rights reserved. 1 Todd Lipcon (Kudu team lead) – todd@cloudera.com @tlipcon Tweet about this talk: @apachekudu or #kudu * Incubating at the Apache Software Foundation Apache Kudu*: Fast Analytics on Fast Data
  • 2. Apache Kudu: Storage for Fast Analytics on Fast Data • New updatable column store for Hadoop • Apache-licensed open source • Beta now available
  • 3. Why Kudu?
  • 4. Current Storage Landscape in Hadoop Ecosystem. HDFS (GFS) excels at: • Batch ingest only (e.g., hourly) • Efficiently scanning large amounts of data (analytics). HBase (BigTable) excels at: • Efficiently finding and writing individual rows • Making data mutable. Gaps exist when these properties are needed simultaneously.
  • 5. Kudu Design Goals • High throughput for big scans (goal: within 2x of Parquet) • Low latency for short accesses (goal: 1 ms read/write on SSD) • Database-like semantics (initially single-row ACID) • Relational data model • SQL queries are easy • “NoSQL” style scan/insert/update (Java/C++ client)
  • 6. Changing Hardware Landscape • Spinning disk -> solid-state storage • NAND flash: up to 450k read and 250k write IOPS, about 2 GB/sec read and 1.5 GB/sec write throughput, at a price of less than $3/GB and dropping • 3D XPoint memory (1000x faster than NAND, cheaper than RAM) • RAM is cheaper and more abundant: 64 -> 128 -> 256 GB over the last few years • Takeaway: the next bottleneck is CPU, and current storage systems weren’t designed with CPU efficiency in mind.
  • 7. What’s Kudu?
  • 8. Scalable and Fast Tabular Storage • Scalable • Tested up to 275 nodes (~3PB cluster) • Designed to scale to 1000s of nodes, tens of PBs • Fast • Millions of read/write operations per second across cluster • Multiple GB/second read throughput per node • Tabular • SQL-like schema: finite number of typed columns (unlike HBase/Cassandra) • Fast ALTER TABLE • “NoSQL” APIs: Java/C++/Python or SQL (Impala/Spark/etc)
  • 9. Use cases and architectures
  • 10. Kudu Use Cases: Kudu is best for use cases requiring a simultaneous combination of sequential and random reads and writes ● Time Series ○ Examples: streaming market data; fraud detection & prevention; network monitoring ○ Workload: inserts, updates, scans, lookups ● Online Reporting ○ Examples: ODS (operational data store) ○ Workload: inserts, updates, scans, lookups
  • 11. Real-Time Analytics in Hadoop Today: Fraud Detection in the Real World = Storage Complexity. Diagram labels: Incoming Data (Messaging System) -> HBase (New Partition, Most Recent Partition) -> “Have we accumulated enough data?” -> Reorganize HBase file into Parquet (• Wait for running operations to complete • Define new Impala partition referencing the newly written Parquet file) -> Parquet File (Historic Data); Reporting Request -> Impala on HDFS. Considerations: ● How do I handle failure during this process? ● How often do I reorganize data streaming in into a format appropriate for reporting? ● When reporting, how do I see data that has not yet been reorganized? ● How do I ensure that important jobs aren’t interrupted by maintenance?
  • 12. Real-Time Analytics in Hadoop with Kudu Improvements: ● One system to operate ● No cron jobs or background processes ● Handle late arrivals or data corrections with ease ● New data available immediately for analytics or operations Historical and Real-time Data Incoming Data (Messaging System) Reporting Request Storage in Kudu
  • 13. Xiaomi Use Case • World’s 4th largest smartphone maker (most popular in China) • Gathers important RPC tracing events from mobile apps and backend services • Service monitoring & troubleshooting tool • High write throughput: >5 billion records/day and growing • Query latest data with quick response: identify and resolve issues quickly • Can search for individual records: easy troubleshooting
  • 14. Xiaomi Big Data Analytics Pipeline Before Kudu • Long pipeline: high latency (1 hour to 1 day), data conversion pains • No ordering: log arrival (storage) order is not exactly the logical order, e.g. reading 2-3 days of logs to get the data for 1 day
  • 15. Xiaomi Big Data Analysis Pipeline Simplified With Kudu • ETL pipeline (0-10 s latency): apps that need to prevent backpressure or require ETL • Direct pipeline (no latency): apps that don’t require ETL and have no backpressure issues • Diagram labels: OLAP scan, side table lookup, result store
  • 16. How it Works: Replication and fault tolerance
  • 17. Tables, Tablets, and Tablet Servers • Table is horizontally partitioned into tablets • Range or hash partitioning • PRIMARY KEY (host, metric, timestamp) DISTRIBUTE BY HASH(timestamp) INTO 100 BUCKETS • bucketNumber = hashCode(row['timestamp']) % 100 • Each tablet has N replicas (3 or 5), with Raft consensus • Automatic fault tolerance • MTTR: ~5 seconds • Tablet servers host tablets on local disk drives
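The bucket formula on the slide above can be sketched in a few lines of Python. This is only an illustration: the slide does not say which hash function Kudu uses, so `zlib.crc32` is an assumed stand-in for a stable hash, and `bucket_for` and the sample rows are hypothetical.

```python
import zlib

NUM_BUCKETS = 100  # from "INTO 100 BUCKETS" on the slide


def bucket_for(timestamp: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a row's timestamp to a hash bucket (bucketNumber = hash % N)."""
    # CRC32 is an assumption used for a deterministic hash; it is not
    # claimed to be the hash function Kudu itself uses.
    digest = zlib.crc32(str(timestamp).encode("utf-8"))
    return digest % num_buckets


# Hypothetical rows shaped like the PRIMARY KEY (host, metric, timestamp).
rows = [
    ("host1", "cpu", 1442865158),
    ("host1", "cpu", 1442865156),
    ("host2", "mem", 1442828307),
]

# Group rows by bucket; each bucket corresponds to one tablet.
buckets = {}
for host, metric, ts in rows:
    buckets.setdefault(bucket_for(ts), []).append((host, metric, ts))
```

Hashing on the timestamp spreads rows with nearby timestamps across tablets instead of sending all recent writes to a single one.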
  • 18. Metadata and the Master • Replicated master • Acts as a tablet directory • Acts as a catalog (which tables exist, etc) • Acts as a load balancer (tracks TS liveness, re-replicates under-replicated tablets) • Not a bottleneck • super fast in-memory lookups
  • 19. Client -> Master: “Hey Master! Where is the row for ‘tlipcon’ in table “T”?” • Master -> Client: “It’s part of tablet 2, which is on servers {Z,Y,X}. BTW, here’s info on other tablets you might care about: T1, T2, T3, …” • Client Meta Cache: T1: … T2: … T3: … • Client then issues: UPDATE tlipcon SET col=foo
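A minimal sketch of the lookup exchange above, assuming a range-partitioned table whose tablets are identified by their start keys. The `Master` and `Client` classes, the start keys, and the server names for T1 and T3 are hypothetical and only mirror the dialogue on the slide; they are not the Kudu client API.

```python
import bisect


class Master:
    """Hypothetical stand-in for the master's tablet directory."""

    def __init__(self, tablets):
        # tablets: list of (start_key, tablet_id, servers), sorted by start_key
        self.tablets = tablets
        self.lookups = 0

    def locate(self, table, key):
        # Like the slide, the reply also carries info on the other tablets.
        self.lookups += 1
        return list(self.tablets)


class Client:
    def __init__(self, master):
        self.master = master
        self.meta_cache = {}  # table name -> cached tablet locations

    def find_tablet(self, table, key):
        # Ask the master only on a cache miss.
        if table not in self.meta_cache:
            self.meta_cache[table] = self.master.locate(table, key)
        entries = self.meta_cache[table]
        starts = [entry[0] for entry in entries]
        # The owning tablet is the last one whose start key is <= key.
        idx = bisect.bisect_right(starts, key) - 1
        _, tablet_id, servers = entries[idx]
        return tablet_id, servers


master = Master([
    ("", "T1", ["A", "B", "C"]),
    ("m", "T2", ["Z", "Y", "X"]),
    ("u", "T3", ["D", "E", "F"]),
])
client = Client(master)
first = client.find_tablet("T", "tlipcon")
second = client.find_tablet("T", "alice")  # served from the meta cache
```

Because later lookups are answered from the cache, the master stays off the hot path, which fits the "not a bottleneck" point on the previous slide.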
  • 20. Raft Consensus • Participants: Client; TS A with Tablet 1 (LEADER); TS B and TS C with Tablet 1 (FOLLOWER); each replica has its own WAL • 1a. Client->Leader: Write() RPC • 2a. Leader->Followers: UpdateConsensus() RPC • 2b. Leader writes local WAL • 3. Follower: write WAL • 4. Follower->Leader: success • 5. Leader has achieved majority • 6. Leader->Client: Success!
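The numbered steps above boil down to "acknowledge the client once a majority of replicas have the operation in their WAL". The sketch below shows only that majority rule, under heavy simplification: it is synchronous and leaves out terms, leader election, and log matching, so it is not a Raft implementation. `Replica` and `replicate_write` are hypothetical names.

```python
class Replica:
    """Hypothetical tablet replica with an in-memory write-ahead log."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.wal = []

    def append(self, op):
        """Append op to the local WAL; return True on success."""
        if not self.healthy:
            return False
        self.wal.append(op)
        return True


def replicate_write(leader, followers, op):
    """Return True if the write reached a majority of replicas."""
    acks = 1 if leader.append(op) else 0          # step 2b: leader WAL
    for follower in followers:                    # steps 2a, 3, 4
        if follower.append(op):
            acks += 1
    majority = (1 + len(followers)) // 2 + 1      # 2 of 3, 3 of 5
    return acks >= majority                       # steps 5 and 6


leader = Replica("TS A")
followers = [Replica("TS B"), Replica("TS C", healthy=False)]
ok = replicate_write(leader, followers, "UPDATE tlipcon SET col=foo")
```

With three replicas the write succeeds even though one follower is down, which is why a single failed tablet server does not block writes.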
  • 21. How it Works: Columnar storage
  • 22. Columnar Storage • Tweet_id: {25059873, 22309487, 23059861, 23010982} • User_name: {newsycbot, RideImpala, fastly, llvmorg} • Created_at: {1442865158, 1442828307, 1442865156, 1442865155} • text: {Visual exp…, Introducing .., Missing July…, LLVM 3.7….}
  • 23. Columnar Storage • Tweet_id (1GB): {25059873, 22309487, 23059861, 23010982} • User_name (2GB): {newsycbot, RideImpala, fastly, llvmorg} • Created_at (1GB): {1442865158, 1442828307, 1442865156, 1442865155} • text (200GB): {Visual exp…, Introducing .., Missing July…, LLVM 3.7….} • SELECT COUNT(*) FROM tweets WHERE user_name = 'newsycbot'; • Only read 1 column
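To make the "only read 1 column" point concrete, here is a small sketch of a column-oriented table that records which columns a query touches. The `ColumnStore` class is hypothetical; the values are the ones shown on the slide.

```python
class ColumnStore:
    """Hypothetical column-oriented table: one list per column."""

    def __init__(self, columns):
        self.columns = columns
        self.columns_read = []  # which columns a query actually touched

    def read_column(self, name):
        self.columns_read.append(name)
        return self.columns[name]

    def count_where_equals(self, column, value):
        # Equivalent to SELECT COUNT(*) ... WHERE column = value:
        # only the predicate column is scanned.
        return sum(1 for v in self.read_column(column) if v == value)


tweets = ColumnStore({
    "tweet_id": [25059873, 22309487, 23059861, 23010982],
    "user_name": ["newsycbot", "RideImpala", "fastly", "llvmorg"],
    "created_at": [1442865158, 1442828307, 1442865156, 1442865155],
    "text": ["Visual exp…", "Introducing ..", "Missing July…", "LLVM 3.7…."],
})
matches = tweets.count_where_equals("user_name", "newsycbot")
```

In a row-oriented layout the same query would have to read past the large `text` values as well; here they are never touched.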
  • 24. Columnar Compression • Created_at: {1442865158, 1442828307, 1442865156, 1442865155}
      Created_at (64 bits each) | Diff(created_at) (17 bits each)
      1442865158 | n/a
      1442828307 | -36851
      1442865156 | 36849
      1442865155 | -1
    • Many columns can compress to a few bits per row! • Especially: • Timestamps • Time series values • Low-cardinality strings • Massive space savings and throughput increase!
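The diff column above can be reproduced with a short delta-encoding sketch. It assumes a simple fixed-width signed encoding (magnitude bits plus one sign bit), which is one way to arrive at the 17 bits quoted on the slide; the slide does not specify Kudu's actual on-disk encoding, and the function names are hypothetical.

```python
def delta_encode(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


def bits_per_delta(deltas):
    """Width of a fixed-size signed field that can hold every delta."""
    # Bits for the largest magnitude, plus one bit for the sign.
    return max(abs(d) for d in deltas).bit_length() + 1


def delta_decode(first, deltas):
    """Rebuild the original values from the first value and the deltas."""
    values = [first]
    for d in deltas:
        values.append(values[-1] + d)
    return values


created_at = [1442865158, 1442828307, 1442865156, 1442865155]
deltas = delta_encode(created_at)   # [-36851, 36849, -1]
width = bits_per_delta(deltas)      # 17
```

The largest magnitude, 36851, needs 16 bits, so each delta fits in 17 bits with its sign, compared with 64 bits for a raw timestamp.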
  • 25. Handling Inserts and Updates • Inserts go to an in-memory row store (MemRowSet) • Durable due to write-ahead logging • Later flush to columnar format on disk • Updates go to in-memory “delta store” • Later flush to “delta files” on disk • Eventually “compact” into the previously-written columnar data files • Details elided here due to time constraints • available in other slide decks online, or come to office hours to learn more!
  • 26. Integrations
  • 27. Spark DataSource Integration (WIP) • Available in Kudu 0.7.0, but still being improved
      sqlContext.load("org.kududb.spark", Map("kudu.table" -> "foo", "kudu.master" -> "master.example.com")).registerTempTable("mytable")
      val df = sqlContext.sql("select col_a, col_b from mytable " + "where col_c = 123")
  • 28. Impala Integration • CREATE TABLE … DISTRIBUTE BY HASH(col1) INTO 16 BUCKETS AS SELECT … FROM … • INSERT/UPDATE/DELETE • Optimizations like predicate pushdown, scan parallelism, more on the way • Not an Impala user? Community working on other integrations (Hive, Drill, Presto, Phoenix)
  • 29. MapReduce Integration • Multi-framework cluster (MR + HDFS + Kudu on the same disks) • KuduTableInputFormat / KuduTableOutputFormat • Support for pushing predicates, column projections, etc
  • 30. Performance
  • 31. TPC-H (Analytics benchmark) • 75 server cluster • 12 (spinning) disks each, enough RAM to fit dataset • TPC-H Scale Factor 100 (100GB) • Example query: SELECT n_name, sum(l_extendedprice * (1 - l_discount)) as revenue FROM customer, orders, lineitem, supplier, nation, region WHERE c_custkey = o_custkey AND l_orderkey = o_orderkey AND l_suppkey = s_suppkey AND c_nationkey = s_nationkey AND s_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA' AND o_orderdate >= date '1994-01-01' AND o_orderdate < '1995-01-01' GROUP BY n_name ORDER BY revenue desc;
  • 32. Kudu outperforms Parquet by 31% (geometric mean) for RAM-resident data
  • 33. Versus other NoSQL Storage • Phoenix: SQL layer on HBase • 10 node cluster (9 worker, 1 master) • TPC-H LINEITEM table only (6B rows) • [Bar chart: Time (sec), log scale, for Load, TPCH Q1, COUNT(*), COUNT(*) WHERE…, single-row lookup. Data labels in legend order: Phoenix 2152, 219, 76, 131, 0; Kudu 1918, 13.2, 1.7, 0.7, 0.2; Parquet 155, 9.3, 1.4, 1.5, 1.4]
  • 34. What about NoSQL-style Random Access? (YCSB) • YCSB 0.5.0-snapshot • 10 node cluster (9 worker, 1 master) • 100M row data set • 10M operations each workload
  • 35. Getting started
  • 36. Project Status • Open source beta released in September • Latest release 0.7.1 hot off the presses • Usable for many applications • Have not experienced unrecoverable data loss, reasonably stable (almost no crashes reported). Users testing up to 200 nodes so far. • Still requires some expert assistance, and you’ll probably find some bugs • Part of the Apache Software Foundation Incubator • Community-driven open source process
  • 37. Apache Kudu Community
  • 38. Getting Started As a User • http://getkudu.io • user@kudu.incubator.apache.org • http://getkudu-slack.herokuapp.com/ • Quickstart VM • Easiest way to get started • Impala and Kudu in an easy-to-install VM • CSD and Parcels • For installation on a Cloudera Manager-managed cluster
  • 39. Getting Started As a Developer • http://github.com/apache/incubator-kudu • Code reviews: http://gerrit.cloudera.org • Public JIRA: http://issues.apache.org/jira/browse/KUDU • Includes bugs going back to 2013. Come see our dirty laundry! • Mailing list: dev@kudu.incubator.apache.org • Apache 2.0 license open source • Contributions are welcome and encouraged!
  • 40. http://getkudu.io/ @getkudu
  • 41. Backup slides
  • 42. How it works: Write and read paths
  • 43. Kudu storage – Inserts and Flushes • INSERT("todd", "$1000", "engineer") -> MemRowSet • flush -> DiskRowSet 1 (name, pay, role)
  • 44. Kudu storage – Inserts and Flushes • INSERT("doug", "$1B", "Hadoop man") -> MemRowSet • flush -> DiskRowSet 2 (name, pay, role), alongside DiskRowSet 1 (name, pay, role)
  • 45. Kudu storage - Updates • MemRowSet • DiskRowSet 1: base data (name, pay, role) + Delta MS • DiskRowSet 2: base data (name, pay, role) + Delta MS • Each DiskRowSet has its own DeltaMemStore to accumulate updates
  • 46. Kudu storage - Updates • UPDATE set pay="$1M" WHERE name="todd" • Is the row in DiskRowSet 2? (check bloom filters) Bloom says: no! • Is the row in DiskRowSet 1? (check bloom filters) Bloom says: maybe! • Search key column to find offset: rowid = 150 • Delta MS of DiskRowSet 1 records 150: pay=$1M @ time T
  • 47. Kudu storage – Read path • Read rows in DiskRowSet 2 • Then, read rows in DiskRowSet 1 (its Delta MS holds 150: pay=$1M @ time T) • Updates are applied based on ordinal offset within DRS: array indexing = fast
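The update and read paths on the two slides above can be sketched as follows. This is a toy model under stated assumptions: the `BloomFilter` and `DiskRowSet` classes are hypothetical, the bloom filter uses SHA-256 purely for convenience, and the "@ time T" timestamps on deltas are left out.

```python
import bisect
import hashlib


class BloomFilter:
    """Tiny bloom filter; the sizes and hashing scheme are arbitrary choices."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def may_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class DiskRowSet:
    """Immutable columnar base data plus an in-memory delta store."""

    def __init__(self, rows):
        rows = sorted(rows, key=lambda r: r["name"])  # primary key order
        self.keys = [r["name"] for r in rows]
        self.columns = {c: [r[c] for r in rows] for c in ("name", "pay", "role")}
        self.bloom = BloomFilter()
        for key in self.keys:
            self.bloom.add(key)
        self.delta_ms = {}  # row offset -> {column: new value}

    def update(self, key, column, value):
        if not self.bloom.may_contain(key):        # "Bloom says: no!"
            return False
        offset = bisect.bisect_left(self.keys, key)  # search key column
        if offset == len(self.keys) or self.keys[offset] != key:
            return False                           # bloom false positive
        self.delta_ms.setdefault(offset, {})[column] = value
        return True

    def scan(self):
        # Read base data, then apply deltas by row offset (array indexing).
        for offset in range(len(self.keys)):
            row = {c: vals[offset] for c, vals in self.columns.items()}
            row.update(self.delta_ms.get(offset, {}))
            yield row


drs1 = DiskRowSet([{"name": "todd", "pay": "$1000", "role": "engineer"}])
drs2 = DiskRowSet([{"name": "doug", "pay": "$1B", "role": "Hadoop man"}])
applied = [drs.update("todd", "pay", "$1M") for drs in (drs2, drs1)]
rows = list(drs1.scan())
```

The base columns are never modified in place; the update lives only in the delta store until a later flush or compaction, which matches the delta-flush and compaction slides that follow.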
  • 48. Kudu storage – Delta flushes • Delta MS entry 150: pay=$1M @ time T • Flush -> REDO DeltaFile • A REDO delta indicates how to transform between the ‘base data’ (columnar) and a later version
  • 49. Kudu storage – Major delta compaction • DiskRowSet (pre-compaction): base data (name, pay, role), Delta MS, several REDO DeltaFiles • Many deltas accumulate: lots of delta application work on reads • Merge updates for columns with high update percentage • If a column has few updates, it doesn’t need to be rewritten: those deltas are maintained in a new DeltaFile • DiskRowSet (post-compaction): base data (name, pay, role), UNDO deltas, unmerged REDO deltas, Delta MS
  • 50. Kudu storage – RowSet Compactions • Before: DRS 1 (32MB) [PK=alice], [PK=joe], [PK=linda], [PK=zach]; DRS 2 (32MB) [PK=bob], [PK=jon], [PK=mary], [PK=zeke]; DRS 3 (32MB) [PK=carl], [PK=julie], [PK=omar], [PK=zoe] • After: DRS 4 (32MB) [alice, bob, carl, joe]; DRS 5 (32MB) [jon, julie, linda, mary]; DRS 6 (32MB) [omar, zach, zeke, zoe] • Reorganize rows to avoid rowsets with overlapping key ranges
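The reorganization on the last slide is, at its core, a merge of sorted runs followed by re-splitting. A minimal sketch, assuming each rowset is just a sorted list of primary keys and that output rowsets are cut by row count rather than by the 32MB size on the slide; `compact` is a hypothetical name.

```python
import heapq


def compact(rowsets, rows_per_rowset):
    """Merge sorted rowsets and re-split them into non-overlapping ranges."""
    merged = list(heapq.merge(*rowsets))  # k-way merge of sorted inputs
    return [
        merged[i:i + rows_per_rowset]
        for i in range(0, len(merged), rows_per_rowset)
    ]


before = [
    ["alice", "joe", "linda", "zach"],   # DRS 1
    ["bob", "jon", "mary", "zeke"],      # DRS 2
    ["carl", "julie", "omar", "zoe"],    # DRS 3
]
after = compact(before, rows_per_rowset=4)  # DRS 4, DRS 5, DRS 6
```

Before compaction, every input rowset spans roughly the whole key space, so a lookup may have to consult all three. Afterwards each key falls into the range of exactly one rowset.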