3. LI @ 2010: Point-to-Point Data Pipeline
[Figure: KV-Store, Doc-Store, RDBMS, Tracking, Logs / Metrics; Hadoop / DW, Monitoring, Rec. Engine, Social Graph, Searching, Security, …]
4. LI @ 2010: Point-to-Point Data Pipeline
What we want: a centralized data pipeline
[Figure: KV-Store, Doc-Store, RDBMS, Tracking, Logs / Metrics; Hadoop / DW, Monitoring, Rec. Engine, Social Graph, Searching, Security, …]
8. A Short History
• 2010.10: First commit of Kafka
• 2011.07: Enters Apache Incubator
• Release 0.7.0: compression, mirror-maker
9. Kafka 0.7 Message Format
[Figure: a log with messages at Offset = 0, Offset = M, Offset = M+N; message sizes M bytes and N bytes]
• Message offsets are physical (byte positions in the log)
11. Kafka 0.7 Message Format
[Figure: a log with messages at Offset = 0, Offset = M, Offset = M+N; message sizes M bytes and N bytes]
• Internal messages of a compressed message are not offset-addressable
• Consumer can only checkpoint offset M for this message
12. Drawbacks of Kafka 0.7
• Hard to checkpoint within a compressed message set
• At-least-once: messages could be consumed twice
• Hard to rewind consumption by a number of messages
• Similarly, tricky to monitor consumption lag
• Unsuitable for features like log compaction (discussed later)
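The physical-offset drawbacks above can be illustrated with a small sketch. It is a simplification, not the actual 0.7 on-disk format: message headers are ignored, `zlib` stands in for the codec, and the helper name `physical_offsets` is hypothetical.

```python
import zlib

def physical_offsets(messages):
    """Return the byte position at which each message starts in the log."""
    offsets, position = [], 0
    for payload in messages:
        offsets.append(position)
        position += len(payload)
    return offsets

# Three plain messages: offsets are byte positions, not message counts.
plain = [b"a" * 5, b"b" * 7, b"c" * 3]
plain_offsets = physical_offsets(plain)

# The same three messages compressed into a single wrapper message.
# Only the wrapper gets an offset; the three inner messages have none,
# so a consumer can checkpoint before or after the wrapper, never inside it.
wrapper = zlib.compress(b"".join(plain))
wrapped_offsets = physical_offsets([wrapper, b"next"])
```

Here `plain_offsets` is `[0, 5, 12]`, so "rewind by two messages" or "lag in messages" cannot be computed from offsets alone, and `wrapped_offsets` has one entry for the whole compressed set.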
13. BUT!
• Very simple, and therefore efficient (high-throughput)
• Just bytes-in-bytes-out for brokers
• Hence the bottleneck was predominantly network
• 1Gbps NICs are saturated most of the time
• CPU usage usually < 10%
• IOPS are low thanks to “zero-copy”
27. Shifting Bottleneck
• Predominantly network in 0.7
• Just bytes-in-bytes-out for brokers
• Still network in 0.8, but tilting towards CPU / Storage
• More CPU cost due to decompress / re-compress
• Data replication, consensus protocol:
• One message now copied X times for replication
28. A Short History
• 2010.10: First commit of Kafka
• 2011.07: Enters Apache Incubator
• Release 0.7.0: compression, mirror-maker
• 2012.10: Graduated to top-level project
• Release 0.8.0: intra-cluster replication
• 2014.11: Confluent founded
• Release 0.8.2: new producer
• Release 0.9.0: new consumer, quota, security
33. Quota
• Who to limit: client-id
• What to limit: Mbps
• Defined on per-broker basis (bytes-in for producer, bytes-out for consumer)
• Throttle on violation
// Default per-client quotas: bytes-out for consumers, bytes-in for producers.
quota.consumer.default=2M
quota.producer.default=2M
// Overrides
quota.producer.override="clientA:4M,clientB:10M"
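A minimal sketch of how "throttle on violation" could work, assuming the broker measures each client's byte rate over a time window and delays its response long enough to bring the average rate back to the quota. The function names, the window length, and the reading of "2M" as 2 MiB per second are assumptions for illustration, not the broker's actual code.

```python
MB = 1024 * 1024

# Values taken from the configuration on this slide.
DEFAULT_PRODUCER_QUOTA = 2 * MB
PRODUCER_OVERRIDES = {"clientA": 4 * MB, "clientB": 10 * MB}

def producer_quota(client_id):
    """Per-client override if present, otherwise the default quota."""
    return PRODUCER_OVERRIDES.get(client_id, DEFAULT_PRODUCER_QUOTA)

def throttle_delay_s(observed_bytes, window_s, quota_bytes_per_s):
    """Delay that brings the average rate over (window + delay) down to the quota."""
    rate = observed_bytes / window_s
    if rate <= quota_bytes_per_s:
        return 0.0
    return (rate - quota_bytes_per_s) / quota_bytes_per_s * window_s

# A client with the default quota sends 30 MiB in a 10 s window (3 MiB/s).
delay = throttle_delay_s(30 * MB, 10, producer_quota("clientX"))
```

With these numbers `delay` is 5 seconds: 30 MiB spread over 15 seconds is exactly 2 MiB/s. The same traffic from `clientA` (4 MiB/s override) would not be throttled.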
34. Security
• Authentication (SSL) and authorization
• Who (client-id) can do what (create, read, write, etc)
• Minor impact on throughput
• Forego “zero-copy” optimization
• CPU overhead to decrypt / encrypt
44. Example: Data Store Geo-Replication
[Figure: Region 1 and Region 2, each with User Apps, Local Stores, and Apache Kafka; arrows labeled write, read, append log, mirroring, apply log]
45. Shifting Bottleneck
• Predominantly network in 0.7
• Just bytes-in-bytes-out for brokers
• Storage and network in 0.8
• 1Gbps NICs
• Increasing CPU in 0.9
• New hardware since 2015: bigger disks, XFS, 10Gbps NICs..
• De(re)compress, de(en)crypt, compaction, coordination, etc..
46. A Short History
• 2010.10: First commit of Kafka
• 2011.07: Enters Apache Incubator
• Release 0.7.0: compression, mirror-maker
• 2012.10: Graduated to top-level project
• Release 0.8.0: intra-cluster replication
• 2014.11: Confluent founded
• Release 0.8.2: new producer
• Release 0.9.0: new consumer, quota, security
• Release 0.10.0: timestamps, rack awareness
47. Coarsened “Time” under Kafka 0.9
[Figure: partition messages in Segment-3, Segment-4, Segment-5]
• Message time is the mtime of the segment file, so there is one timestamp per segment
48. Kafka 0.10 Message Format
[Figure: messages at Offset = 0, Offset = 1, Offset = 2 with Timestamp = 25, Timestamp = 55, Timestamp = 60; message sizes M bytes and N bytes]
49. Finer “Time” in Kafka 0.10
[Figure: partition messages in Segment-3, Segment-4, Segment-5, each segment with time / offset entries]
• Timestamp per message (create time) or per message set (append time)
• Finer-grained time-based lookup (fetch offset request)
• More accurate time-based log rolling / log retention
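A time-based lookup over per-message timestamps can be sketched as a binary search. The index layout and the name `offset_for_time` are hypothetical, and the sketch assumes timestamps are non-decreasing (as with append time); the sample values are the ones shown on the 0.10 message format slide.

```python
from bisect import bisect_left

# Hypothetical index of (timestamp, offset) pairs, sorted by timestamp.
time_index = [(25, 0), (55, 1), (60, 2)]

def offset_for_time(index, target_ts):
    """Return the earliest offset whose timestamp is >= target_ts, or None."""
    timestamps = [ts for ts, _ in index]
    i = bisect_left(timestamps, target_ts)
    return index[i][1] if i < len(index) else None

first_after_30 = offset_for_time(time_index, 30)
```

Under 0.9 the only available answer would be the start of whichever segment file has a matching mtime; with per-message timestamps the lookup resolves to a single offset (here, offset 1 for time 30).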
52. A Short History
• 2016.04: First Kafka Summit @San Francisco
• Release 0.9.0: Kafka Connect
• Release 0.10.0: Kafka Streams
53. Kafka Streams (0.10+)
• New client library besides producer and consumer
• Powerful yet easy-to-use
• Event-at-a-time, Stateful
• Windowing with out-of-order handling
• Highly scalable, distributed, fault tolerant
• and more..
[BIRTE 2015]
64. A Short History
• 2016.04: First Kafka Summit @San Francisco
• Release 0.9.0: Kafka Connect
• Release 0.10.0: Kafka Streams
• 2017.08: Apache Kafka Goes 1.0
• Release 0.11.0: Exactly-Once
65. Stream Processing with Kafka
[Figure: Your App (Process, State); Ads Clicks, Ads Displays, Billing Updates, Fraud Suspects]
66. Stream Processing with Kafka
[Figure: Your App (Process, State); Ads Clicks, Ads Displays, Billing Updates, Fraud Suspects; arrows labeled ack, ack, commit]
67. Error Scenario #1: Duplicate Writes
[Figure: Streams App (Process, State); Kafka Topics A, B, C, D; arrow labeled ack]
68. Error Scenario #1: Duplicate Writes
[Figure: Streams App (Process, State); Kafka Topics A, B, C, D; arrow labeled ack]
• producer config: retries = N
69. Error Scenario #2: Re-process
[Figure: Streams App (Process, State); Kafka Topics A, B, C, D; arrows labeled commit, ack, ack]
70. Error Scenario #2: Re-process
[Figure: Streams App (Process, State); Kafka Topics A, B, C, D; State shown twice]
71. Exactly-Once
An application property for stream processing,
.. that for each received record,
.. its processing results will be reflected exactly once,
.. even under failures
77. Exactly-once, the Kafka Way! (0.11+)
• Building blocks to achieve exactly-once
• Idempotence: de-duped sends in order per partition
• Transactions: atomic multiple-sends across topic partitions
• Kafka Streams: enable exactly-once with a single knob
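The idempotence building block can be sketched as sequence-number de-duplication on the partition log. This is an illustration under simplifying assumptions, not the broker's implementation: the class `PartitionLog`, its return strings, and the one-record-per-send model are all hypothetical.

```python
class PartitionLog:
    """Toy partition log that de-duplicates sends per producer."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer_id -> last appended sequence number

    def append(self, producer_id, seq, value):
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            return "duplicate"      # a retry of something already appended
        if seq != last + 1:
            return "out_of_order"   # a gap: an earlier send is missing
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return "appended"

log = PartitionLog()
results = [
    log.append("p1", 0, "a"),
    log.append("p1", 1, "b"),
    log.append("p1", 1, "b"),  # retry after a lost ack (Error Scenario #1)
    log.append("p1", 3, "d"),  # sequence 2 never arrived
]
```

The retried send is acknowledged without being written again, so `log.records` ends up as `["a", "b"]` even though the producer sent "b" twice.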
79. Shifting Bottleneck, Continued..
• Storage and network in 0.7 / 0.8
• Increasing CPU in 0.9 - 0.11
• TLS, CRC, Quota, protocol down-conversion
• Txn coordination, etc..
• Operation efficiency on 0.11+
• Leader balancing, partition migration, cluster expansion..
• ZK-dependency, multi-DC support ..
80. A Short History
• 2016.04: First Kafka Summit @San Francisco
• Release 0.9.0: Kafka Connect
• Release 0.10.0: Kafka Streams
• 2017.08: Apache Kafka Goes 1.0
• Release 0.11.0: Exactly-Once
• Release 1.0+: More security and operability (controller re-design, Java 9 with TLS / CRC, etc)
• Release 1.0+: Better scalability (JBOD, etc)
• Release 1.0+: Online-evolvability (down-conversion optimization, etc)
81. Example: Controller Re-design
• One broker in a cluster acts as controller
• Monitor the liveness of brokers (via ZK)
• Select new leaders on broker failures
• Communicate new leaders to brokers
• Controller re-elected on failures (via ZK)
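The "select new leaders on broker failures" step can be sketched as follows, assuming the controller picks the first assigned replica that is both alive and in the in-sync replica set. The function name and data shapes are hypothetical; the real controller does considerably more (state machines, ZK updates, notifying brokers).

```python
def elect_leader(replicas, isr, live_brokers):
    """Pick the first assigned replica that is alive and in sync, else None."""
    for broker in replicas:
        if broker in isr and broker in live_brokers:
            return broker
    return None

# Partition assigned to brokers 1, 2, 3; broker 1 (the old leader) has failed.
new_leader = elect_leader(replicas=[1, 2, 3], isr={1, 2, 3}, live_brokers={2, 3})

# If the only in-sync replica is the failed broker, no safe leader exists.
no_leader = elect_leader(replicas=[1, 2, 3], isr={1}, live_brokers={2, 3})
```

In the first case broker 2 takes over; in the second the partition stays offline rather than electing an out-of-sync replica.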
95. Kafka: Evolvable System
• Zero down-time
• Maintenance outage? No such thing.
• All protocols versioned
• Brokers can talk to older-versioned clients
• And vice versa since 0.10.2!
• One should do no more than rolling bounces
• Staging before production
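Versioned protocols make rolling upgrades possible because each side can settle on a request version both understand. A minimal sketch, assuming each side advertises a (min, max) supported version range per request type; the function name and tuple layout are illustrative only.

```python
def usable_version(client_range, broker_range):
    """Highest version supported by both sides, or None if the ranges are disjoint."""
    lowest = max(client_range[0], broker_range[0])
    highest = min(client_range[1], broker_range[1])
    return highest if lowest <= highest else None

# Newer client, older broker: fall back to the broker's highest version.
newer_client = usable_version(client_range=(0, 5), broker_range=(0, 3))

# Older client, newer broker: the broker serves the client's highest version.
older_client = usable_version(client_range=(0, 2), broker_range=(0, 7))
```

During a rolling bounce, brokers on the old and new releases coexist, and each connection simply negotiates its own usable version.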
106. Multi-tenancy Services (in the Cloud)
• Security from ground up
• End-to-end encryption
• ACL / RBAC definitions
• Resources under control
• Quota on byte rate / request rate, CPU resources
• Capacity preservation / allocation to quotas
• Limit on number of connections, etc
112. What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
a distributed and replicated log.. [VLDB 2015]
113. Example: Data Store Geo-Replication
[Figure: Region 1 and Region 2, each with User Apps, Local Stores, and Apache Kafka; arrows labeled write, read, append log, mirroring, apply log]
116. What is Kafka, Really?
a scalable pub-sub messaging system.. [NetDB 2011]
a real-time data pipeline.. [Hadoop Summit 2013]
a distributed and replicated log.. [VLDB 2015]
a unified data integration stack.. [CIDR 2015]
All of them!
117. Kafka: Streaming Platform
• Publish / Subscribe
• Move data around as online streams
• Store
• “Source-of-truth” continuous data
• Process
• React / process data in real-time
118. Kafka adoption World-Wide
6 of the top 10 travel companies
8 of the top 10 insurance companies
7 of the top 10 global banks
9 of the top 10 telecom companies