At Dropbox we currently handle approximately 10,000,000 messages per second at peak across our handful of Kafka clusters; the largest of them has hit throughputs of 7,000,000 messages per second (~30 Gbps) on only 20 nodes. We’ll walk you through the steps we took to get where we are, the designs that worked for us, and those that didn’t. We’ll talk about the tooling we had to build and what we want to see exist.
We’ll dive deeper into configuration and provide a blueprint you can follow. We’ll talk about the trials and tribulations of using Kafka: the ways we’ve set our clusters on fire, lost data, turned our hair gray, and heroically saved the day for our users. Finally, we’ll spend time on some of the work we’re doing to handle consumer coordination across our many different systems and to integrate Kafka into a well-established corporate infrastructure (i.e., making Kafka “play nice” with everybody).
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
1. Deploying Kafka at Dropbox
Alternatively: how to handle 10,000,000 QPS in one cluster (but don't)
2. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
3. Your Speakers
• Mark Smith <zorkian@dropbox.com>
formerly of Google, Bump, StumbleUpon, etc.
likes small airplanes and not getting paged
• Sean Fellows <fellows@dropbox.com>
formerly of Google
likes corgis and distributed systems
4. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
5. Dropbox
• Over 500 million signups
• Exabyte scale storage system
• Multiple hardware locations + AWS
6. Log Events
• Wide distribution (1,000 categories)
• Several do >1M QPS each + long tail
• About 200TB/day (raw; back-of-envelope math below)
• Payloads range from empty to 15MB JSON blobs
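A quick back-of-envelope check on those numbers, using the peak rate from the abstract (treating peak as sustained is a simplification, so the real average payload is at least this large):

    # Rough math from the numbers above: ~200 TB/day raw at ~10M messages/sec peak.
    SECONDS_PER_DAY = 86_400
    messages_per_sec = 10_000_000            # peak rate, across clusters
    raw_bytes_per_day = 200 * 10**12         # ~200 TB/day, raw

    avg_bytes = raw_bytes_per_day / (messages_per_sec * SECONDS_PER_DAY)
    print(f"~{avg_bytes:.0f} bytes/message on average")
    # => a couple hundred bytes on average, even though individual payloads hit 15 MB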
7. Current System
• Existing system based on Scribe + HDFS
• Aggregate to single destination for analytics
• Powers Hive and standard map-reduce type analytics
Want: real-time stream processing! (consumer sketch below)
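What we wanted looks roughly like this: tail a topic as events arrive instead of waiting for the next batch job. A minimal sketch with the open-source kafka-python client; the topic, group, and broker names are made up.

    from kafka import KafkaConsumer

    def handle(payload):
        # stand-in for real stream-processing logic
        print(len(payload))

    # Tail events as they arrive rather than waiting for the batch pipeline.
    consumer = KafkaConsumer(
        "analytics_events",                       # illustrative topic name
        group_id="realtime-demo",
        bootstrap_servers=["broker1:9092", "broker2:9092"],
        auto_offset_reset="latest",
    )

    for record in consumer:
        handle(record.value)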
8. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
9. Initial Design
• One big cluster
• 20 brokers: 96GB RAM, 16x2TB disk, JBOD config
• ZK ensemble run separately (5 members)
• Kafka 0.8.2 from GitHub
• LinkedIn configuration recommendations (illustrative settings below)
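Sketched as the per-broker settings that layout implies (illustrative values built in Python; not our exact production configuration):

    # Illustrative per-broker settings for the initial layout: 20 brokers,
    # 16 x 2TB disks in JBOD, and a separate 5-member ZooKeeper ensemble.
    broker_id = 1    # unique per broker, 1..20

    initial_config = {
        "broker.id": broker_id,
        # one log directory per physical disk (JBOD)
        "log.dirs": ",".join(f"/data/disk{i}/kafka" for i in range(1, 17)),
        # the 5-member ZooKeeper ensemble, run separately from the brokers
        "zookeeper.connect": ",".join(f"zk{i}:2181" for i in range(1, 6)) + "/kafka",
    }

    # server.properties-style rendering
    print("\n".join(f"{k}={v}" for k, v in initial_config.items()))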
10. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
11. Unexpected Catastrophes
• Disks failing or reaching 100% full
• Repair is manual, won't expire unless caught up
• Crash looping, controller load
• Simultaneous restarts
• Even when graceful, recovery is sometimes very bad (even in 0.9!)
• Rebalancing is dangerous
• Saturates disks; partitions fall out of ISR, go offline, etc. (check sketch below)
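One way to watch the ISR fallout while a rebalance runs: poll the stock kafka-topics.sh tool (it ships with the broker) for under-replicated partitions. A sketch that assumes the tool is on PATH; the ZooKeeper connect string is illustrative.

    import subprocess

    def under_replicated(zookeeper="zk1:2181/kafka"):
        """List partitions whose ISR has shrunk below the full replica set."""
        out = subprocess.run(
            ["kafka-topics.sh", "--describe", "--under-replicated-partitions",
             "--zookeeper", zookeeper],
            capture_output=True, text=True, check=True,
        ).stdout
        return [line for line in out.splitlines() if line.strip()]

    if __name__ == "__main__":
        # A nonzero count during a rebalance is the warning sign described above.
        print(f"{len(under_replicated())} under-replicated partitions")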
12. System Errors
• Controller issues
• Sometimes goes AWOL with e.g. big rebalances
• Can have multiple controllers (during serial operations)
• Cascading OOMs
• Too many connections
13. Lack of Tooling
• Usually left to the reader
• Few best practices
• But we love Kafka Manager
• More to come later!
14. Newer Clients
• State of Go/Python clients
• Bad behavior at scale
• Laserbeam, retries, backoff (backoff sketch below)
• Too many connections == OOM
• Good clients take time
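The client behavior we ended up wanting, in sketch form: one shared producer per process so broker connection counts stay bounded, and retries with capped exponential backoff plus jitter instead of laserbeaming the cluster. This uses the open-source kafka-python client; the broker address, topic, and limits are illustrative.

    import random
    import time

    from kafka import KafkaProducer
    from kafka.errors import KafkaError

    # One producer per process. Every producer holds TCP connections to the
    # brokers it talks to, so "one per request" is how you end up with
    # too many connections == OOM on the broker side.
    producer = KafkaProducer(bootstrap_servers=["broker1:9092"], acks=1)

    def send_with_backoff(topic, payload, max_attempts=5):
        """Retry with capped exponential backoff + jitter instead of a tight loop."""
        for attempt in range(max_attempts):
            try:
                return producer.send(topic, payload).get(timeout=10)
            except KafkaError:
                if attempt == max_attempts - 1:
                    raise
                time.sleep(min(30, 2 ** attempt) * random.uniform(0.5, 1.5))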
15. Bad Configs
• Many, many tunables -- lots of rope
• Unclean leader election (settings sketch below)
• Preferred leader automation
• Disk threads (thanks Gwen!)
• Little modern documentation on running at scale
• Todd Palino helped us out early, though, so thank you!
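The tunables called out above, sketched as settings to check (rendered server.properties-style from Python). These reflect the usual at-scale advice and the direction these bullets point, not a drop-in copy of our production config.

    # The tunables above, with commonly recommended at-scale values.
    safer_settings = {
        # Don't let an out-of-sync replica take leadership: trades availability
        # for not silently dropping acknowledged messages.
        "unclean.leader.election.enable": "false",
        # Automatic preferred-leader election can pile leadership churn onto an
        # already-struggling cluster; many operators trigger it manually instead.
        "auto.leader.rebalance.enable": "false",
        # More log-recovery threads per data dir ("disk threads") makes restarts
        # on big JBOD/RAID brokers far less painful.
        "num.recovery.threads.per.data.dir": 8,
    }

    print("\n".join(f"{k}={v}" for k, v in safer_settings.items()))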
16. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
17. Hardware
• Hardware RAID 10
• ~25TB usable/box (spinning rust)
• During broker replacement
• 200ms p99 commit latency down to 10ms!
• Failure tolerance, full disk protection
• Canary cluster
18. Monitoring
• MPS vs QPS (metadata reqs!)
• Bad Stuff graph
• Disk utilization/latency
• Heap usage
• Number of controllers (polling sketch below)
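A sketch of the checks behind those graphs. The MBean names are standard Kafka JMX metrics; fetch_gauge is a stand-in for however your pipeline scrapes JMX (jmxtrans, Jolokia, a collector agent), not a real library call, and the hostnames are made up.

    BROKERS = [f"broker{i}" for i in range(1, 21)]   # illustrative hostnames

    # Standard Kafka JMX gauges behind the graphs above.
    MBEANS = {
        "messages_in_per_sec": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
        "under_replicated":    "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
        "offline_partitions":  "kafka.controller:type=KafkaController,name=OfflinePartitionsCount",
        "active_controller":   "kafka.controller:type=KafkaController,name=ActiveControllerCount",
    }

    def fetch_gauge(broker, mbean):
        """Stand-in: read one JMX gauge from one broker via your metrics pipeline."""
        raise NotImplementedError

    def bad_stuff():
        """Roll the scary gauges into one number, plus the controller-count check."""
        under_replicated = sum(fetch_gauge(b, MBEANS["under_replicated"]) for b in BROKERS)
        offline = sum(fetch_gauge(b, MBEANS["offline_partitions"]) for b in BROKERS)
        controllers = sum(fetch_gauge(b, MBEANS["active_controller"]) for b in BROKERS)
        # Exactly one broker should think it is the controller; 0 or >1 is an incident.
        return {"bad_stuff": under_replicated + offline, "controllers": controllers}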
22. Customer Culture
• Topics : organization :: partitions : scale
• Do not hash to partitions (producer sketch below)
• No ordering requirements
• Namespaces and ownership are required
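What "do not hash to partitions" means for producers: send without a key and let the client spread load, so partition count stays a pure scaling knob that can be raised later without breaking anyone. A sketch with kafka-python; the topic name is made up.

    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=["broker1:9092"])

    # No key: the client spreads messages across partitions, so partition count
    # is purely a throughput knob.
    producer.send("metrics.search.latency", b'{"ms": 42}')

    # Keyed sends pin data to one partition and create an implicit ordering
    # contract -- exactly the coupling the guidance above asks producers to avoid.
    # producer.send("metrics.search.latency", key=b"user123", value=b"...")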
23. Success!
• Kafka goes fast (18M+ MPS on 20 brokers)
• Multiple parallel consumption
• Low latency (at high produce rates)
• 0.9 is leaps ahead of 0.8.2 (upgrade!)
• Supportable by a small team (at our scale)
24. The Plan
• Welcome
• Use Case
• Initial Design
• Iterations of Woe
• Current Setup
• Future Plans
25. The Future
• Big is fun but has problems
• Open source our tooling
• Moving towards replication
• Automatic up-partitioning and rebalancing
• Expanding auditing to clients
• Low volume latencies
26. Deploying Kafka at Dropbox
• Mark Smith <zorkian@dropbox.com>
• Sean Fellows <fellows@dropbox.com>
We would love to talk with other people who are running Kafka at similar
scales. Email us!
And... questions! (If we have time.)