
Velocity 2019 - Kafka Operations Deep Dive

In which disk-related failure scenarios of Apache Kafka are discussed in unprecedented level of detail



  1. Monitor Disk Space and other ways to keep Kafka happy. Gwen Shapira, @gwenshap, Software Engineer
  2. Me • Software engineer @ Confluent • Committer on Apache Kafka • Co-author of "Kafka: The Definitive Guide" • Tweets a lot: @gwenshap • Learning to devops
  3. In which disk-related failure scenarios are discussed in unprecedented level of detail
  4. Apache Kafka in 3 slides
  5. (Diagram: Producers and Consumers talk to a Kafka Cluster; Stream Processing Apps and Connectors plug into it)
  6. Partitions • Kafka organizes messages into topics • Each topic has a set of partitions • Each partition is a replicated log of messages, referenced by sequential offset (Diagram: three partitions with offsets 0, 1, 2, ...)
  7. Replication • Each partition is replicated 3 times • Each replica lives on a separate broker • The leader handles all reads and writes • Followers replicate events from the leader (Diagram: a producer writes to the leader replica; two follower replicas copy from it)
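The partition and replication model on the two slides above can be sketched in a few lines. This is a toy model, not Kafka's actual implementation; all class and method names here are invented for illustration:

```python
# Toy model of a Kafka partition: a replicated, append-only log.
# The leader takes all writes; followers copy from the leader.
class Replica:
    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []  # messages, indexed by sequential offset

    def append(self, message):
        self.log.append(message)
        return len(self.log) - 1  # offset of the new message

    def fetch_from(self, leader):
        # Copy any messages we have not seen yet from the leader.
        self.log.extend(leader.log[len(self.log):])

# One partition, replicated 3 times: one leader, two followers.
leader = Replica(broker_id=100)
followers = [Replica(101), Replica(102)]

for msg in ["a", "b", "c"]:
    leader.append(msg)       # producer writes go to the leader
for f in followers:
    f.fetch_from(leader)     # followers replicate from the leader

print([f.log for f in followers])   # both copies match the leader's log
```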
  8. The way failures SHOULD go
  9. (Diagram: Brokers 100, 101, 102; the controller runs on Broker 102. Broker 100 leads partition 1, Broker 101 leads partition 2, and each partition has replicas on all three brokers. Zookeeper: /brokers/100, 101, 102)
  10. (Broker 100 fails. Zookeeper: /brokers/101, 102)
  11. Controller: "Oh no. Broker 100 is missing."
  12. Controller: "Broker 102: you now lead partition 1. Broker 101: you now follow broker 102 for partition 1."
  13. (Broker 102 now leads partition 1; Broker 100 is still down)
  14. (Broker 100 comes back. Zookeeper: /brokers/100, 101, 102)
  15. Controller: "Broker 100 is back! Broker 100: note the new leaders, 101 and 102."
  16. Broker 100: "What did I miss?"
  17. Broker 100: "What did I miss?" Leaders: "Lots of events!"
  18. Broker 100: "Thanks guys, I caught up!"
  19. Controller: "Broker 100, you are preferred leader for partition 1. Broker 101, follow broker 100 for partition 1. Broker 102, follow broker 100 for partition 1."
  20. (Broker 100 leads partition 1 again; the cluster is back to its original state)
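The failover dance in slides 9-20 can be compressed into a toy controller. This is illustrative only; the names are invented, and real leader election also involves in-sync replica sets, controller epochs, and Zookeeper watches:

```python
# Toy controller: when a broker drops out of the registry (Zookeeper),
# move leadership of its partitions to a surviving replica.
live_brokers = {100, 101, 102}           # stands in for /brokers znodes
leaders = {"partition-1": 100, "partition-2": 101}
replicas = {"partition-1": [100, 101, 102], "partition-2": [100, 101, 102]}

def on_broker_failure(dead):
    live_brokers.discard(dead)
    for partition, leader in leaders.items():
        if leader == dead:
            # Pick the first surviving replica as the new leader.
            survivor = next(b for b in replicas[partition] if b in live_brokers)
            leaders[partition] = survivor

on_broker_failure(100)
print(leaders)   # partition-1 has moved off the dead broker 100
```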
  21. What could possibly go wrong?
  22. When Kafka runs out of disk space
  23. Best case scenario: broker ran out of disk space and crashed.
  24. Solution: 1. Get bigger disks 2. Store less data
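"Get bigger disks" and "store less data" are two knobs on the same equation. A rough back-of-the-envelope for how much disk a topic needs (an assumed formula, ignoring compression, index files, and segment-roll slack):

```python
def disk_needed_gb(write_mb_per_sec, retention_hours, replication_factor=3):
    """Approximate disk used across the cluster by one topic."""
    mb = write_mb_per_sec * retention_hours * 3600 * replication_factor
    return mb / 1024

# 5 MB/s retained for 7 days, replicated 3x:
print(round(disk_needed_gb(5, 7 * 24), 1))   # about 8859.4 GB
```

The "store less data" side is controlled per topic by the real Kafka configs retention.ms and retention.bytes.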
  25. What not to do. Ever: cat /dev/null > /data/log/my_topic-15/00000000000001548736.log while Kafka is up and running.
  26. When you are in a hole, stop digging. Don't know where the holes are? Walk slowly.
  27. General Tips for Stable Kafka • Over-provision • Upgrade to latest bug-fixes • Don't mess with stuff you don't understand • Call support when you have to
  28. Bad scenario: https://issues.apache.org/jira/browse/KAFKA-7151
  29. (Diagram: partition 1 with the leader on Broker 100 and followers on Brokers 101 and 102; every replica is at latest offset 1000)
  30. (The leader accepts writes up to offset 1010) Followers: "What did I miss? Anything after 1000?"
  31. Leader: "Here is 1001 to 1010."
  32. (Followers catch up to 1010; the leader's disk still shows 1000)
  33. (The leader accepts more writes, up to 1020; its disk is still at 1000)
  34. Followers: "What did I miss? Anything after 1010?"
  35. Leader: "Too busy trying to access disk."
  36. Leader: "IO ERROR"
  37. (Broker 100 goes down) Brokers 101 and 102: "Too far behind to be leader."
  38. Downtime.
  39. Broker 100: "I'm back. As you know, I'm the leader. Based on my disk, latest event is 1000."
  40. Broker 100: "What did I miss?" Followers: "LOL. No. Latest is 1010. We can't follow you."
  41. Solution: enable unclean leader election. Lose messages 1010-1020.
  42. Solution: https://issues.apache.org/jira/browse/KAFKA-7151
  43. Solution
  44. Systems Hierarchy of Needs: CPU, Bandwidth, Disk, RAM
  45. Most common symptom: under-replicated partitions. You basically can't alert on that. We monitor the resources, act early, and add resources.
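Under-replicated means the in-sync replica set (ISR) is smaller than the assigned replica set. A minimal check over partition metadata; the metadata shape below is invented for illustration, while in practice you would read the broker's JMX metric kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions:

```python
def under_replicated(partitions):
    """partitions: dicts with 'name', 'replicas', and 'isr' broker-id lists."""
    return [p["name"] for p in partitions if len(p["isr"]) < len(p["replicas"])]

metadata = [
    {"name": "clicks-0", "replicas": [100, 101, 102], "isr": [100, 101, 102]},
    {"name": "clicks-1", "replicas": [100, 101, 102], "isr": [101, 102]},
]
print(under_replicated(metadata))   # ['clicks-1']
```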
  46. How to add CPU / bandwidth? Normally by adding brokers and rebalancing partitions.
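Rebalancing means moving partition replicas so the new brokers take a share of the load. A toy round-robin assignment generator, in the spirit of the JSON that the kafka-reassign-partitions tool consumes; the JSON layout mirrors the real tool's format, but treat the whole thing as a sketch:

```python
import json

def assign(topic, num_partitions, brokers, rf=3):
    """Round-robin replica assignment across the (expanded) broker list."""
    plan = {"version": 1, "partitions": []}
    for p in range(num_partitions):
        replicas = [brokers[(p + i) % len(brokers)] for i in range(rf)]
        plan["partitions"].append(
            {"topic": topic, "partition": p, "replicas": replicas})
    return plan

# After adding broker 103, spread 4 partitions over all 4 brokers:
print(json.dumps(assign("clicks", 4, [100, 101, 102, 103]), indent=2))
```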
  47. When good EBS volumes go bad
  48. (Diagram: Broker 100 leads partition 1, with replicas on Brokers 101 and 102. Broker 100 is hanging, not talking to anyone. Zookeeper: /brokers/100, 101, 102)
  49. What will happen? Let's zoom in.
  50. (Diagram, zoomed into a broker: network threads also read from disk; request threads write to disk; replica fetchers read from the leader and write to disk; only the Zookeeper client involves no disks. Broker 100 is hanging, not talking to anyone. Zookeeper: /brokers/100, 101, 102)
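Because nearly every thread pool in the broker touches disk, a hung volume can be detected from outside with a write probe that must finish within a deadline. A sketch; the timeout and probe details are made up:

```python
import os
import tempfile
import threading

def disk_responsive(path, timeout_sec=5.0):
    """Return False if a small write + fsync doesn't finish in time."""
    done = threading.Event()

    def probe():
        with tempfile.NamedTemporaryFile(dir=path, delete=True) as f:
            f.write(b"canary")
            f.flush()
            os.fsync(f.fileno())   # force the write through to the device
        done.set()

    # Run the probe in a daemon thread so a hung disk can't hang the checker.
    threading.Thread(target=probe, daemon=True).start()
    return done.wait(timeout_sec)  # True iff the write completed in time

print(disk_responsive(tempfile.gettempdir()))
```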
  51. (The Zookeeper client is the only part of Broker 100 that is still alive!)
  52. Controller: "Broker 100 is totally alive! No need to elect leaders!"
  53. Downtime.
  54. Solution: stop the broker ASAP. Open a ticket to replace the disk.
  55. How to detect this? • Broker is up • Logs look fine • Request handler idle% is 0 • Network handler idle% is 0 • Clients time out
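Those signals can be combined into one health predicate. The thresholds and function shape are invented for illustration; the real broker metrics behind the two idle percentages are RequestHandlerAvgIdlePercent (kafka.server) and NetworkProcessorAvgIdlePercent (kafka.network):

```python
def broker_wedged(process_up, request_idle_pct, network_idle_pct, client_timeouts):
    """The nasty case: process alive, thread pools pegged, clients failing."""
    return (process_up
            and request_idle_pct == 0
            and network_idle_pct == 0
            and client_timeouts > 0)

print(broker_wedged(True, 0, 0, client_timeouts=12))   # True: stop this broker
print(broker_wedged(True, 40, 35, client_timeouts=0))  # False: healthy
```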
  57. Canary • Lead a partition on every broker • Produce and consume • Every 10 seconds • Yell after 3 consecutive misses (Diagram: a canary partition led by each broker)
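A minimal canary loop matching the slide's rules: one produce-and-consume probe per round, alert after three consecutive misses. Everything here, including probe(), is hypothetical:

```python
def run_canary(probe, rounds, max_misses=3):
    """probe() produces + consumes one message; returns True on success."""
    misses = 0
    for _ in range(rounds):          # in production: one round every 10 s
        if probe():
            misses = 0               # any success resets the counter
        else:
            misses += 1
            if misses >= max_misses:
                return "ALERT"
    return "OK"

results = iter([True, False, False, False])          # fake probe outcomes
print(run_canary(lambda: next(results), rounds=4))   # ALERT
```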
  58. You can reuse the canary for simple failure-injection testing.
  59. Summary
  60. You don't really know how your software will behave until it has been in production for quite a while.
  61. More Key Points • Keep an eye on your key resources • Tread carefully in unknown territory • Sometimes a crashed broker is GOOD • Monitor user scenarios, especially for SLAs
