Rebalance Protocol Inside-out: a Developer Perspective

Speaker: Boyang Chen, Infrastructure Engineer, Confluent

The rebalance protocol is the coordination algorithm that lets Kafka clients, including Consumer, Connect, and Streams, process data while scaling dynamically. If you have ever been in the shoes of a Kafka application developer, you have probably heard of this term and may even have been bitten by it a few times. In fact, it is one of the known performance killers for large member groups and state-heavy applications as of today.
In this talk, we will dive deep into this protocol, demo some troubleshooting experience, and introduce the two most recent improvements built on top of it: static membership and incremental rebalancing. After the talk, you will have a deeper understanding of the rebalance protocol, which should boost your Kafka development velocity right away!

https://www.meetup.com/KafkaBayArea/events/261932534/

Published in: Technology

Rebalance Protocol Inside-out: a Developer Perspective

  1. Rebalance Protocol Inside-out: a Developer Perspective (Boyang Chen)
  2. Agenda
     ● What is rebalancing?
     ● How to design a rebalancing protocol?
     ● Unnecessary rebalances
     ● Look into the future: static membership
     ● Debugging tip
  3-7. Boyang Chen’s Bio
     ● Software engineer (Kafka Streams)
     ● Software engineer (Ads infrastructure)
     ● Kafka Summit SF 2018: Building Pinterest Real-Time Ads Platform Using Kafka Streams
     ● Kafka Summit SF 2019: Static Membership: Rebalance Strategy Designed for the Cloud
  8-16. What is rebalancing?
     ● Group membership
     ● Resource assignment
     ● Example: Coordinator – Worker model
       1. New members join the group
       2. Perform assignment
       3. Propagate
       4. Done!
     [Diagram: the coordinator holds tasks T1-T6; after propagation one member owns T1 T2 T3 and the other owns T4 T5 T6]
  17-23. How to design a rebalance protocol?
     1. Membership changes:
        (a) Member joins/leaves the group
        (b) Member times out
        (c) Zombie member fencing
     2. Assignor changes
     3. Task changes
  24-31. Catch membership change
     1. Spin up a new member
     2. A new member joins
     3. Revoke active tasks
     4. Require members to rejoin
     5. Perform new assignment
     6. Propagate to members
     7. Done!
     [Diagram: tasks T1-T6 move from two members (T1 T2 T3 / T4 T5 T6) to three members (T1 T4 / T2 T5 / T3 T6)]
  32-39. Catch membership change
     1. Remove an active member
     2. Member sends leave group request
     3. Revoke other members’ active tasks
     4. Require members to rejoin
     5. Perform assignment
     6. Propagate to members
     7. Done!
     [Diagram: tasks T1-T6 move from three members (T1 T4 / T2 T5 / T3 T6) back to two members (T1 T2 T3 / T4 T5 T6)]
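The join and leave walk-throughs above share one pattern: any membership change revokes every task and triggers a fresh assignment. A minimal Python sketch of that pattern follows. The `Coordinator` class, its method names, and the round-robin strategy are illustrative assumptions, not Kafka's actual implementation.

```python
from typing import Dict, List


class Coordinator:
    """Toy coordinator (hypothetical): every membership change reassigns all tasks."""

    def __init__(self, tasks: List[str]) -> None:
        self.tasks = list(tasks)
        self.members: List[str] = []
        self.assignment: Dict[str, List[str]] = {}
        self.rebalance_count = 0

    def join(self, member_id: str) -> None:
        self.members.append(member_id)
        self._rebalance()

    def leave(self, member_id: str) -> None:
        self.members.remove(member_id)
        self._rebalance()

    def _rebalance(self) -> None:
        # Revoke everything, then hand tasks out round-robin (assumed strategy).
        self.assignment = {member: [] for member in self.members}
        for index, task in enumerate(self.tasks):
            if self.members:
                owner = self.members[index % len(self.members)]
                self.assignment[owner].append(task)
        self.rebalance_count += 1


coordinator = Coordinator(["T1", "T2", "T3", "T4", "T5", "T6"])
for member in ["m1", "m2", "m3"]:
    coordinator.join(member)
after_join = dict(coordinator.assignment)

coordinator.leave("m3")
after_leave = dict(coordinator.assignment)
```

Three joins and one leave cost four full revoke-and-reassign rounds in this sketch, which hints at why rebalances become expensive for large groups.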
  41-43. Timeout configs
     ● Liveness guarantee
       ○ session.timeout.ms
     ● Progress guarantee
       ○ max.poll.interval.ms
       ○ rebalance.timeout.ms
  44-51. Session timeout
     ● Background thread sends heartbeat
     1. Member crashes without sending leave group
     2. Session timeout is reached
     3. Require other members to revoke tasks/rejoin
     4. Perform assignment
     5. Propagate…
     6. Done!
     [Diagram: two members own T1 T2 T3 / T4 T5 T6; after the crash the surviving member takes over all of T1-T6]
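The session-timeout steps above can be sketched from the coordinator's side: remember when each member last sent a heartbeat and expire anyone silent for at least the session timeout. `SessionTracker` and the 10-second value are assumptions made for illustration only.

```python
from typing import Dict, List


class SessionTracker:
    """Hypothetical coordinator-side liveness check based on heartbeats."""

    def __init__(self, session_timeout_ms: int) -> None:
        self.session_timeout_ms = session_timeout_ms
        self.last_heartbeat_ms: Dict[str, int] = {}

    def heartbeat(self, member_id: str, now_ms: int) -> None:
        self.last_heartbeat_ms[member_id] = now_ms

    def expire(self, now_ms: int) -> List[str]:
        """Remove and return members whose session has timed out."""
        expired = sorted(
            member
            for member, seen in self.last_heartbeat_ms.items()
            if now_ms - seen >= self.session_timeout_ms
        )
        for member in expired:
            del self.last_heartbeat_ms[member]
        return expired


tracker = SessionTracker(session_timeout_ms=10_000)
tracker.heartbeat("m1", now_ms=0)
tracker.heartbeat("m2", now_ms=0)
tracker.heartbeat("m1", now_ms=9_000)  # m1 keeps heartbeating; m2 has crashed
expired_members = tracker.expire(now_ms=10_000)
```

Any non-empty result from `expire` would be the trigger for asking the remaining members to revoke their tasks and rejoin.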
  52-61. Max poll interval timeout
     ● Poll – Process – Commit
     ● One process takes too long
     ● Reach timeout limit (time since last poll >= max.poll.interval.ms)
     1. Member takes too long to process
     2. Background thread stops heartbeat and sends leave group
     3. Ask others to revoke tasks/rejoin
     4. Perform assignment
     5. Propagate to members, done!
     [Diagram: each member loops over Poll(); the member owning T4 stalls between polls and its task moves to the other member]
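The progress check described above happens on the client: the background thread compares the time since the last poll against `max.poll.interval.ms` and leaves the group once the limit is hit. The sketch below models only that decision. `PollWatchdog` and its return strings are hypothetical names, and the 300-second limit is just an example value.

```python
class PollWatchdog:
    """Tracks the time since the last poll, as a heartbeat thread might."""

    def __init__(self, max_poll_interval_ms: int) -> None:
        self.max_poll_interval_ms = max_poll_interval_ms
        self.last_poll_ms = 0
        self.in_group = True

    def record_poll(self, now_ms: int) -> None:
        self.last_poll_ms = now_ms

    def heartbeat_tick(self, now_ms: int) -> str:
        """Decide what the background thread does at this tick."""
        if not self.in_group:
            return "idle"
        if now_ms - self.last_poll_ms >= self.max_poll_interval_ms:
            # Processing took too long: stop heartbeating and leave the group.
            self.in_group = False
            return "leave_group"
        return "heartbeat"


watchdog = PollWatchdog(max_poll_interval_ms=300_000)
watchdog.record_poll(now_ms=0)
actions = [watchdog.heartbeat_tick(t) for t in (100_000, 300_000, 303_000)]
```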
  62-71. Rebalance timeout
     ● Max time for a member to rejoin during rebalance
     ● Use the max value of max.poll.interval.ms among all clients
       ○ Member has to finish ongoing work
     ● Track with given member id
     ● Register callback
     1. Group starts to rebalance
     2. Member m1 rejoins successfully
     3. Member m2 gets stuck
     4. Rebalance timeout is reached
     5. Perform assignment
     6. Propagate, done!
     [Diagram: the coordinator tracks members m1 and m2 and a join callback; after the timeout only m1 remains and owns T1-T6]
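A small sketch of the rebalance-timeout rule above: the deadline is the largest poll interval reported by any member, and members that have not rejoined by then are dropped. Both function names and the sample values are assumptions for illustration.

```python
from typing import Dict, List


def rebalance_timeout_ms(max_poll_interval_by_member: Dict[str, int]) -> int:
    """Use the largest max.poll.interval.ms among all members as the deadline."""
    return max(max_poll_interval_by_member.values())


def members_after_join_phase(
    known_members: List[str],
    rejoined_at_ms: Dict[str, int],
    timeout_ms: int,
) -> List[str]:
    """Keep members that rejoined before the deadline; drop the rest."""
    return [
        member
        for member in known_members
        if member in rejoined_at_ms and rejoined_at_ms[member] <= timeout_ms
    ]


timeout = rebalance_timeout_ms({"m1": 300_000, "m2": 600_000})
# m1 rejoins after 2 seconds; m2 is stuck and never rejoins.
survivors = members_after_join_phase(["m1", "m2"], {"m1": 2_000}, timeout)
```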
  73-95. Fencing zombie
     ● What if a zombie member rejoins?
     ● Bump generation number after each rebalance
     1. Member m1 rejoins group
     2. All members rejoin/revoke tasks
     3. Bump generation number
     4. Perform assignment
     5. Propagate, and done!
     6. Group currently stable at generation 2
     7. Rebalance triggers again
     8. Member m1 rejoins
     9. Member m2 has a transient failure
     10. Rebalance timeout is reached, kicking out m2
     11. Bump generation to 3
     12. Perform assignment
     13. Propagate, and done!
     14. Group stable at generation 3
     15. Member m2 rejoins in zombie mode
     16. Fenced by mismatched generation
     17. Reset local generation info
     18. Rejoin as unknown member without generation
     19. Registered as m3
     20. Group transitions to rebalance
     21. Bump generation to 4
     22. Perform assignment
     23. Propagate, and done!
     [Diagram: the coordinator and each member track the generation; m2 still holds generation 2 while the group is at generation 3, so it is rejected, then rejoins as m3 at generation 4]
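The fencing walk-through above boils down to one check: a request carrying an unknown member ID or a stale generation is rejected. The sketch below replays the scenario with a toy coordinator. `GroupCoordinator`, `FencedError`, and the `m<N>` ID scheme are hypothetical, and Kafka's real error codes are not modelled.

```python
from typing import List, Optional


class FencedError(Exception):
    """Raised when a member presents a stale generation or unknown ID."""


class GroupCoordinator:
    def __init__(self) -> None:
        self.generation = 0
        self.members: List[str] = []
        self._next_id = 1

    def rebalance(self, join_requests: List[Optional[str]]) -> List[str]:
        """Run one rebalance round; members that did not rejoin are dropped."""
        new_members: List[str] = []
        for requested_id in join_requests:
            if requested_id in self.members:
                new_members.append(requested_id)
            else:
                # Unknown member (no ID yet): register under a fresh ID.
                new_members.append(f"m{self._next_id}")
                self._next_id += 1
        self.members = new_members
        self.generation += 1  # bump generation after each rebalance
        return list(new_members)

    def validate(self, member_id: str, generation: int) -> None:
        if member_id not in self.members or generation != self.generation:
            raise FencedError(f"{member_id} fenced at generation {generation}")


coordinator = GroupCoordinator()
coordinator.rebalance([None, None])      # m1 and m2 join: generation 1
coordinator.rebalance(["m1", "m2"])      # everyone rejoins: generation 2
coordinator.rebalance(["m1"])            # m2 misses the deadline: generation 3

try:
    coordinator.validate("m2", 2)        # zombie m2 comes back with generation 2
    fenced = False
except FencedError:
    fenced = True

final_members = coordinator.rebalance(["m1", None])  # m2 rejoins as unknown member
```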
  97. Should we use the broker or the client as the assignor?
  98-101. Do assignment on broker
     1. Stable with range assignment
     2. Redeploy coordinator to use round robin (RR)
     3. Coordinator bounce completes
     ● Not an ideal approach
       ○ Restart stateful service
       ○ Affect other clients
       ○ Data protocol consistency
     [Diagram: tasks T1-T8 on three members under range assignment (T1 T2 T3 / T4 T5 T6 / T7 T8)]
  102-111. Do assignment on client
     1. Designated leader member
     2. Redeploy leader to use RR
     3. Leader restarted
     4. Leader asks coordinator to rebalance
     5. Coordinator requires members to revoke tasks/rejoin
     6. Coordinator informs the leader that all members have rejoined
     7. Leader performs assignment
     8. Leader calls sync group to send back assignment
     9. Coordinator propagates…
     10. Done!
     [Diagram: the assignment changes from range (T1 T2 T3 / T4 T5 T6 / T7 T8) to round robin (T1 T4 T7 / T2 T5 T8 / T3 T6)]
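With client-side assignment, changing strategy only means redeploying the leader with a different function. The sketch below contrasts simplified range and round-robin assignors over tasks T1-T8. These functions are illustrative stand-ins rather than Kafka's own assignor classes, and `leader_assign` is a hypothetical helper.

```python
from typing import Callable, Dict, List

Assignment = Dict[str, List[str]]


def range_assignor(tasks: List[str], members: List[str]) -> Assignment:
    """Give each member one contiguous chunk of tasks."""
    base, extra = divmod(len(tasks), len(members))
    assignment: Assignment = {}
    start = 0
    for index, member in enumerate(members):
        size = base + (1 if index < extra else 0)
        assignment[member] = tasks[start:start + size]
        start += size
    return assignment


def round_robin_assignor(tasks: List[str], members: List[str]) -> Assignment:
    """Deal tasks out one at a time, cycling through the members."""
    assignment: Assignment = {member: [] for member in members}
    for index, task in enumerate(tasks):
        assignment[members[index % len(members)]].append(task)
    return assignment


ASSIGNORS: Dict[str, Callable[[List[str], List[str]], Assignment]] = {
    "range": range_assignor,
    "round_robin": round_robin_assignor,
}


def leader_assign(strategy: str, tasks: List[str], members: List[str]) -> Assignment:
    """The leader computes the assignment; the coordinator only relays it."""
    return ASSIGNORS[strategy](tasks, members)


tasks = [f"T{i}" for i in range(1, 9)]
members = ["m1", "m2", "m3"]
before = leader_assign("range", tasks, members)
after = leader_assign("round_robin", tasks, members)
```

Because the coordinator never interprets the assignment in this design, swapping `"range"` for `"round_robin"` touches only the leader, not the broker.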
  113-121. Catch task change
     1. Add two new tasks
     2. Revoke all members’ active tasks
     3. Require members to rejoin
     4. Perform assignment
     5. Propagate to members
     6. Done!
     [Diagram: tasks T7 and T8 are added to T1-T6; the three members end with T1 T2 T3 / T4 T5 T6 / T7 T8]
  122. Recap:
     1. Membership changes:
        (a) Member joins/leaves the group
        (b) Member times out
        (c) Zombie member fencing
     2. Assignor changes
     3. Task changes
  123. Congratulations! You have walked through all the necessary parts of a rebalance algorithm! Now let’s take a systematic view of it.
  124-129. State Machine View: Two-phase protocol
     [State diagram: Stable → Rebalance on "Rebalance condition triggered"; Rebalance → Sync on "All current members join / Rebalance timeout", bumping the generation; Sync → Stable on "Leader sends back the assignment"; the final slide adds a second "Rebalance condition triggered" edge]
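The two-phase state machine above can be written down as a transition table. In this sketch the event names are invented labels for the slide's edge captions, and the Sync → Rebalance edge is an assumption about where the second "Rebalance condition triggered" arrow points.

```python
from enum import Enum
from typing import Dict, Tuple


class GroupState(Enum):
    STABLE = "Stable"
    REBALANCE = "Rebalance"
    SYNC = "Sync"


# (current state, event) -> next state
TRANSITIONS: Dict[Tuple[GroupState, str], GroupState] = {
    (GroupState.STABLE, "rebalance_triggered"): GroupState.REBALANCE,
    (GroupState.REBALANCE, "all_members_joined"): GroupState.SYNC,
    (GroupState.REBALANCE, "rebalance_timeout"): GroupState.SYNC,
    (GroupState.SYNC, "leader_sent_assignment"): GroupState.STABLE,
    # Assumed edge: a new trigger during Sync restarts the join phase.
    (GroupState.SYNC, "rebalance_triggered"): GroupState.REBALANCE,
}


class GroupStateMachine:
    def __init__(self) -> None:
        self.state = GroupState.STABLE
        self.generation = 0

    def fire(self, event: str) -> GroupState:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not valid in state {self.state.value}")
        next_state = TRANSITIONS[key]
        if self.state is GroupState.REBALANCE and next_state is GroupState.SYNC:
            self.generation += 1  # generation is bumped when the join phase ends
        self.state = next_state
        return self.state


machine = GroupStateMachine()
visited = [
    machine.fire(event)
    for event in ("rebalance_triggered", "all_members_joined", "leader_sent_assignment")
]
```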
  130-132. Rebalance is helpful, but sometimes harmful.
     1. Transient failure
     2. Rolling bounce
  133-149. Transient unavailability
     1. One member couldn’t connect to coordinator
     2. Session timeout is reached
     3. Require other members to revoke tasks/rejoin
     4. Perform assignment
     5. Propagate…
     6. Done! However, one member has now become a zombie
     7. Zombie member rejoins
     8. Zombie resets generation and rejoins
     9. Coordinator requires all members to revoke tasks/rejoin
     10. Perform assignment (different from last time)
     11. Propagate, and done!
     ● An unnecessary assignment change
     ● Solution:
       ○ Increase session.timeout.ms
       ○ No rebalance triggered
       ○ Trade-off: assignment stickiness vs. availability
     [Diagram: the three members go from T1 T2 T3 / T4 T5 T6 / T7 T8 to T1 T2 T7 / T4 T5 T8 / T3 T6; with a larger session timeout the original assignment is kept]
  151. 151. Rolling bounce [Diagram: coordinator tracks members m1, m2, m3, holding T1-T3, T4-T6 and T7-T8]
  152. 152. Rolling bounce 1. Restart member fleet [Diagram: restarted members have no member id; coordinator still lists m1, m2, m3]
  153. 153. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin [Diagram: member list is empty]
  154. 154. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin [Diagram: member list is m4]
  155. 155. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin [Diagram: member list is m4, m5]
  156. 156. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin [Diagram: member list is m4, m5, m6]
  157. 157. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin 4. Member assignment gets shuffled
  158. 158. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin 4. Member assignment gets shuffled
  159. 159. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin 4. Member assignment gets shuffled 5. Perform assignment and hand out new member ids
  160. 160. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin 4. Member assignment gets shuffled 5. Perform assignment and hand out new member ids [Diagram: first member becomes m4 and receives T3, T4, T7]
  161. 161. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin 4. Member assignment gets shuffled 5. Perform assignment and hand out new member ids [Diagram: second member becomes m5 and receives T2, T5, T8]
  162. 162. Rolling bounce 1. Restart member fleet 2. Some members send a leave group request 3. Members rejoin 4. Member assignment gets shuffled 5. Perform assignment and hand out new member ids 6. Propagate… 7. Done! [Diagram: third member becomes m6 and receives T1, T6]
  163. 163. Rolling bounce ● Another unnecessary assignment change
  164. 164. Rolling bounce ● Another unnecessary assignment change ● No persistence of member identity: after a restart, the member is unknown to the coordinator. [Diagram: member ids m1, m2, m3 were replaced by m4, m5, m6]
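The rolling-bounce problem can be reproduced in a few lines. The following is a hypothetical toy model (`DynamicCoordinator` is not a Kafka class), assuming every leave and every join by an unknown member triggers its own rebalance; a real coordinator may batch several joins into one rebalance, so the counter is an upper bound rather than a prediction.

```python
import itertools


class DynamicCoordinator:
    """Toy coordinator for dynamic membership: a joining member never
    carries an id from a previous life, so it always gets a fresh one."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.members = []          # current member ids
        self.rebalance_triggers = 0

    def join(self) -> str:
        member_id = f"m{next(self._ids)}"
        self.members.append(member_id)
        self.rebalance_triggers += 1
        return member_id

    def leave(self, member_id: str) -> None:
        self.members.remove(member_id)
        self.rebalance_triggers += 1


coord = DynamicCoordinator()
fleet = [coord.join() for _ in range(3)]   # initial start: m1, m2, m3
before = coord.rebalance_triggers

# Rolling bounce: each instance leaves, restarts without its old id, rejoins.
new_fleet = []
for old_id in fleet:
    coord.leave(old_id)
    new_fleet.append(coord.join())

print(new_fleet)                           # the same three processes, new identities
print(coord.rebalance_triggers - before)   # membership changes caused by the bounce
```

Nothing about the fleet changed except its process ids, yet the coordinator saw three departures and three arrivals.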
  165. 165. Look into the future: static membership
  166. 166. Look into the future: static membership 1. Unique id for each member
  167. 167. Look into the future: static membership 1. Unique id for each member 2. Enlarge session timeout to make it effective
  168. 168. Look into the future: static membership 1. Unique id for each member 2. Enlarge session timeout to make it effective 3. No rebalance if just doing a rolling bounce
  169. 169. Look into the future: static membership 1. Unique id for each member 2. Enlarge session timeout to make it effective 3. No rebalance if just doing a rolling bounce 4. Works well with K8s
  170. 170. Static membership [Diagram: coordinator tracks T1-T8 across three members holding T1-T3, T4-T6 and T7-T8]
  171. 171. Static membership ● Give each member a unique id ○ Config: group.instance.id [Diagram: members are labeled w1, w2, w3]
  172. 172. Static membership ● Give each member a unique id ○ Config: group.instance.id ○ Remember assignment info on coordinator
  173. 173. Static membership ● Give each member a unique id ○ Config: group.instance.id ○ Remember assignment info on coordinator ○ Static member never sends leave group request
  174. 174. Static membership ● Give each member a unique id ○ Config: group.instance.id ○ Remember assignment info on coordinator ○ Static member never sends leave group request ○ No rebalance upon known static member rejoin
  175. 175. Static membership 1. Restart member fleet
  176. 176. Static membership 1. Restart member fleet 2. Member w1 rejoins
  177. 177. Static membership 1. Restart member fleet 2. Member w1 rejoins 3. Coordinator gets w1’s assignment
  178. 178. Static membership 1. Restart member fleet 2. Member w1 rejoins 3. Coordinator gets w1’s assignment 4. Member w1 gets the same assignment
  179. 179. Static membership 1. … 5. Member w2 rejoins
  180. 180. Static membership 1. … 5. Member w2 rejoins 6. Coordinator gets w2’s assignment
  181. 181. Static membership 1. … 5. Member w2 rejoins 6. Coordinator gets w2’s assignment 7. Member w2 gets the same assignment
  182. 182. Static membership 1. … 8. Member w3 rejoins
  183. 183. Static membership 1. … 8. Member w3 rejoins 9. Coordinator gets w3’s assignment
  184. 184. Static membership 1. … 8. Member w3 rejoins 9. Coordinator gets w3’s assignment 10. Member w3 gets the same assignment 11. Done!
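The walk-through above can be sketched as a toy model. `StaticCoordinator` and its `rejoin` method are hypothetical names, and the sketch assumes each member comes back before its session timeout expires; it only illustrates the idea that assignments are remembered per `group.instance.id`.

```python
class StaticCoordinator:
    """Toy coordinator that remembers assignments by group.instance.id."""

    def __init__(self, assignment):
        self.assignment = dict(assignment)   # instance id -> list of tasks
        self.rebalances = 0

    def rejoin(self, instance_id: str):
        if instance_id not in self.assignment:
            # Unknown instance: a real coordinator would have to rebalance here.
            self.rebalances += 1
            self.assignment[instance_id] = []
        # Known static member: hand back the stored assignment unchanged.
        return self.assignment[instance_id]


initial = {
    "w1": ["T1", "T2", "T3"],
    "w2": ["T4", "T5", "T6"],
    "w3": ["T7", "T8"],
}
coord = StaticCoordinator(initial)

# Rolling bounce: w1, w2, w3 restart one after another and rejoin.
after_bounce = {w: coord.rejoin(w) for w in ("w1", "w2", "w3")}

print(after_bounce == initial)   # every member got its old assignment back
print(coord.rebalances)          # rebalances caused by the bounce
```

Compared with the dynamic case, the identity survives the restart, so the coordinator has nothing to reshuffle.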
  185. 185. Oops! I configured duplicate instances!
  186. 186. Fencing a conflicting instance ● Maintain a mapping from instance id to member id [Diagram: coordinator mapping w1 -> m1, w2 -> m2]
  187. 187. Fencing a conflicting instance ● Maintain a mapping from instance id to member id ● Update member id when a known instance rejoins
  188. 188. Fencing a conflicting instance ● Maintain a mapping from instance id to member id ● Update member id when a known instance rejoins 1. A conflicting member joins [Diagram: a second member with instance id w2 and no member id appears]
  189. 189. Fencing a conflicting instance ● Maintain a mapping from instance id to member id ● Update member id when a known instance rejoins 1. A conflicting member joins 2. Update w2’s member id to m3
  190. 190. Fencing a conflicting instance ● Maintain a mapping from instance id to member id ● Update member id when a known instance rejoins 1. A conflicting member joins 2. Update w2’s member id to m3 3. Old member m2 calls heartbeat()
  191. 191. Fencing a conflicting instance ● Maintain a mapping from instance id to member id ● Update member id when a known instance rejoins 1. A conflicting member joins 2. Update w2’s member id to m3 3. Old member m2 calls heartbeat() 4. Member m2 is fenced since w2’s member id != m2
  192. 192. Fencing a conflicting instance ● Maintain a mapping from instance id to member id ● Update member id when a known instance rejoins 1. A conflicting member joins 2. Update w2’s member id to m3 3. Old member m2 calls heartbeat() 4. Member m2 is fenced since w2’s member id != m2 5. Immediately crash m2
  193. 193. Fencing a conflicting instance 1. A conflicting member joins 2. Update w2’s member id to m3 3. Old member m2 calls heartbeat() 4. Member m2 is fenced since w2’s member id != m2 5. Immediately crash m2 6. Group stays stable [Diagram: mapping is now w1 -> m1, w2 -> m3]
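The fencing steps above can be sketched as follows. This is a hypothetical model, not broker code: `FencingCoordinator` and `FencedError` are illustrative names (Kafka's own client surfaces this condition as `FencedInstanceIdException`), and the only behavior taken from the slides is the instance-id-to-member-id mapping and the mismatch check on heartbeat.

```python
import itertools


class FencedError(Exception):
    """Raised when a member id no longer matches its group.instance.id."""


class FencingCoordinator:
    def __init__(self):
        self._ids = itertools.count(1)
        self.instance_to_member = {}   # group.instance.id -> current member id

    def join(self, instance_id: str) -> str:
        # A (re)joining instance gets a fresh member id that replaces
        # whatever was mapped to this instance id before.
        member_id = f"m{next(self._ids)}"
        self.instance_to_member[instance_id] = member_id
        return member_id

    def heartbeat(self, instance_id: str, member_id: str) -> None:
        if self.instance_to_member.get(instance_id) != member_id:
            raise FencedError(f"{member_id} no longer owns {instance_id}")


coord = FencingCoordinator()
m1 = coord.join("w1")
m2 = coord.join("w2")
m3 = coord.join("w2")          # duplicate instance configured with the same id

coord.heartbeat("w2", m3)      # current owner: accepted
try:
    coord.heartbeat("w2", m2)  # stale owner: fenced, should shut down
    fenced = False
except FencedError:
    fenced = True

print(coord.instance_to_member, fenced)
```

Note that the stale member only learns it was replaced on its next heartbeat, which is the delay the following slides discuss.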
  194. 194. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining [Diagram: coordinator mapping w1 -> m1; m1 holds T1, T2, T3]
  195. 195. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. First member joins with id w2 [Diagram: mapping w1 -> m1, w2 -> m2]
  196. 196. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. First member joins with id w2 2. In the meantime, a conflicting w2 joins
  197. 197. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. First member joins with id w2 2. In the meantime, a conflicting w2 joins 3. Replace w2’s member id with m3
  198. 198. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. First member joins with id w2 2. In the meantime, a conflicting w2 joins 3. Replace w2’s member id with m3 4. Coordinator requires w1 to rejoin
  199. 199. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. First member joins with id w2 2. In the meantime, a conflicting w2 joins 3. Replace w2’s member id with m3 4. Coordinator requires w1 to rejoin 5. Group performs assignment
  200. 200. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. First member joins with id w2 2. In the meantime, a conflicting w2 joins 3. Replace w2’s member id with m3 4. Coordinator requires w1 to rejoin 5. Group performs assignment 6. Propagate new assignment [Diagram: mapping w1 -> m1, w2 -> m3]
  201. 201. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. First member joins with id w2 2. In the meantime, a conflicting w2 joins 3. Replace w2’s member id with m3 4. Coordinator requires w1 to rejoin 5. Group performs assignment 6. Propagate new assignment 7. Done!
  202. 202. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. … 8. Out-of-scope member times out and rejoins
  203. 203. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. … 8. Out-of-scope member times out and rejoins 9. Update w2’s member id to m4
  204. 204. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. … 8. Out-of-scope member times out and rejoins 9. Update w2’s member id to m4 10. Gets a conflicting assignment
  205. 205. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. … 8. Out-of-scope member times out and rejoins 9. Update w2’s member id to m4 10. Gets a conflicting assignment 11. Old member m3 calls heartbeat()
  206. 206. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining 1. … 8. Out-of-scope member times out and rejoins 9. Update w2’s member id to m4 10. Gets a conflicting assignment 11. Old member m3 calls heartbeat() 12. Member id mismatch, fencing m3
  207. 207. Fencing a conflicting instance (Nice to have) ● A caveat for concurrent joining ● Downsides: ○ Risk of concurrent processing ○ Delayed conflict detection
  208. 208. Fencing a conflicting instance (Nice to have) ● Fence against callback [Diagram: coordinator mapping w1 -> m1, plus a table of pending join callbacks]
  209. 209. Fencing a conflicting instance (Nice to have) ● Fence against callback 1. New member with id w2 joins, registering a callback [Diagram: mapping w2 -> m2; pending callback <w2, m2>]
  210. 210. Fencing a conflicting instance (Nice to have) ● Fence against callback 1. New member with id w2 joins, registering a callback 2. A conflicting member joins at the same time
  211. 211. Fencing a conflicting instance (Nice to have) ● Fence against callback 1. New member with id w2 joins, registering a callback 2. A conflicting member joins at the same time 3. Replace member id with m3
  212. 212. Fencing a conflicting instance (Nice to have) ● Fence against callback 1. New member with id w2 joins, registering a callback 2. A conflicting member joins at the same time 3. Replace member id with m3 4. Return m2 callback with fenced exception
  213. 213. Fencing a conflicting instance (Nice to have) ● Fence against callback 1. New member with id w2 joins, registering a callback 2. A conflicting member joins at the same time 3. Replace member id with m3 4. Return m2 callback with fenced exception 5. Shut down m2 immediately
  214. 214. Fencing a conflicting instance (Nice to have) ● Fence against callback 1. … 6. Require all members to revoke/rejoin [Diagram: pending callbacks <w1, m1>, <w2, m3>]
  215. 215. Fencing a conflicting instance (Nice to have) ● Fence against callback 1. … 6. Require all members to revoke/rejoin 7. Perform assignment
  216. 216. Fencing a conflicting instance (Nice to have) ● Fence against callback 1. … 6. Require all members to revoke/rejoin 7. Perform assignment 8. Propagate through callbacks, done!
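The callback-based variant can be sketched like this. It is a hypothetical illustration of the idea on the slides (fence the earlier joiner through its pending join callback instead of waiting for its next heartbeat); `CallbackCoordinator`, the status strings, and the callback signature are all assumptions made for the example.

```python
class CallbackCoordinator:
    """Toy coordinator that keeps one pending join callback per instance id."""

    def __init__(self):
        self.pending = {}   # group.instance.id -> (member id, callback)
        self._next = 1

    def join(self, instance_id, callback):
        member_id = f"m{self._next}"
        self._next += 1
        previous = self.pending.get(instance_id)
        if previous is not None:
            # A join for the same instance id is still pending:
            # answer it right away with a fenced status.
            old_member_id, old_callback = previous
            old_callback(old_member_id, "FENCED")
        self.pending[instance_id] = (member_id, callback)
        return member_id

    def complete_rebalance(self):
        # Propagate the outcome to every surviving member through its callback.
        for member_id, callback in self.pending.values():
            callback(member_id, "OK")
        self.pending.clear()


events = []


def make_callback(label):
    return lambda member_id, status: events.append((label, member_id, status))


coord = CallbackCoordinator()
coord.join("w1", make_callback("w1"))
coord.join("w2", make_callback("w2-first"))
coord.join("w2", make_callback("w2-duplicate"))   # conflicting instance
coord.complete_rebalance()

for event in events:
    print(event)
```

In this model the first w2 is told about the conflict before any assignment is handed out, which avoids the concurrent-processing window of the heartbeat-based approach.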
  217. 217. Lastly, some debug tips
  218. 218. Developer debugging …
  219. 219. Developer debugging … Find your server!
  220. 220. Developer debugging 1. Log in to your client application … Find your server!
  221. 221. Developer debugging 1. Log in to your client application 2. Look into the client log and search for “Discovered group coordinator” Example client log: [2019-06-14 00:23:47,020] INFO [Consumer instanceId=consumer-A-2, clientId=StaticMemberTestClient-019a5efe-87ef-4c62-9891-27330df67049-StreamThread-2-consumer, groupId=StaticMemberTestClient] Discovered group coordinator ducker04:9092 (id: 2147483645 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
  222. 222. Developer debugging 1. Log in to your client application 2. Look into the client log and search for “Discovered group coordinator” 3. Find your server (here: ducker04) and log in
  223. 223. Developer debugging 1. Log in to your client application 2. Look into the client log and search for “Discovered group coordinator” 3. Find your server and log in 4. Check the server log for the rebalance reason (search for “Preparing to rebalance group”) Example server log: [2019-06-14 00:23:47,389] INFO [GroupCoordinator 2]: Preparing to rebalance group StaticMemberTestClient in state PreparingRebalance with old generation 0 (__consumer_offsets-2) (reason: Adding new member consumer-A-1-1560471827287 with group instanceid Some(consumer-A-1)) (kafka.coordinator.group.GroupCoordinator)
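The two log searches above can be automated with a small script. This is a sketch that assumes the log lines look like the two examples on the slides; the exact wording can differ between Kafka versions, and the function names are made up for the example.

```python
import re

# Client side: which broker is the group coordinator?
COORDINATOR_RE = re.compile(
    r"groupId=(?P<group>[^\]\s,]+)\]\s+Discovered group coordinator\s+"
    r"(?P<host>[^:\s]+):(?P<port>\d+)\s+\(id:\s*(?P<id>\d+)"
)

# Server side: why did the coordinator start a rebalance?
REASON_RE = re.compile(
    r"Preparing to rebalance group (?P<group>\S+) .*"
    r"\(reason: (?P<reason>.*)\) \(kafka"
)


def find_coordinator(client_log_line: str):
    """Return (group, host, port) or None if the line does not match."""
    m = COORDINATOR_RE.search(client_log_line)
    if m is None:
        return None
    return m.group("group"), m.group("host"), int(m.group("port"))


def find_rebalance_reason(server_log_line: str):
    """Return (group, reason) or None if the line does not match."""
    m = REASON_RE.search(server_log_line)
    if m is None:
        return None
    return m.group("group"), m.group("reason")


client_line = (
    "[2019-06-14 00:23:47,020] INFO [Consumer instanceId=consumer-A-2, "
    "clientId=StaticMemberTestClient-019a5efe-87ef-4c62-9891-27330df67049-"
    "StreamThread-2-consumer, groupId=StaticMemberTestClient] "
    "Discovered group coordinator ducker04:9092 (id: 2147483645 rack: null) "
    "(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)"
)
server_line = (
    "[2019-06-14 00:23:47,389] INFO [GroupCoordinator 2]: Preparing to rebalance "
    "group StaticMemberTestClient in state PreparingRebalance with old generation 0 "
    "(__consumer_offsets-2) (reason: Adding new member consumer-A-1-1560471827287 "
    "with group instanceid Some(consumer-A-1)) "
    "(kafka.coordinator.group.GroupCoordinator)"
)

print(find_coordinator(client_line))
print(find_rebalance_reason(server_line))
```

Running the first function over the client log tells you which broker to log in to; the second, run over that broker's log, gives one reason per rebalance.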
  224. 224. Developer debugging Server metrics: ● NumGroupsPreparingRebalance ● NumGroupsCompletingRebalance ● NumGroupsStable ● NumGroupsDead ● NumGroupsEmpty …
  225. 225. Developer debugging Client metrics: ● Join-rate/total ● Join-time-avg/max ● Sync-rate/total ● Sync-time-avg/max ● Assigned-partitions ● Commit-rate/total ● Heartbeat-rate/total …
  226. 226. Takeaways
  227. 227. Takeaways ● Different timeouts: ○ Enlarge your session.timeout.ms to achieve better stability ○ max.poll.interval.ms bounds how long a member may go between poll() calls ○ rebalance.timeout.ms kicks out members that have not rejoined when it expires
  228. 228. Takeaways ● Different timeouts: ○ Enlarge your session.timeout.ms to achieve better stability ○ max.poll.interval.ms bounds how long a member may go between poll() calls ○ rebalance.timeout.ms kicks out members that have not rejoined when it expires ● What is group generation?
  229. 229. Takeaways ● Different timeouts: ○ Enlarge your session.timeout.ms to achieve better stability ○ max.poll.interval.ms bounds how long a member may go between poll() calls ○ rebalance.timeout.ms kicks out members that have not rejoined when it expires ● What is group generation? ● Why do we let the client do the assignment?
  230. 230. Takeaways ● Different timeouts: ○ Enlarge your session.timeout.ms to achieve better stability ○ max.poll.interval.ms bounds how long a member may go between poll() calls ○ rebalance.timeout.ms kicks out members that have not rejoined when it expires ● What is group generation? ● Why do we let the client do the assignment? ● Static membership is generally available in Apache Kafka (AK) 2.3, for Consumer and Streams ○ Upgrade your broker to 2.3 ○ Set a unique group.instance.id for each client (and monitor for fencing) ○ Make the session timeout long enough
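As a rough illustration of the static-membership checklist, the client settings could be assembled like this. The config keys `group.id`, `group.instance.id` and `session.timeout.ms` are real Kafka consumer settings; the helper function, the group name, the `worker-N` naming scheme and the timeout value are assumptions chosen for the example, not recommendations from the talk.

```python
def static_member_config(instance_id: str, session_timeout_ms: int = 120_000) -> dict:
    """Build consumer settings for one static member (illustrative values)."""
    if not instance_id:
        raise ValueError("group.instance.id must be a non-empty string")
    return {
        "group.id": "my-group",                    # hypothetical group name
        "group.instance.id": instance_id,          # must be unique within the group
        "session.timeout.ms": session_timeout_ms,  # long enough to cover a restart
    }


# One config per instance, e.g. derived from a stable host or pod name.
configs = [static_member_config(f"worker-{i}") for i in range(3)]

ids = [c["group.instance.id"] for c in configs]
if len(set(ids)) != len(ids):
    raise RuntimeError("duplicate group.instance.id: the older instance will be fenced")

for c in configs:
    print(c)
```

The session timeout a client requests is also limited on the broker side by `group.max.session.timeout.ms`, so a large value may need a matching broker setting.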
  231. 231. Resources • KIP-62: Allow consumer to send heartbeats from a background thread • KIP-180: Add a broker metric specifying the number of consumer group rebalances in progress • KIP-345: Introduce static membership protocol to reduce consumer rebalances (accepted) • Kafka Client redesign proposal • "The Magical Rebalance Protocol of Apache Kafka" by Gwen Shapira (Strange Loop Talk, Sep 2018) https://www.youtube.com/watch?v=MmLezWRI3Ys&t=8s
  232. 232. Special thanks to Guozhang Wang, Jason Gustafson, Liquan Pei and Matthias J Sax
