Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.

Consumer offset management in Kafka

61 988 vues

Publié le

Overview of consumer offset management in Kafka presented at Kafka meetup @ LinkedIn. (March 24, 2015)

Publié dans : Données & analyses
  • Identifiez-vous pour voir les commentaires

Consumer offset management in Kafka

  1. 1. Consumer offset management in Kafka Joel Koshy Kafka meetup @ LinkedIn March 24, 2015
  2. 2. Consumers and offsets 5 6 7 8 9 1 0 1 1 1 2 1 3 1 4 1 5 1 6 2 3 2 4 2 5 2 6 2 7 2 8 2 9 3 0 3 1 3 2 3 3 3 4 PageViewEvent-0 EmailBounceEvent-0 PageViewEvent-0 EmailBounceEvent-0 35 12 Offset map GroupId: audit-consumer
  3. 3. Store offsets in ZooKeeper admin brokers config consumers controller controller_epoch audit-consumer ids owners offsets PageViewEvent EmailBounceEvent 0 0 12 35
  4. 4. (Don’t) store offsets in ZooKeeper •  Heavy write-load on ZooKeeper •  Especially an issue – during 0.7 to 0.8 migration – and before we switched to SSDs •  Non-ideal work-arounds – Increase offset-commit intervals – Filter commits if offsets have not moved – Spread large offset commits over commit interval
  5. 5. Offset management (ideals) •  Durable •  Support high write-load •  Consistent reads •  Atomic offset commits •  Fast commits/fetches
  6. 6. Store offsets in a replicated log audit-consumer PageViewEvent-0 240 audit-consumer EmailBounceEvent-0 232 __consumer_offsets Next commit Group Offset Partition
  7. 7. Store offsets in a replicated log audit-consumer PageViewEvent-0 240 audit-consumer EmailBounceEvent-0 232 __consumer_offsets audit-consumer EmailBounceEvent-0 248
  8. 8. Store offsets in a replicated log audit-consumer PageViewEvent-0 240 audit-consumer EmailBounceEvent-0 232 __consumer_offsets audit-consumer EmailBounceEvent-0 248 audit-consumer PageViewEvent-0 323
  9. 9. Store offsets in a replicated log audit-consumer PageViewEvent-0 240 audit-consumer EmailBounceEvent-0 232 __consumer_offsets audit-consumer EmailBounceEvent-0 248 audit-consumer PageViewEvent-0 323 mirrormaker ClickEvent-0 54543
  10. 10. Store offsets in a replicated, partitioned log audit-consumer PageViewEvent-0 240 audit-consumer EmailBounceEvent-0 232 __consumer_offsets, partition 3 audit-consumer EmailBounceEvent-0 248 audit-consumer PageViewEvent-0 323 mirrormaker ClickEvent-0 54543 mirrormaker ClickEvent-1 54444 mirrormaker ClickEvent-1 54674 __consumer_offsets, partition 8 Partition è abs(GroupId.hashCode()) % NumPartitions
  11. 11. Store offsets in a replicated, partitioned log audit-consumer PageViewEvent-0 240 audit-consumer EmailBounceEvent-0 232 __consumer_offsets, partition 3 audit-consumer EmailBounceEvent-0 248 audit-consumer PageViewEvent-0 323 mirrormaker ClickEvent-0 54543 mirrormaker ClickEvent-1 54444 mirrormaker ClickEvent-1 54674 __consumer_offsets, partition 8 Offset commits append to the offsets topic partition Offset fetches read from the offsets topic partition
  12. 12. Store offsets in a replicated, partitioned log audit-consumer PageViewEvent-0 240 audit-consumer EmailBounceEvent-0 232 __consumer_offsets, partition 3 audit-consumer EmailBounceEvent-0 248 audit-consumer PageViewEvent-0 323 mirrormaker ClickEvent-0 54543 mirrormaker ClickEvent-1 54444 mirrormaker ClickEvent-1 54674 __consumer_offsets, partition 8 [audit-consumer, PageViewEvent-0] [audit-consumer, EmailBounceEvent-0] [mirrormaker, ClickEvent-0] [mirrormaker, ClickEvent-1] Offsets cache 323 248 54674 54543 Offset commits append to the offsets topic partition + update the cache Offset fetches read from the offsets topic partition cache
  13. 13. Store offsets in a replicated, partitioned log audit-consumer PageViewEvent-0 240 audit-consumer EmailBounceEvent-0 232 __consumer_offsets, partition 3 audit-consumer EmailBounceEvent-0 248 audit-consumer PageViewEvent-0 323 mirrormaker ClickEvent-0 54543 mirrormaker ClickEvent-1 54444 mirrormaker ClickEvent-1 54674 __consumer_offsets, partition 8 [audit-consumer, PageViewEvent-0] [audit-consumer, EmailBounceEvent-0] [mirrormaker, ClickEvent-0] [mirrormaker, ClickEvent-1] Offsets cache 323 248 54674 54543 Offset commits append to the offsets topic partition + update the cache Offset fetches read from the offsets topic partition cache How do we GC older offset entries?
  14. 14. log.cleanup.policy = compact 0 1 2 3 4 5 6 7 8 9 10 K1 K2 K1 K1 K3 K2 K4 K5 K5 K2 K6 V1 V2 V3 V4 V5 V6 V7 V8 V9 V 10 V 11 3 4 K1 K3 V4 V5 6 K4 V7 8 10 K5 K6 V9 V 11 Compaction Offset Key Value 11 K2 Ø Offset Key Value
  15. 15. Store offsets in a replicated, partitioned, compacted log audit-consumer PageViewEvent-0 126312342 audit-consumer EmailBounceEvent-0 59843 audit-consumer PageViewEvent-0 126319628 audit-consumer EmailBounceEvent-0 86243 audit-consumer PageViewEvent-0 126398102 Key Value audit-consumer EmailBounceEvent-0 86243 audit-consumer PageViewEvent-0 126398102 Compaction Key è [Group, Topic, Partition] Value è Offset
  16. 16. Dealing with dead consumers console-consumer-38587, console-consumer-94777, console-consumer-94774, console-consumer-31199, console-consumer-51555, console-consumer-43182, mobileServiceConsumerDwwewewA13dafddesfasdfdee33, console-consumer-57784, python-kafka-consumer-0959a04da7c241448beb0813f002e34b, console- consumer-70750, console-consumer-94809, console-consumer-87470, touch-me-not, console- consumer-43246, console-consumer-69811, python-kafka-consumer-82c2d653128840d5b6bcbfc5ac7f3abc, console-consumer-33847, console-consumer-18217, console-consumer-87493, console-consumer-26414, console-consumer-67299, voldemort-reader-jjkoshy, console-consumer-80245, kafka_listener_for_comments, test-flow-staging, console-consumer-8441, console-consumer-67258, data- processor-2, console-consumer-94869, console-consumer-55242, pinot-beta-hackday_1_2, console- consumer-6601, cloud-host1, system-metrics-monitor-01, console-consumer-70859, console- consumer-26477, page-view-test-flow-2, page-view-test-flow-1, python-kafka-consumer- bf33d075b22d4ddfb82d4a055303e909, console-consumer-99768, console-consumer-45509, console- consumer-21504, points-test_devel_l1_1686489164, console-consumer-14841, console-consumer-4098, console-consumer-14746, console-consumer-94575, cloud-dcb-host147.company.com, teacup_reporting_alex, console-consumer-4132, console-consumer-48171, ropod-dcb-host794.company.com, console-consumer-63743, console-consumer-36147, console-consumer-48138, console-consumer-33595, console-consumer-6808, console-consumer-31000, console-consumer- 73064, console-consumer-18050, console-consumer-21683, share-message, ropod-dcb-host959.company.com, ropod-dcb-host949.company.com, sensei-test_dcb_host138.company.com_1924844804, console-consumer-38654, console-consumer-92040, console-consumer-67052, console-consumer-82690, console-consumer-92002, console-consumer-69687, console-consumer-31077, console-consumer-94657, console-consumer-36064, console-consumer-45675, console-consumer-45671, console-consumer-70625, MemberSettings-dcx, console-consumer-55513, member- links-dcx, console-consumer-85367, opportunist-company, forum-queue, console-consumer-87912, console-consumer-75909, console-consumer-12320, sensei-test_user2_808173709, ropod-dcb- host937.company.com, console-consumer-8710, console-consumer-48390, python-kafka- consumer-816cebafabb34dd5be6bfce59cbee411, console-consumer-8701, console-consumer-6122, console- consumer-6142, metrics-dcb-monitor19, console-consumer-73329, console-consumer-87942, console- consumer-80552, console-consumer-48368, autometrics-dcb-host13, …!
  17. 17. Dealing with dead consumers •  For offsets older than offset retention period: – Append tombstone – Remove offset entry from cache
  18. 18. Recommended settings for offsets topic Replication factor >= 3 min.insync.replicas >= 2 unclean.leader.election.enable False offsets.commit.required.acks -1 (all)
  19. 19. How to commit/fetch offsets audit-consumer Consumer instance Broker 0 Broker 1 Broker 2 Broker 3 (controller) __consumer_offsets-34: Leader: 2, ISR: 0, 1, 2 V I P Consumer metadata request Response (manager=2)
  20. 20. How to commit/fetch offsets audit-consumer Consumer instance Broker 0 Broker 1 Broker 2 Broker 3 (controller) __consumer_offsets-34: Leader: 2, ISR: 0, 1, 2 Offset fetches Offset commits cache replication
  21. 21. When the offset manager moves audit-consumer Consumer instance Broker 0 Broker 1 Broker 2 Broker 3 (controller) __consumer_offsets-34: Leader: 2, ISR: 0, 1, 2 cache Become Leader load cache
  22. 22. When the offset manager moves audit-consumer Consumer instance Broker 0 Broker 1 Broker 2 Broker 3 (controller) __consumer_offsets-34: Leader: 2, ISR: 0, 1, 2 cache Become Leader load cache Become follower XXXXXX
  23. 23. When the offset manager moves audit-consumer Consumer instance Broker 0 Broker 1 Broker 2 Broker 3 (controller) Offset fetches Offset commits cache __consumer_offsets-34: Leader: 0, ISR: 0, 1, 2 cache X X
  24. 24. When the offset manager moves audit-consumer Consumer instance Broker 0 Broker 1 Broker 2 Broker 3 (controller) V I P Consumer metadata request cache __consumer_offsets-34: Leader: 0, ISR: 0, 1, 2 cache Response (manager=0)
  25. 25. When the offset manager moves audit-consumer Consumer instance Broker 0 Broker 1 Broker 2 Broker 3 (controller) cache __consumer_offsets-34: Leader: 0, ISR: 0, 1, 2 cache Offset commits Offset fetches replication
  26. 26. Offset{Commit,Fetch} API ConsumerMetadataRequest o Group Id: String ConsumerMetadataResponse o Error code: Short o Offset manager: Kafka broker info
  27. 27. Offset{Commit,Fetch} API OffsetCommitRequest o groupId: String o Offset map §  Key è Topic-partition §  Value è Partition-data •  Offset: Long •  Timestamp: Long •  Metadata: String KAFKA-1634: changes semantics of timestamp to retention
  28. 28. Offset{Commit,Fetch} API OffsetCommitResponse o Response map §  Key è Topic-partition §  Value è Error code
  29. 29. Offset{Commit,Fetch} API OffsetFetchRequest o Group Id: String o Partitions: List<Topic-partition> OffsetFetchResponse o Response map §  Key è Topic-partition §  Value è Partition-data •  Offset: Long •  Metadata: String •  Error code: Short
  30. 30. Offset{Commit,Fetch} API Code samples: http://bit.ly/1LTJBYo
  31. 31. Offset{Commit,Fetch} API KafkaConsumer<K, V> consumer = new KafkaConsumer<K, V>(properties);! …! TopicPartition partition1 = new TopicPartition("topic1", 0);! TopicPartition partition1 = new TopicPartition("topic1", 1);! ! consumer.subscribe(partition1, partition2);! ! Map<TopicPartition, Long> offsets = new LinkedHashMap<TopicPartition, Long>();! offsets.put(partition1, 123L);! offsets.put(partition2, 4320L);! …! // commit offsets! consumer.commit(offsets, CommitType.SYNC);! …! // fetch offsets! long committedOffset = consumer.committed(partition1);! !
  32. 32. How to read the offsets topic To read everything, use the console consumer! ./bin/kafka-console-consumer.sh --topic __consumer_offsets -- zookeeper localhost:2181 --formatter "kafka.server.OffsetManager $OffsetsMessageFormatter" --consumer.config config/ consumer.properties! (Must set exclude.internal.topics = false in consumer.properties) ! To read a single partition, use the simple- consumer-shell ./bin/kafka-simple-consumer-shell.sh --topic __consumer_offsets -- partition 12 --broker-list localhost:9092 --formatter "kafka.server.OffsetManager$OffsetsMessageFormatter"!
  33. 33. Inside the offsets topic [Group, Topic, Partition]::[Offset, Metadata, Timestamp] [audit-consumer,PageViewEvent,7]::OffsetAndMetadata[53568,NO_METADATA,1416363620711]! [audit-consumer,service-log-event,5]::OffsetAndMetadata[168012,NO_METADATA, 1416363620711]! [audit-consumer,EmailBounceEvent,4]::OffsetAndMetadata[8524676,NO_METADATA, 1416363620711]! [audit-consumer,ClickEvent,0]::OffsetAndMetadata[8132292,NO_METADATA,1416363620711]! [audit-consumer,metrics-event,1]::OffsetAndMetadata[1835900,NO_METADATA,1416363620711]! [audit-consumer,CompanyEvent,0]::OffsetAndMetadata[109337,NO_METADATA,1416363620711]! [audit-consumer,test-topic,1]::OffsetAndMetadata[352989,NO_METADATA,1416363620711]! [audit-consumer,meetup-event,2]::OffsetAndMetadata[39961,NO_METADATA,1416363620711]! [audit-consumer,push-topic,6]::OffsetAndMetadata[4210366,NO_METADATA,1416363620711]!
  34. 34. How to migrate/roll-back Migrate from ZooKeeper to Kafka: •  Config change – offsets.storage=kafka – dual.commit.enabled=true •  Rolling bounce •  Config change – dual.commit.enabled=false •  Rolling bounce
  35. 35. How to migrate/roll-back Migrate from Kafka to ZooKeeper: •  Config change – dual.commit.enabled=true •  Rolling bounce •  Config change – offsets.storage=zookeeper – dual.commit.enabled=false •  Rolling bounce
  36. 36. Key metrics to monitor •  Consumer mbeans –  Kafka commit rate –  ZooKeeper commit rate (during migration) •  Broker mbeans –  Max-dirty ratio and other log cleaner metrics –  Offset cache size –  Group count –  {ConsumerMetadata, OffsetCommit, OffsetFetch} request metrics
  37. 37. 0.8.3 •  Support compression in compacted topics (KAFKA-1734) •  Change offset commit “timestamp” to mean retention period: KAFKA-1634 •  Offset client
  38. 38. Monitor it!
  39. 39. Acknowledgments Kafka team @ LinkedIn Jay Kreps, Jun Rao, Neha Narkhede @ Confluent Tejas (2013 intern): http://lnkdin.me/p/tejaspatil1

×