5. Kafka is your glue
to have different teams break down complex end-to-end problems into smaller, more manageable ones
like a "checkpoint" in a long hike/game/drive
separation of concerns helps you design distributed systems
15. ~300 million requests/day to our events kafka producer, ekaf
served by 22 cores or 11 servers
running on auto-pilot for 1+ years
little maintenance, only scale servers
zero-downtime support
plenty of metrics
(started with 3 servers replacing 14 JVM producers)
separate auth logic from producer logic
erlang = peaceful sleep
open source
(scalable)
22. kafka: "OK! each partition is set on a different broker. for each partition, one of the brokers is an elected leader; push your messages to the leader's host, port please. if the leader goes down, another broker is elected."
23. is it the fault-tolerance you get?
NO. let's take more wild guesses at the secret sauce
You: "create topic foo with 3 partitions"
partitions
leaders
election
25. kafka: "please dial 1-800-metadata to any broker. we all maintain this data. you can make a metadata request with the topic(s) you want, and we will return its partitions, and their broker hosts, ports, leader info, etc"
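A minimal sketch of what a client might do with that metadata: build a lookup from (topic, partition) to the leader's host and port. The `Broker` class, the dictionary shapes and the host names are assumptions made for illustration, not ekaf's or Kafka's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Broker:
    node_id: int
    host: str
    port: int


def build_leader_table(brokers, topics):
    """Map (topic, partition) -> (host, port) of that partition's leader.

    `brokers` is a list of Broker; `topics` maps a topic name to a dict of
    partition id -> leader node id. Both shapes are hypothetical.
    """
    by_id = {b.node_id: b for b in brokers}
    table = {}
    for topic, partitions in topics.items():
        for partition, leader_id in partitions.items():
            leader = by_id[leader_id]
            table[(topic, partition)] = (leader.host, leader.port)
    return table


# Hypothetical metadata, as any broker might return it for topic "foo".
brokers = [Broker(1, "broker1", 9091), Broker(2, "broker2", 9092), Broker(3, "broker3", 9093)]
topics = {"foo": {0: 3, 1: 1, 2: 2}}
leaders = build_leader_table(brokers, topics)
print(leaders[("foo", 1)])
```

With a table like this, the producer only has to pick a partition and push to the matching leader.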
26. is it the separation of concerns?
NO. let's take more wild guesses at the secret sauce
You: "hey broker1, where should i push to?"
metadata on all brokers
gives host & port of leader for partitions + more info
28. kafka: "thank you for calling partition1. i will append it to a dir/file called topic1/partition1.log.
Since it's append-only like a commit log, you get ordering within a partition for free!"
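Why append-only gives ordering for free can be seen in a toy model. This is a sketch only: an in-memory list stands in for the partition's log file, and the class and method names are made up for illustration.

```python
class PartitionLog:
    """Toy append-only log: a message's offset is its position in the log."""

    def __init__(self):
        self._messages = []

    def append(self, messages):
        """Append a batch and return the offset of its first message."""
        base_offset = len(self._messages)
        self._messages.extend(messages)
        return base_offset

    def read_from(self, offset):
        """Return (offset, message) pairs from `offset` to the end of the log."""
        return list(enumerate(self._messages[offset:], start=offset))


log = PartitionLog()
first = log.append([b"m0", b"m1", b"m2"])
second = log.append([b"m3"])
print(log.read_from(2))
```

Nothing is ever inserted or rewritten, so readers of one partition always see messages in the order they were appended.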
29. is it the producer speed you get?
NO. let's take more wild guesses at the secret sauce
You: "hey partition1 on broker1, for topic1 here are 10 messages."
append only commit-log
ordering within partition
make 1 partition for global ordering
31. kafka: "thank you for connecting to the right broker. you can now read any partition you want, and from any offset. All i care about is which offset to start reading from in a topic/partition.log file"
32. You: "hey broker1, topic1, partition1 i want all data from message offset ….10 (from ZK) onwards"
33. kafka: "thank you for calling partition1. i'm going to sendfile the bytes you asked for from kernel-space directly to the socket of a consumer using zero copy, thus reducing context switches & minimizing garbage collection. yes, i am badass."
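The same idea can be tried from Python, whose `socket.sendfile` uses the operating system's `sendfile` where available and falls back to plain `send` elsewhere. This is a sketch under assumptions: a temporary file stands in for a partition log, a local socket pair stands in for the broker-to-consumer connection, and `send_from_offset` is a hypothetical helper, not Kafka's broker code.

```python
import os
import socket
import tempfile


def send_from_offset(conn, path, offset):
    """Send the bytes of `path` from `offset` to end-of-file over `conn`."""
    with open(path, "rb") as f:
        # The file contents are not read into a Python buffer when the
        # platform's sendfile is used.
        return conn.sendfile(f, offset=offset)


# A ten-byte stand-in for topic1/partition1.log.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    log_path = tmp.name

broker_side, consumer_side = socket.socketpair()
sent = send_from_offset(broker_side, log_path, 4)
broker_side.close()
received = consumer_side.recv(1024)
consumer_side.close()
os.remove(log_path)
print(sent, received)
```

The consumer asked for offset 4 onwards, so only the tail of the file crosses the socket.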
34. is it the consumer speed you get?
NO. let's take more wild guesses at the secret sauce
You: "hey broker1, topic1, partition1 i want all data from message offset ….10 (from ZK) onwards"
offset bytes
sendfile bytes from offset to socket
kernel space not userspace
35. You: "for topic1, i want 3 consumers all reading every msg.
for topic2, i want the data split between 3 consumers"
36. kafka: "3 different pipelines/actions on the same input topic1? Nice! I can see your team is growing.
Thank you for grokking the concept of consumer groups for topic2. Make sure all 3 of your consumers use the same group-id, and i'll take care of the rest!"
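The two delivery modes can be modelled in a few lines. This is a simplified sketch: every group sees every message, and inside a group each partition belongs to exactly one member. The modulo assignment and the names below are assumptions for illustration; real Kafka clients use their own assignment strategies.

```python
from collections import defaultdict


def deliver(messages, consumers):
    """Return {consumer_name: [payloads]} for (partition, payload) messages.

    `consumers` is a list of (consumer_name, group_id) pairs.
    """
    groups = defaultdict(list)
    for name, group_id in consumers:
        groups[group_id].append(name)

    seen = defaultdict(list)
    for partition, payload in messages:
        for members in groups.values():
            # Hypothetical assignment: partition p goes to member p mod group size.
            owner = members[partition % len(members)]
            seen[owner].append(payload)
    return dict(seen)


messages = [(0, "a"), (1, "b"), (2, "c")]
# topic1 style: three group ids, so each consumer reads every message.
broadcast = deliver(messages, [("c1", "g1"), ("c2", "g2"), ("c3", "g3")])
# topic2 style: one shared group id, so the data is split.
split = deliver(messages, [("c1", "g"), ("c2", "g"), ("c3", "g")])
print(broadcast)
print(split)
```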
37. is it the consumer parallelism?
NO. let's take more wild guesses at the secret sauce
You: "for topic1, i want 3 consumers all reading every msg.
for topic2, i want the data split between 3 consumers"
can broadcast to all consumers
can split b/w a group of consumers
38. the one thing that particularly distinguishes kafka & gives it a stellar status in the ecosystem is its binary protocol.
but importantly, a documented spec of what goes/comes over-the-wire
39. the creators of kafka built the brokers & the spec of how to communicate with them (producers & consumers), and let the community speak the protocol
40. eg: do you know what is sent by a namenode/jobtracker/datanode over the wire?
(PS: where is the spec? come meet me after)
41. protocols win over APIs/drivers
the creators of kafka focussed on the spec
if you knew what data a namenode/jobtracker/datanode actually communicates over the wire, it would open up a new world.
allows different language clients to express themselves best
it's just data over TCP sockets.
more freedom
more integrations & faster adoption
47. the joy of knowing how your data is encoded and sent over a tcp socket
keeps things simple, lets you sleep better
easier to debug, test, add middle-wares to audit, etc
MQTT encoded messages
crypto encoded
<<0,0,0,0,0,0,0,3,0,0,0,3,0,25,118,97,103,114,97,110,116,
45,117,98,117,110,116,117,45,112,114,101,99,105,115,101,
45,54,52,0,0,35,133,0,0,0,1,0,25,118,97,103,114,97,110,
116,45,117,98,117,110,116,117,45,112,114,101,99,105,115,
101,45,54,52,0,0,35,131,0,0,0,2,0,25,118,97,103,114,97,
110,116,45,117,98,117,110,116,117,45,112,114,101,99,105,
115,101,45,54,52,0,0,35,132,0,0,0,3,0,0,0,2,97,49,0,0,0,
2,0,0,0,0,0,0,0,0,0,3,0,0,0,1,0,0,0,3,0,0,0,1,0,0,0,3,0,
0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,
2,97,50,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,
0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,1,
0,0,0,2,0,0,0,2,97,51,0,0,0,2,0,0,0,0,0,0,0,0,0,3,0,0,0,
2,0,0,0,3,0,0,0,1,0,0,0,2,0,0,0,3,0,0,0,1,0,0,0,0,0,1,0,
0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,2,0,0,0,1,0,0,0,2>>
thrift/avro encoded messages
AOL/YAHOO/* packets
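The byte dump on this slide appears to follow the layout of an early Kafka metadata response: a correlation id, a list of brokers (node id, host, port), then topics with their partitions, leaders, replicas and in-sync replicas. Because that layout is documented, a decoder is short. The sketch below assumes that version-0 layout and decodes a small hand-built sample modelled on the dump's first broker and first partition, not the full dump; `Reader` and `decode_metadata_response` are illustrative names.

```python
import struct


class Reader:
    """Cursor over a bytes object holding big-endian fields."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def _take(self, fmt):
        (value,) = struct.unpack_from(fmt, self.data, self.pos)
        self.pos += struct.calcsize(fmt)
        return value

    def int16(self):
        return self._take(">h")

    def int32(self):
        return self._take(">i")

    def string(self):
        length = self.int16()  # strings are prefixed with a 16-bit length
        raw = self.data[self.pos:self.pos + length]
        self.pos += length
        return raw.decode("utf-8")

    def int32_array(self):
        count = self.int32()  # arrays are prefixed with a 32-bit count
        return [self.int32() for _ in range(count)]


def decode_metadata_response(data):
    r = Reader(data)
    correlation_id = r.int32()
    brokers = {}
    for _ in range(r.int32()):
        node_id = r.int32()
        host = r.string()
        port = r.int32()
        brokers[node_id] = (host, port)
    topics = {}
    for _ in range(r.int32()):
        r.int16()  # topic error code, ignored in this sketch
        name = r.string()
        partitions = {}
        for _ in range(r.int32()):
            r.int16()  # partition error code, ignored in this sketch
            partition_id = r.int32()
            leader = r.int32()
            replicas = r.int32_array()
            isr = r.int32_array()
            partitions[partition_id] = {"leader": leader, "replicas": replicas, "isr": isr}
        topics[name] = partitions
    return correlation_id, brokers, topics


host = b"vagrant-ubuntu-precise-64"
sample = (
    struct.pack(">i", 0)                                  # correlation id
    + struct.pack(">i", 1)                                # one broker
    + struct.pack(">ih", 3, len(host)) + host + struct.pack(">i", 9093)
    + struct.pack(">i", 1)                                # one topic
    + struct.pack(">hh", 0, 2) + b"a1"                    # error code, name
    + struct.pack(">i", 1)                                # one partition
    + struct.pack(">hii", 0, 0, 3)                        # error code, id, leader
    + struct.pack(">ii", 1, 3)                            # replicas [3]
    + struct.pack(">ii", 1, 3)                            # in-sync replicas [3]
)
correlation_id, brokers, topics = decode_metadata_response(sample)
print(brokers, topics)
```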
48. brokers focus on the commit logs, etc
and delegate several opinionated areas to the client; the onus is on the client to make smart decisions
compression
queueing
detect downtime
load balancing to partitions
49. clients
keeps things simple
a subset of what clients need to do includes:
1. bootstrap
2. open sockets
3. encode packets
4. decode packets
5. send over tcp
6. route responses
7. handle failures, events
8. state machines
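As an illustration of step 3 ("encode packets"), here is a sketch of encoding a metadata request. It assumes the version-0 framing from the Kafka protocol guide: a 4-byte size, then API key, API version, correlation id and client id, followed by the list of topic names. The client id and topic below are placeholders.

```python
import struct

METADATA_API_KEY = 3  # API key of the metadata request in the protocol guide


def encode_string(text):
    """Encode a string as a 16-bit big-endian length followed by its bytes."""
    raw = text.encode("utf-8")
    return struct.pack(">h", len(raw)) + raw


def encode_metadata_request(correlation_id, client_id, topics):
    # Header: api key, api version (0), correlation id, client id.
    body = struct.pack(">hhi", METADATA_API_KEY, 0, correlation_id)
    body += encode_string(client_id)
    # Payload: a 32-bit count followed by the topic names.
    body += struct.pack(">i", len(topics))
    for topic in topics:
        body += encode_string(topic)
    # The whole request is prefixed with its size in bytes.
    return struct.pack(">i", len(body)) + body


packet = encode_metadata_request(1, "demo", ["foo"])
print(list(packet))
```

The correlation id is what later lets a client route a response back to the request that caused it (step 6).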
50. distributed systems need to be built as state machines
don't trust a client that blocks until an operation is complete
51. BAD
// send
// wait for response
even if this thread/process is doing nothing until timeout, the idling is not efficient
52. GOOD
// listening for socket states
// listening for responses
// send, and continue
// on_recv, route response
// after_timeout, route timeout
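A sketch of the non-blocking style in Python: the sender records each in-flight request under a correlation id and returns immediately, and responses or timeouts are routed to callbacks when they arrive. The class, its method names and the explicit `now` arguments are assumptions for illustration; ekaf itself is written in Erlang.

```python
class PendingRequests:
    """Track in-flight requests by correlation id instead of blocking."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.next_id = 0
        self.pending = {}  # correlation id -> (deadline, callback)

    def send(self, transmit, payload, now, callback):
        """Hand the payload to `transmit` and continue without waiting."""
        self.next_id += 1
        self.pending[self.next_id] = (now + self.timeout, callback)
        transmit(self.next_id, payload)
        return self.next_id

    def on_recv(self, correlation_id, response):
        """Route a response to whoever sent the matching request."""
        entry = self.pending.pop(correlation_id, None)
        if entry is not None:
            entry[1]("ok", response)

    def on_tick(self, now):
        """Route a timeout to every request whose deadline has passed."""
        expired = [cid for cid, (deadline, _) in self.pending.items() if deadline <= now]
        for cid in expired:
            _, callback = self.pending.pop(cid)
            callback("timeout", None)


events = []
wire = []
client = PendingRequests(timeout=5)
a = client.send(lambda cid, p: wire.append((cid, p)), b"m1", now=0,
                callback=lambda status, resp: events.append(("a", status)))
b = client.send(lambda cid, p: wire.append((cid, p)), b"m2", now=1,
                callback=lambda status, resp: events.append(("b", status)))
client.on_recv(b, b"ack")   # the response for the second request arrives first
client.on_tick(now=5)       # the first request never got a response
print(events)
```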
53. set concurrency options for ekaf
set the hostname of a load balancer over your brokers
http://github.com/helpshift/ekaf
54. ekaf hits the ground running with 1 call
// publish(topic, message)
if there is no state machine for the topic yet, it flows from
request metadata -> worker pool creation -> socket connecting -> ready state
55. if the topic state machine has metadata, it knows which broker to use for each partition
if the state already has a socket, queue the message
56. all messages in states before ready are queued.
if the queue hits its size limit OR hits the flush timeout, send it
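The queue-then-flush behaviour can be sketched as a small state machine. This is an illustration only, not ekaf's implementation: the state names, the `max_batch` and `flush_after` parameters and the explicit `now` arguments are all assumptions.

```python
class TopicWorker:
    """Queue messages until ready, then flush on batch size or age."""

    def __init__(self, send_batch, max_batch=3, flush_after=1.0):
        self.state = "bootstrapping"  # -> "connecting" -> "ready"
        self.send_batch = send_batch
        self.max_batch = max_batch
        self.flush_after = flush_after
        self.queue = []
        self.oldest = None  # arrival time of the oldest queued message

    def on_metadata(self):
        self.state = "connecting"

    def on_connected(self, now):
        self.state = "ready"
        self._maybe_flush(now)

    def publish(self, message, now):
        if not self.queue:
            self.oldest = now
        self.queue.append(message)
        self._maybe_flush(now)

    def on_tick(self, now):
        self._maybe_flush(now)

    def _maybe_flush(self, now):
        # Before the ready state, messages only accumulate.
        if self.state != "ready" or not self.queue:
            return
        full = len(self.queue) >= self.max_batch
        stale = now - self.oldest >= self.flush_after
        if full or stale:
            batch, self.queue = self.queue, []
            self.send_batch(batch)


sent = []
worker = TopicWorker(sent.append, max_batch=3, flush_after=1.0)
worker.publish("m1", now=0.0)   # queued: still bootstrapping
worker.on_metadata()
worker.on_connected(now=0.1)    # ready, but the queue is neither full nor stale
worker.publish("m2", now=0.2)
worker.publish("m3", now=0.3)   # queue hits its size limit: flushed
worker.publish("m4", now=0.4)
worker.on_tick(now=1.5)         # flush timeout reached: flushed
print(sent)
```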
62. ekaf @Layer (ex-Apple engineers)
"The art of powering the Internet's next messaging system"
https://www.youtube.com/watch?v=mv2MBYU8Yls#t=33m5s
63. ekaf @ a chinese social network
1 pull request about to be merged
and elsewhere
73. instrumenting tips
• PG and multiple bolts tip
• Metrics at every bolt
• Local statsite -> grafana
• Avoid metric explosion
74. [WIP] segment population query
"find users who match level 2, who did not find the easter egg"
[diagram: kafka consumer / storm (#clj-kafka), elasticsearch query representing segment, ES, scheduler, job tracking, S3, counts in PG]
75. fit your use case
population count
moving average
a note on samza's state
76. Kafka is your glue
to have different teams break down complex end-to-end problems into smaller, more manageable ones
like a "checkpoint" in a long hike/game/drive
separation of concerns helps you design distributed systems
78. Small Snapshot of Helpshift
Our SDK is being embedded in a growing list of popular apps:
Hay Day, Boom Beach, Clash of Clans, Deer Hunter, High School Story, Family Guy, Flipboard, Circa, Wordpress, Misfit, Microsoft Outlook
[stack diagram labels: APP + API, DB, MONITORING, OTHER, ROUTING, HAProxy]