#kafka
#storm
@FifthEl '15
I'm @bhaskerkode aka Bosky
Product Engg at Helpshift
you're here at a bigdata & analytics conference to ...
mobile
web
numbers
insights
predict
iot
internal
?
this talk
mobile
web
numbers
insights
predict
iot
internal
your job
your raise
?
this talk really is about sleep
good sleep
Kafka is your glue
to have different teams break down complex end-to-end problems into smaller, more manageable ones
like a "checkpoint" in a long hike/game/drive
separation of concerns helps you design distributed systems
kafka & stream
processing
power combo <insert quote to impress friend>
developed world
wired communication
wireless / mobile
developing world
wireless / mobile
the same may happen
with bigdata & analytics
data ingestion
offline processing
stream processing
data ingestion
stream processing
Linkedin
http://kafka.apache.org/
Twitter
https://storm.apache.org/
mobile
web
HA
iot
internal
biz logic
typical analytics ingestion architecture
db
cache
search
infra
mobile
web
HA
iot
internal
auth logic
task #1
kafka producer
db
cache
search
infra
kafka consumer
consumer: storm, spark, samza
how kafka helps me sleep better
how? let's talk numbers
~300 million requests/day to our events kafka producer, ekaf
served by 22 cores, or 11 servers
running on auto-pilot for 1+ years
little maintenance, only scale servers
zero-downtime support
plenty of metrics
(started with 3 servers replacing 14 JVM producers)
separate auth logic from producer logic
erlang
= peaceful sleep
open source
(scalable)
mobile
web
iot
internal
modular design
kafboy
ekaf
auth logic
HTTP 500
unauthorized
wrong APIs
metrics
HTTP 200
one thing particularly distinguishes kafka
secret sauce? let's take wild guesses
community?
NO let's take more wild guesses at the secret sauce
let's illustrate with some examples
You: "Create topic foo with 3 partitions"
kafka: "OK! each partition is set on a different broker. for each partition, one of the brokers is an elected leader; push your messages to the leader's host and port please. if the leader goes down, another broker is elected."
is it the fault-tolerance you get?
NO let's take more wild guesses at the secret sauce
You: "create topic foo" with 3 partitions
partitions
leaders
election
You: "hey broker1, where should i push to?"
kafka: "please dial 1-800-metadata to any broker. we all maintain this data. you can make a metadata request with the topic(s) you want, and we will return its partitions, and their broker hosts, ports, leader info, etc"
is it the separation of concerns?
NO let's take more wild guesses at the secret sauce
You: "hey broker1, where should i push to?"
metadata on all brokers
gives host & port of leader for partitions + more info
You: "hey partition1 on broker1, for topic1 here are 10 messages."
kafka: "thank you for calling partition1. i will append it to a dir/file called topic1/partition1.log.
Since it's append-only like a commit log, you get ordering within a partition for free!"
is it the producer speed you get?
NO let's take more wild guesses at the secret sauce
You: "hey partition1 on broker1, for topic1 here are 10 messages."
append only commit-log
ordering within partition
make 1 partition for global ordering
You: "hey broker2, where should i consume from?"
kafka: "thank you for connecting to the right broker. you can now read any partition you want, and from any offset. All i care about is which offset to start reading from in a topic/partition.log file"
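To make the append-only log and "read from any offset" ideas concrete, here is a minimal in-memory sketch. It is not how a broker stores data on disk; the class name `PartitionLog` and its methods are made up for illustration.

```python
class PartitionLog:
    """In-memory stand-in for one topic/partition.log file (append-only)."""

    def __init__(self):
        self._messages = []

    def append(self, messages):
        """Append a batch and return the offset given to its first message."""
        base_offset = len(self._messages)
        self._messages.extend(messages)
        return base_offset

    def read(self, offset, max_messages=100):
        """Return (offset, message) pairs starting at `offset`."""
        chunk = self._messages[offset:offset + max_messages]
        return list(enumerate(chunk, start=offset))


log = PartitionLog()
first = log.append([b"m0", b"m1", b"m2"])   # producer batch 1
second = log.append([b"m3", b"m4"])         # producer batch 2
resumed = log.read(3)                       # a consumer picks up at offset 3
print(first, second, resumed)
```

Because messages are only ever appended, the offset doubles as the ordering guarantee within the partition, and the consumer only has to remember one number.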
You: "hey broker1, topic1, partition1 i want all data from message offset ....10 (from ZK) onwards"
kafka: "thank you for calling partition1. i'm going to sendfile the bytes you asked for from kernel-space directly to the socket of a consumer using zero copy, thus reducing context switches & minimizing garbage collection. yes, i am badass."
is it the consumer speed you get?
NO let's take more wild guesses at the secret sauce
You: "hey broker1, topic1, partition1 i want all data from message offset ....10 (from ZK) onwards"
offset bytes
sendfile bytes from offset to socket
kernel space not userspace
You: "for topic1, i want 3 consumers all reading every msg.
for topic2, i want the data split between 3 consumers"
kafka: "3 different pipelines/actions on the same input topic1? Nice! I can see your team is growing.
Thank you for grokking the concept of consumer groups for topic2. Make sure all 3 of your consumers use the same group-id, and i'll take care of the rest!"
is it the consumer parallelism?
NO let's take more wild guesses at the secret sauce
You: "for topic1, i want 3 consumers all reading every msg.
for topic2, i want the data split between 3 consumers"
can broadcast to all consumers
can split b/w a group of consumers
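The broadcast-versus-split behaviour can be simulated in a few lines. This is only a sketch of the group-id idea: the round-robin dealing and the function name `assign_partitions` are assumptions for illustration, not Kafka's actual assignment strategy.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers_by_group):
    """Map (group, consumer) to the partitions that consumer reads.

    Every group sees every partition. Inside a group the partitions are
    dealt out round-robin, so each partition has one reader per group.
    """
    assignment = defaultdict(list)
    for group, consumers in consumers_by_group.items():
        for i, partition in enumerate(partitions):
            consumer = consumers[i % len(consumers)]
            assignment[(group, consumer)].append(partition)
    return dict(assignment)


# topic1: three consumers with three different group ids -> each reads every msg
broadcast = assign_partitions([0, 1, 2], {"g1": ["c1"], "g2": ["c2"], "g3": ["c3"]})

# topic2: three consumers sharing one group id -> the data is split between them
split = assign_partitions([0, 1, 2], {"workers": ["c1", "c2", "c3"]})

print(broadcast)
print(split)
```

The only thing that changes between the two cases is whether the consumers share a group id.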
binary protocol.
but importantly
a documented spec of what goes/comes over-the-wire
the one thing that particularly distinguishes kafka & gives it a stellar status in the ecosystem is its
a documented spec of what goes/comes over-the-wire
the creators of kafka built the brokers & the spec of how to communicate with it (producers & consumers), and let the community speak the protocol
the creators of kafka built the brokers & the spec
a documented spec of what goes/comes over-the-wire
eg: do you know what is sent by a namenode/jobtracker/datanode over the wire?
(PS: where is the spec? come meet me after)
protocols win over APIs/drivers
the creators of kafka focused on the spec
If you knew what data a namenode/jobtracker/datanode actually communicates over the wire, it would open up a new world.
allows clients in different languages to express themselves best
it's just data over TCP sockets.
more freedom
more integrations & faster adoption
1. open a tcp socket to any kafka 0.8+ broker:port
2. send these 29 bytes that ask for metadata for topic "events"
these 29 bytes
0,3,0,0,0,0,0,1,0,7,99,108,105,101,110,116,49,0,0,0,1,0,6,101,118,101,110,116,115
(or in hex)
00030000000000010007636c69656e74310000000100066576656e7473
and you'll get back metadata for the topic "events", always.
what this packet means
0,3,                          [2 bytes] metadata code = 3
0,0,                          [2 bytes] api version = 0
0,0,0,1,                      [4 bytes] int id (for replies)
0,7,                          [2 bytes] client id length = 7
99,108,105,101,110,116,49,    [7 bytes] "client1"
0,0,0,1,                      [4 bytes] number of topics = 1
0,6,                          [2 bytes] topic[0] length = 6
101,118,101,110,116,115       [6 bytes] "events"
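A minimal sketch of how those 29 bytes could be assembled with Python's `struct` module, field by field in the order listed. The function name `metadata_request` is illustrative and not part of any Kafka client. One caveat: the Kafka wire protocol prefixes every request with a 4-byte big-endian size, so a raw socket has to send the framed form; a socket layer that adds a 4-byte length header for you (for example Erlang's `{packet, 4}` option) lets you hand it the 29 bytes directly.

```python
import struct


def metadata_request(correlation_id, client_id, topics):
    """Encode a Metadata request (api key 3, version 0), without the size prefix."""
    client = client_id.encode("ascii")
    # api key (int16), api version (int16), correlation id (int32), client id length (int16)
    body = struct.pack(">hhih", 3, 0, correlation_id, len(client)) + client
    # number of topics (int32), then each topic as int16 length + name
    body += struct.pack(">i", len(topics))
    for topic in topics:
        name = topic.encode("ascii")
        body += struct.pack(">h", len(name)) + name
    return body


packet = metadata_request(1, "client1", ["events"])
framed = struct.pack(">i", len(packet)) + packet  # what goes on a raw TCP socket

print(len(packet))
print(packet.hex())
```

The hex printed here can be compared by eye with the hex dump on the slide.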
<<0,0,0,0,0,0,0,3,0,0,0,3,0,25,118,97,103,114,97,110,116,
45,117,98,117,110,116,117,45,112,114,101,99,105,115,101,
45,54,52,0,0,35,133,0,0,0,1,0,25,118,97,103,114,97,110,
116,45,117,98,117,110,116,117,45,112,114,101,99,105,115,
101,45,54,52,0,0,35,131,0,0,0,2,0,25,118,97,103,114,97,
110,116,45,117,98,117,110,116,117,45,112,114,101,99,105,
115,101,45,54,52,0,0,35,132,0,0,0,3,0,0,0,2,97,49,0,0,0,
2,0,0,0,0,0,0,0,0,0,3,0,0,0,1,0,0,0,3,0,0,0,1,0,0,0,3,0,
0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,
2,97,50,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,
0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,1,
0,0,0,2,0,0,0,2,97,51,0,0,0,2,0,0,0,0,0,0,0,0,0,3,0,0,0,
2,0,0,0,3,0,0,0,1,0,0,0,2,0,0,0,3,0,0,0,1,0,0,0,0,0,1,0,
0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,2,0,0,0,1,0,0,0,2>>
metadata response decoded
% 0,0,0,1, % this is a response to req id 1
% 0,0,0,2, % number of brokers
% 0,0,0,1, % broker[0] id
% then broker host, port

% 0,0,0,3, % topics len
% 0,0, % topic[0] error code
% 0,6, % topic[0] name len
% ............ % topic[0] name events
% 0,0,0,2, % topic[0] partitions len
% 0,0, % topic[0] partition1 error code
% 0,0,0,0, % topic[0] partition1
% 0,0,0,1, % topic[0] partition1 leaderid
% 0,0,0,1, % topic[0] partition1 replicas len
% 0,0,0,3, % topic[0] partition1 replica1
% 0,0,0,1, % topic[0] partition1 isr len
% 0,0,0,3, % topic[0] partition1 isr1
% another partition data here
% etc
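The same walk through the bytes can be written as a small decoder. This is a sketch for the version 0 metadata response layout only (correlation id, brokers, then topics with their partitions); the `Reader` helper and the hand-built `sample` bytes are made up for illustration, with a shortened host name.

```python
import struct


class Reader:
    """Cursor over a bytes object with big-endian helpers."""

    def __init__(self, data):
        self.data, self.pos = data, 0

    def _unpack(self, fmt):
        (value,) = struct.unpack_from(fmt, self.data, self.pos)
        self.pos += struct.calcsize(fmt)
        return value

    def int16(self):
        return self._unpack(">h")

    def int32(self):
        return self._unpack(">i")

    def string(self):
        length = self.int16()
        text = self.data[self.pos:self.pos + length].decode("ascii")
        self.pos += length
        return text

    def int32_array(self):
        count = self.int32()
        return [self.int32() for _ in range(count)]


def decode_metadata_response(data):
    r = Reader(data)
    correlation_id = r.int32()
    brokers = {}
    for _ in range(r.int32()):
        node_id = r.int32()
        host = r.string()
        brokers[node_id] = (host, r.int32())
    topics = {}
    for _ in range(r.int32()):
        r.int16()                      # topic error code, ignored in this sketch
        name = r.string()
        partitions = {}
        for _ in range(r.int32()):
            r.int16()                  # partition error code, ignored here
            partition_id = r.int32()
            leader = r.int32()
            replicas = r.int32_array()
            isr = r.int32_array()
            partitions[partition_id] = {"leader": leader, "replicas": replicas, "isr": isr}
        topics[name] = partitions
    return correlation_id, brokers, topics


sample = bytes([
    0, 0, 0, 1,                  # correlation id 1
    0, 0, 0, 1,                  # one broker
    0, 0, 0, 3, 0, 2, 104, 49,   # node id 3, host "h1"
    0, 0, 35, 133,               # port 9093
    0, 0, 0, 1,                  # one topic
    0, 0, 0, 2, 97, 49,          # error code 0, name "a1"
    0, 0, 0, 1,                  # one partition
    0, 0, 0, 0, 0, 0,            # error code 0, partition id 0
    0, 0, 0, 3,                  # leader is node 3
    0, 0, 0, 1, 0, 0, 0, 3,      # replicas [3]
    0, 0, 0, 1, 0, 0, 0, 3,      # isr [3]
])
correlation_id, brokers, topics = decode_metadata_response(sample)
print(correlation_id, brokers, topics)
```

With the leader id and the broker table in hand, a producer knows which host and port to push to for each partition.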
similarly all other operations
encode the request as a packet
send over tcp socket
sync produce: handle response
async produce
no response
Messages
MessageSet => [Offset MessageSize Message]
  Offset => int64
  MessageSize => int32
Message => Crc MagicByte Attributes Key Value
  Crc => int32
  MagicByte => int8
  Attributes => int8
  Key => bytes
  Value => bytes
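The grammar above maps almost one-to-one onto `struct.pack` calls. The sketch below encodes uncompressed messages in this original (magic byte 0) format; the function names are illustrative. It assumes the `bytes` type is an int32 length followed by the data (with -1 for a null key), that the CRC is a standard CRC-32 over everything after the Crc field, and that a producer may leave the offset as 0 for the broker to assign.

```python
import struct
import zlib


def encode_bytes(value):
    """Kafka 'bytes': int32 length then the data, or length -1 for null."""
    if value is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(value)) + value


def encode_message(value, key=None):
    """Message => Crc MagicByte Attributes Key Value"""
    body = struct.pack(">bb", 0, 0) + encode_bytes(key) + encode_bytes(value)
    crc = zlib.crc32(body) & 0xFFFFFFFF   # CRC-32 of MagicByte..Value
    return struct.pack(">I", crc) + body


def encode_message_set(values):
    """MessageSet => [Offset MessageSize Message], no array length prefix."""
    out = b""
    for value in values:
        message = encode_message(value)
        out += struct.pack(">qi", 0, len(message)) + message
    return out


message = encode_message(b"hi")
message_set = encode_message_set([b"hi", b"yo"])
print(len(message), len(message_set))
```

A middleware that audits traffic can run the same steps in reverse: split on MessageSize, recompute the CRC, and compare.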
the joy of knowing how your data is encoded and sent over tcp socket
keeps things simple, lets you sleep better
easier to debug, test, add middlewares to audit, etc
MQTT encoded messages
crypto encoded
thrift/avro encoded messages
AOL/YAHOO/* packets
brokers focus on the commit logs, etc
and delegate several opinionated areas to the client; the onus is on the client to make smart decisions
compression
queueing
detect downtime
load balancing to partitions
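Since the broker leaves partition choice to the client, a producer needs some policy for it. Here is a sketch of two common ones: round-robin for messages without a key, and a stable hash for messages with a key so that one key always lands on one partition. The class name and the use of CRC-32 as the hash are assumptions for illustration, not what any particular Kafka client does.

```python
import itertools
import zlib


class PartitionPicker:
    """Client-side choice of partition for each outgoing message."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self._cycle = itertools.cycle(self.partitions)

    def pick(self, key=None):
        if key is None:
            # no key: spread the load evenly across partitions
            return next(self._cycle)
        # keyed: the same key always maps to the same partition,
        # which preserves per-key ordering
        return self.partitions[zlib.crc32(key) % len(self.partitions)]


picker = PartitionPicker([0, 1, 2])
unkeyed = [picker.pick() for _ in range(4)]
keyed = [picker.pick(b"user-42") for _ in range(3)]
print(unkeyed, keyed)
```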
clients
keeps things simple
a subset of what clients need to do includes:
1. bootstrap
2. open sockets
3. encode packets
4. decode packets
5. send over tcp
6. route responses
7. handle failures, events
8. state machines
distributed systems need to be built as state machines
don't trust a client that blocks until an operation is complete
distributed systems need to be built as state machines
// send
// wait for response

BAD
even if this thread/process is doing nothing until timeout, the idling is not efficient
distributed systems need to be built as state machines
// listening for socket states
// listening for responses
// send, and continue
// on_recv, route response
// after_timeout, route timeout

GOOD
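The GOOD pattern can be sketched without any sockets at all: `send` only records who is waiting and returns at once, and replies or timeouts are routed later by correlation id. The class `ResponseRouter` and its method names are hypothetical, chosen to mirror the comments on the slide; time is passed in explicitly so the example stays deterministic.

```python
import itertools


class ResponseRouter:
    """Tracks in-flight requests by correlation id instead of blocking on each."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._ids = itertools.count(1)
        self._pending = {}  # correlation id -> (deadline, on_reply, on_timeout)

    def send(self, now, on_reply, on_timeout):
        """Register a request and return immediately with its correlation id."""
        correlation_id = next(self._ids)
        self._pending[correlation_id] = (now + self.timeout, on_reply, on_timeout)
        return correlation_id

    def on_recv(self, correlation_id, payload):
        """Route a response to whoever sent the matching request."""
        entry = self._pending.pop(correlation_id, None)
        if entry is not None:
            entry[1](payload)

    def after_timeout(self, now):
        """Fire the timeout handler of every request whose deadline has passed."""
        expired = [cid for cid, entry in self._pending.items() if entry[0] <= now]
        for cid in expired:
            self._pending.pop(cid)[2]()


events = []
router = ResponseRouter(timeout=5)
a = router.send(0, lambda p: events.append(("reply", p)), lambda: events.append(("timeout", "a")))
b = router.send(0, lambda p: events.append(("reply", p)), lambda: events.append(("timeout", "b")))
router.on_recv(b, "ack-b")    # replies may arrive in any order
router.after_timeout(now=6)   # request a never got an answer
print(events)
```

Nothing in this flow ever waits: the caller is free to keep sending while replies and timeouts trickle in.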
distributed systems need to be built as state machines
set concurrency options for ekaf
set the hostname of a load balancer over your brokers
http://github.com/helpshift/ekaf
distributed systems need to be built as state machines
ekaf hits the ground running with 1 call
// publish(topic, message)

if no state machine exists yet, it flows from
request metadata -> worker pool creation -> socket connecting -> ready state
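A toy version of that flow, as a sketch: the first publish starts the bootstrap, every message published before the ready state is queued, and the queue drains once the state machine gets there. The state names are labels for the steps on the slide, not ekaf's internal state names, and `advance()` stands in for the events (metadata received, workers started, socket connected) that would drive a real client.

```python
class TopicClient:
    """Per-topic state machine: bootstrap on first publish, queue until ready."""

    TRANSITIONS = {
        "init": "requesting_metadata",
        "requesting_metadata": "creating_workers",
        "creating_workers": "connecting",
        "connecting": "ready",
    }

    def __init__(self):
        self.state = "init"
        self.queue = []   # messages waiting for the ready state
        self.sent = []    # stand-in for bytes written to the socket

    def publish(self, message):
        if self.state == "ready":
            self.sent.append(message)
            return
        self.queue.append(message)
        if self.state == "init":
            self.advance()  # the first publish kicks off the bootstrap

    def advance(self):
        """Move to the next state; drain the queue on reaching ready."""
        self.state = self.TRANSITIONS.get(self.state, self.state)
        if self.state == "ready":
            self.sent.extend(self.queue)
            self.queue.clear()


client = TopicClient()
client.publish("a")            # triggers the metadata request, gets queued
client.publish("b")            # still bootstrapping, gets queued
states = [client.state]
for _ in range(3):
    client.advance()
    states.append(client.state)
client.publish("c")            # ready: goes straight out
print(states, client.sent)
```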
distributed systems need to be built as state machines
if topic state machine has metadata, it knows which broker for each partition
if state already has socket, queue it
distributed systems need to be built as state machines
all messages in states before ready are queued.
if queue hits size OR hits flush timeout, send it
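The "size OR flush timeout" rule is a small piece of logic of its own. Below is a sketch with an injected clock so it can be tested without sleeping; `Batcher`, `max_size` and `flush_after` are hypothetical names, not ekaf configuration options.

```python
class Batcher:
    """Buffers messages and flushes on size or on age, whichever comes first."""

    def __init__(self, max_size, flush_after, send):
        self.max_size = max_size
        self.flush_after = flush_after
        self.send = send
        self.buffer = []
        self.first_at = None   # arrival time of the oldest buffered message

    def add(self, message, now):
        if not self.buffer:
            self.first_at = now
        self.buffer.append(message)
        if len(self.buffer) >= self.max_size:
            self.flush()

    def tick(self, now):
        """Call periodically; flushes if the oldest message has waited too long."""
        if self.buffer and now - self.first_at >= self.flush_after:
            self.flush()

    def flush(self):
        batch, self.buffer = self.buffer, []
        self.first_at = None
        self.send(batch)


batches = []
batcher = Batcher(max_size=3, flush_after=1.0, send=batches.append)
batcher.add("a", now=0.0)
batcher.add("b", now=0.1)
batcher.add("c", now=0.2)   # size limit reached -> flush
batcher.add("d", now=0.3)
batcher.tick(now=0.5)       # too early, nothing happens
batcher.tick(now=1.4)       # "d" has waited long enough -> flush
print(batches)
```

The size limit bounds memory and packet size under load, while the timeout bounds latency when traffic is light.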
Go through docs at http://github.com/helpshift/ekaf
tests include broker downtime, adding a broker, etc & a mini kafka broker for tests
http://github.com/helpshift/kafkamocker
Callbacks used for Metrics
flushed queue
worker up
worker down
downtime saved
downtime replayed
time to connect
max downtime q
last year 6k/min
now 6k/sec
only scaled up servers
hello world driven development
response time against endpoint to just echo
ekaf @Layer (ex-Apple engineers)
"The art of powering the Internet's next messaging system"
https://www.youtube.com/watch?v=mv2MBYU8Yls#t=33m5s
ekaf @ a chinese social network
1 pull request about to be merged
and elsewhere
back to pipelines
involving a kafka producer and consumer
active user analytics pipeline @helpshift
SDK
nginx
auth/api #erlang
PG
S3
kafka http producer #kafboy (uses ekaf)
to disk
hyperloglog counts #clojure
kafka consumer #clj-kafka
~1 billion devices
HA
EMR (internal jobs)
(dashboards)
mail delivery @helpshift
actually
sent
#clj-kafka
kafka consumer #clj-kafka
email
[WIP] ES indexing @helpshift
actually
indexed
#clj-kafka
ES bulk index
docs
audit/action trails @helpshift
PG
#clj-kafka
kafka consumer #clj-kafka
old object
new object
diff
emit/ignore rows
few rules
objects are namespaced
must have id
Storm
@helpshift
iTunes
Play
reviews
distributed crawler
#go
#master
#worker farm
#controller
the reviews storm pipeline @helpshift
PG
kafka producer #shopify/sarama
deduplication
tokenization
topic extraction
sentiment analysis
storm kafka spout
example storm topology
read up more on spouts and bolts (any Q's?)
instrumenting tips
• PG and multiple bolts tip
• Metrics at every bolt
• Local statsite -> grafana
• Avoid metric explosion
[WIP] segment population
query
ES
job tracking
#clj-kafka
kafka consumer / storm
elasticsearch query representing segment
scheduler
S3
counts in PG
"find users who match level 2, who did not find the easter egg"
fit your use case
population count
moving average
a note on samza's state
Kafka is your glue
to have different teams break down complex end-to-end problems into smaller, more manageable ones
like a "checkpoint" in a long hike/game/drive
separation of concerns helps you design distributed systems
numbers
your job
good sleep
your product
kafka + storm
Small Snapshot of Helpshift
Hay Day, Boom Beach, Clash of Clans, Deer Hunter, High School Story, Family Guy, Flipboard, Circa, Wordpress, Misfit, Microsoft Outlook
APP + API
DB
MONITORING
OTHER
ROUTING
HAProxy
Our SDK is being embedded in a growing list of popular apps
#kafka
#storm
Thanks!
I'm @bhaskerkode aka Bosky
Product Engg @Helpshift
bosky@helpshift.com
Find this talk at http://bit.ly/fifthel15-kafka-storm

More Related Content

What's hot

HTTP2 in action - Piet Van Dongen - Codemotion Amsterdam 2017
HTTP2 in action - Piet Van Dongen - Codemotion Amsterdam 2017HTTP2 in action - Piet Van Dongen - Codemotion Amsterdam 2017
HTTP2 in action - Piet Van Dongen - Codemotion Amsterdam 2017Codemotion
Ā 
0.5mln packets per second with Erlang
0.5mln packets per second with Erlang0.5mln packets per second with Erlang
0.5mln packets per second with ErlangMaxim Kharchenko
Ā 
Open MPI State of the Union X SC'16 BOF
Open MPI State of the Union X SC'16 BOFOpen MPI State of the Union X SC'16 BOF
Open MPI State of the Union X SC'16 BOFJeff Squyres
Ā 
Bh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackopsBh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackopsDan Kaminsky
Ā 
Concurrent Programming with Ruby and Tuple Spaces
Concurrent Programming with Ruby and Tuple SpacesConcurrent Programming with Ruby and Tuple Spaces
Concurrent Programming with Ruby and Tuple Spacesluccastera
Ā 
VCS for Teamwork - GIT Workshop
VCS for Teamwork - GIT WorkshopVCS for Teamwork - GIT Workshop
VCS for Teamwork - GIT WorkshopAnis Ahmad
Ā 
0.5mln packets per second with Erlang
0.5mln packets per second with Erlang0.5mln packets per second with Erlang
0.5mln packets per second with ErlangMaxim Kharchenko
Ā 
Kafka Summit NYC 2017 - Deep Dive Into Apache Kafka
Kafka Summit NYC 2017 - Deep Dive Into Apache KafkaKafka Summit NYC 2017 - Deep Dive Into Apache Kafka
Kafka Summit NYC 2017 - Deep Dive Into Apache Kafkaconfluent
Ā 
Bh eu 05-kaminsky
Bh eu 05-kaminskyBh eu 05-kaminsky
Bh eu 05-kaminskyDan Kaminsky
Ā 
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...Igalia
Ā 
WTF is Twisted?
WTF is Twisted?WTF is Twisted?
WTF is Twisted?hawkowl
Ā 
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Chris Fregly
Ā 
Intro to Erlang
Intro to ErlangIntro to Erlang
Intro to ErlangKen Pratt
Ā 
Automate_LSF_ppt_final
Automate_LSF_ppt_finalAutomate_LSF_ppt_final
Automate_LSF_ppt_finalSumit Ghosh
Ā 
Packaging perl (LPW2010)
Packaging perl (LPW2010)Packaging perl (LPW2010)
Packaging perl (LPW2010)p3castro
Ā 
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...Ambassador Labs
Ā 
Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey J On The Beach
Ā 
Concurrency in Python
Concurrency in PythonConcurrency in Python
Concurrency in PythonGavin Roy
Ā 
Memory Management In Python The Basics
Memory Management In Python The BasicsMemory Management In Python The Basics
Memory Management In Python The BasicsNina Zakharenko
Ā 
An Introduction to Twisted
An Introduction to TwistedAn Introduction to Twisted
An Introduction to Twistedsdsern
Ā 

What's hot (20)

HTTP2 in action - Piet Van Dongen - Codemotion Amsterdam 2017
HTTP2 in action - Piet Van Dongen - Codemotion Amsterdam 2017HTTP2 in action - Piet Van Dongen - Codemotion Amsterdam 2017
HTTP2 in action - Piet Van Dongen - Codemotion Amsterdam 2017
Ā 
0.5mln packets per second with Erlang
0.5mln packets per second with Erlang0.5mln packets per second with Erlang
0.5mln packets per second with Erlang
Ā 
Open MPI State of the Union X SC'16 BOF
Open MPI State of the Union X SC'16 BOFOpen MPI State of the Union X SC'16 BOF
Open MPI State of the Union X SC'16 BOF
Ā 
Bh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackopsBh us-02-kaminsky-blackops
Bh us-02-kaminsky-blackops
Ā 
Concurrent Programming with Ruby and Tuple Spaces
Concurrent Programming with Ruby and Tuple SpacesConcurrent Programming with Ruby and Tuple Spaces
Concurrent Programming with Ruby and Tuple Spaces
Ā 
VCS for Teamwork - GIT Workshop
VCS for Teamwork - GIT WorkshopVCS for Teamwork - GIT Workshop
VCS for Teamwork - GIT Workshop
Ā 
0.5mln packets per second with Erlang
0.5mln packets per second with Erlang0.5mln packets per second with Erlang
0.5mln packets per second with Erlang
Ā 
Kafka Summit NYC 2017 - Deep Dive Into Apache Kafka
Kafka Summit NYC 2017 - Deep Dive Into Apache KafkaKafka Summit NYC 2017 - Deep Dive Into Apache Kafka
Kafka Summit NYC 2017 - Deep Dive Into Apache Kafka
Ā 
Bh eu 05-kaminsky
Bh eu 05-kaminskyBh eu 05-kaminsky
Bh eu 05-kaminsky
Ā 
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Ā 
WTF is Twisted?
WTF is Twisted?WTF is Twisted?
WTF is Twisted?
Ā 
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Ā 
Intro to Erlang
Intro to ErlangIntro to Erlang
Intro to Erlang
Ā 
Automate_LSF_ppt_final
Automate_LSF_ppt_finalAutomate_LSF_ppt_final
Automate_LSF_ppt_final
Ā 
Packaging perl (LPW2010)
Packaging perl (LPW2010)Packaging perl (LPW2010)
Packaging perl (LPW2010)
Ā 
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
[KubeCon NA 2018] Effective Kubernetes Develop: Turbocharge Your Dev Loop - P...
Ā 
Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey Low latency in java 8 by Peter Lawrey
Low latency in java 8 by Peter Lawrey
Ā 
Concurrency in Python
Concurrency in PythonConcurrency in Python
Concurrency in Python
Ā 
Memory Management In Python The Basics
Memory Management In Python The BasicsMemory Management In Python The Basics
Memory Management In Python The Basics
Ā 
An Introduction to Twisted
An Introduction to TwistedAn Introduction to Twisted
An Introduction to Twisted
Ā 

Viewers also liked

Parsing binaries and protocols with erlang
Parsing binaries and protocols with erlangParsing binaries and protocols with erlang
Parsing binaries and protocols with erlangBhasker Kode
Ā 
in-memory capacity planning, Erlang Factory London 09
in-memory capacity planning, Erlang Factory London 09in-memory capacity planning, Erlang Factory London 09
in-memory capacity planning, Erlang Factory London 09Bhasker Kode
Ā 
end user programming & yahoo pipes
end user programming & yahoo pipesend user programming & yahoo pipes
end user programming & yahoo pipesBhasker Kode
Ā 
Functional Programing
Functional ProgramingFunctional Programing
Functional ProgramingMax Arshinov
Ā 
QCON SP 2016 - Elixir: TolerĆ¢ncia a Falhas para Adultos
Kafka & Storm - FifthElephant 2015 by @bhaskerkode, Helpshift

  • 1. #kafka #storm @FifthEl '15 I'm @bhaskerkode aka Bosky, Product Engg at Helpshift
  • 2. you're here at a bigdata & analytics conference to …
  • 5. Kafka is your glue to have different teams break down complex end-to-end problems into smaller, more manageable ones, like a "checkpoint" in a long hike/game/drive. separation of concerns helps you design distributed systems
  • 6. kafka & stream processing power combo <insert quote to impress friend>
  • 8.
  • 9. the same may happen with bigdata & analytics: data ingestion, offline processing, stream processing; data ingestion, stream processing
  • 12. typical analytics ingestion architecture: mobile, web, HA, iot, internal, biz logic, db, cache, search, infra
  • 14. how kafka helps me sleep better. how? let's talk numbers
  • 15. ~300 million requests/day to our events kafka producer, ekaf; served by 22 cores or 11 servers; running on auto-pilot for 1+ years; little maintenance, only scale servers; zero-downtime support; plenty of metrics (started with 3 servers replacing 14 JVM producers); separate auth logic from producer logic; erlang = peaceful sleep; open source (scalable)
  • 16. mo web iot inte modular design kafboy ekaf auth logic HTTP 500 unauthorized wrong api's metrics HTTP 200
  • 17. modular design kafboy ekaf auth logic HTTP 500 unauthorized wrong api's metrics HTTP 200
  • 18. one thing particularly distinguishes kafka. secret sauce? let's take wild guesses
  • 19. community? NO. let's take more wild guesses at the secret sauce
  • 20. let's illustrate with some examples
  • 21. You: "Create topic foo with 3 partitions"
  • 22. kafka: "OK! each partition is set on a different broker. one of the brokers is an elected leader; push your messages to the leader host, port please. if leader goes down, other broker is elected."
  • 23. is it the fault-tolerance you get? NO. let's take more wild guesses at the secret sauce. You: "create topic foo with 3 partitions": partitions, leaders, election
  • 24. You: "hey broker1, where should i push to?"
  • 25. kafka: please dial 1-800-metadata to any broker. we all maintain this data. you can make a metadata request with the topic(s) you want, and we will return its partitions, and their broker hosts, ports, leader info, etc
  • 26. is it the separation of concerns? NO. let's take more wild guesses at the secret sauce. You: "hey broker1, where should i push to?": metadata on all brokers gives host & port of leader for partitions + more info
  • 27. You: "hey partition1 on broker1, for topic1 here are 10 messages."
  • 28. kafka: "thank you for calling partition1. i will append it to a dir/file called topic1/partition1.log. Since it's append-only like a commit log, you get ordering within partition free!"
  • 29. is it the producer speed you get? NO. let's take more wild guesses at the secret sauce. You: "hey partition1 on broker1, for topic1 here are 10 messages.": append-only commit log; ordering within partition; make 1 partition for global ordering
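
The append-only behaviour from slides 28 and 29 can be pictured with a toy in-memory log. This is only a sketch under stated assumptions: `PartitionLog`, `append` and `read_from` are hypothetical names, and a real broker appends to a file on disk rather than to a Python list. It shows why ordering within one partition comes for free: an offset is simply a position in the order of arrival.

```python
class PartitionLog:
    """Toy append-only log for one partition (hypothetical, in-memory)."""

    def __init__(self):
        self._messages = []

    def append(self, batch):
        """Append a batch and return the offset given to its first message."""
        first_offset = len(self._messages)
        self._messages.extend(batch)
        return first_offset

    def read_from(self, offset):
        """Return (offset, message) pairs from `offset` onwards, in append order."""
        return list(enumerate(self._messages[offset:], start=offset))


log = PartitionLog()
log.append([b"m0", b"m1", b"m2"])
log.append([b"m3"])
```

Reading from offset 2 returns the third and fourth messages in the order they were appended; nothing is ever rewritten in place.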
  • 30. You: "hey broker2, where should i consume from?"
  • 31. kafka: "thank you for connecting to the right broker. you can now read any partition you want, and from any offset. All i care about is which offset to start reading from in a topic/partition.log file"
  • 32. You: "hey broker1, topic1, partition1 i want all data from message offset ….10 (from ZK) onwards"
  • 33. kafka: "thank you for calling partition1. i'm going to sendfile the bytes you asked for from kernel-space directly to the socket of a consumer using zero copy, thus reducing context switches & minimizing garbage collection. yes, i am badass."
  • 34. is it the consumer speed you get? NO. let's take more wild guesses at the secret sauce. You: "hey broker1, topic1, partition1 i want all data from message offset ….10 (from ZK) onwards": offset bytes; sendfile bytes from offset to socket; kernel space not userspace
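
The "sendfile bytes from offset to socket" idea can be tried out locally. The sketch below is an illustration, not broker code: `send_from_offset` is a hypothetical helper, a temporary file stands in for the partition log, and a local socket pair stands in for the consumer connection. Python's `socket.sendfile` uses the operating system's `sendfile` call where one is available and falls back to ordinary sends elsewhere. The offset here is a byte position in the file; the assumption is that a broker first maps a message offset to such a position.

```python
import os
import socket
import tempfile


def send_from_offset(sock, path, byte_offset):
    """Send everything in `path` from `byte_offset` onwards; return bytes sent."""
    with open(path, "rb") as log_file:
        return sock.sendfile(log_file, offset=byte_offset)


# A throwaway "partition log" and a local socket pair stand in for the
# broker's file and the consumer's connection.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"0123456789abcdef")

broker_side, consumer_side = socket.socketpair()
sent = send_from_offset(broker_side, demo_path, 10)
broker_side.close()

received = b""
while True:
    chunk = consumer_side.recv(4096)
    if not chunk:
        break
    received += chunk
consumer_side.close()
os.remove(demo_path)
```

The file contents never pass through a Python-level buffer on the sending side, which is the point the slide makes about staying out of userspace.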
  • 35. You: "for topic1, i want 3 consumers all reading every msg. for topic2, i want the data split between 3 consumers"
  • 36. kafka: "3 different pipelines/actions on the same input topic1? Nice! I can see your team is growing. Thank you for grokking the concept of consumer groups for topic2. Make sure all 3 of your consumers use the same group-id, and i'll take care of the rest!"
  • 37. is it the consumer parallelism? NO. let's take more wild guesses at the secret sauce. You: "for topic1, i want 3 consumers all reading every msg. for topic2, i want the data split between 3 consumers": can broadcast to all consumers; can split b/w a group of consumers
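
The broadcast-versus-split behaviour of group ids can be modelled in a few lines. This is a toy model: `assign_partitions` and `plan` are hypothetical names, and the round-robin split is an assumption made for illustration rather than Kafka's actual assignment logic. What it does capture is the rule from slide 36: consumers sharing a group-id divide the partitions, while consumers in different groups each see all of them.

```python
def assign_partitions(partitions, consumers):
    """Split partitions across the consumers of one group, round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def plan(partitions, group_of):
    """Map every consumer to its partitions, given consumer -> group-id."""
    groups = {}
    for consumer, group_id in group_of.items():
        groups.setdefault(group_id, []).append(consumer)
    result = {}
    for members in groups.values():
        result.update(assign_partitions(partitions, members))
    return result


# topic1: three consumers in three different groups, so each reads everything.
broadcast = plan([0, 1, 2], {"a": "g1", "b": "g2", "c": "g3"})
# topic2: three consumers sharing one group-id, so the partitions are split.
split = plan([0, 1, 2], {"a": "g", "b": "g", "c": "g"})
```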
  • 38. the one thing that particularly distinguishes kafka & gives it a stellar status in the ecosystem is its binary protocol. but importantly, a documented spec of what goes/comes over-the-wire
  • 39. the creators of kafka built the brokers & the spec of how to communicate with it (producers & consumers), and let the community speak the protocol: a documented spec of what goes/comes over-the-wire
  • 40. the creators of kafka built the brokers & the spec: a documented spec of what goes/comes over-the-wire. eg: do you know what is sent by a namenode/jobtracker/datanode over the wire? (PS: where is the spec? come meet me after)
  • 41. protocols win over api's/drivers. the creators of kafka focussed on the spec. If you knew what data a namenode/jobtracker/datanode actually communicates over the wire, it opens up a new world. allows different language clients to express themselves best. it's just data over TCP sockets. more freedom. more integrations & faster adoption
  • 42. 1. open a tcp socket to any kafka 0.8+ broker:port 2. send these 29 bytes that ask for metadata for topic "events": 0,3,0,0,0,0,0,1,0,7,99,108,105,101,110,116,49,0,0,0,1,0,6,101,118,101,110,116,115 (or in hex: 00030000000000010007636c69656e74310000000100066576656e7473) and you'll get back metadata for the topic "events". always.
  • 43. what this packet means: 0,3 = [2 bytes] metadata code = 3; 0,0 = [2 bytes] api version = 0; 0,0,0,1 = [4 bytes] int id (for replies); 0,7 = [2 bytes] client id length = 7; 99,108,105,101,110,116,49 = [7 bytes] "client1"; 0,0,0,1 = [4 bytes] no. of topics = 1; 0,6 = [2 bytes] topic[0] length = 6; 101,118,101,110,116,115 = [6 bytes] "events"
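
The field-by-field breakdown on slide 43 can be reproduced with Python's `struct` module. A minimal sketch: `encode_string`, `metadata_request` and `frame` are hypothetical helper names. The Kafka protocol guide also prefixes every request with a 4-byte size, which the 29 bytes on the slide do not include; `frame` adds it, on the assumption that the sender's socket layer is not already doing so.

```python
import struct


def encode_string(text):
    """Kafka-style string: 2-byte big-endian length, then the bytes."""
    raw = text.encode("utf-8")
    return struct.pack(">h", len(raw)) + raw


def metadata_request(topics, correlation_id=1, client_id="client1"):
    """Build a version 0 metadata request, field by field as on the slide."""
    # api key = 3 (metadata), api version = 0, then the id echoed in the reply
    header = struct.pack(">hhi", 3, 0, correlation_id) + encode_string(client_id)
    body = struct.pack(">i", len(topics))
    body += b"".join(encode_string(topic) for topic in topics)
    return header + body


def frame(payload):
    """Prefix a request with its 4-byte size for sending over the socket."""
    return struct.pack(">i", len(payload)) + payload


packet = metadata_request(["events"])
```

`list(packet)` gives the 29 decimal values from the slide, and `packet.hex()` gives the hex form.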
  • 44. metadata response decoded: <<0,0,0,0,0,0,0,3,0,0,0,3,0,25,118,97,103,114,97,110,116, 45,117,98,117,110,116,117,45,112,114,101,99,105,115,101, 45,54,52,0,0,35,133,0,0,0,1,0,25,118,97,103,114,97,110, 116,45,117,98,117,110,116,117,45,112,114,101,99,105,115, 101,45,54,52,0,0,35,131,0,0,0,2,0,25,118,97,103,114,97, 110,116,45,117,98,117,110,116,117,45,112,114,101,99,105, 115,101,45,54,52,0,0,35,132,0,0,0,3,0,0,0,2,97,49,0,0,0, 2,0,0,0,0,0,0,0,0,0,3,0,0,0,1,0,0,0,3,0,0,0,1,0,0,0,3,0, 0,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0,1,0,0,0, 2,97,50,0,0,0,2,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,0, 0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,1, 0,0,0,2,0,0,0,2,97,51,0,0,0,2,0,0,0,0,0,0,0,0,0,3,0,0,0, 2,0,0,0,3,0,0,0,1,0,0,0,2,0,0,0,3,0,0,0,1,0,0,0,0,0,1,0, 0,0,1,0,0,0,2,0,0,0,1,0,0,0,2,0,0,0,2,0,0,0,1,0,0,0,2>> 0,0,0,1 = this is a response to req id 1; 0,0,0,2 = number of brokers; 0,0,0,1 = broker[0] id; then broker name, host, port; 0,0,0,3 = topics len; 0,0,0,6 = topic[0] name len; ………………. = topic[0] name "events"; 0,0,0,2 = topic[0] partitions len; 0,0 = topic[0] partition1 error code; 0,0,0,0 = topic[0] partition1; 0,0,0,1 = topic[0] partition1 leader id; 0,0,0,1 = topic[0] partition1 replicas len; 0,0,0,3 = topic[0] partition1 replica1; 0,0,0,1 = topic[0] partition1 isr len; 0,0,0,3 = topic[0] partition1 isr1; another partition's data here; etc
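
Decoding a reply is the request encoding run in reverse. The sketch below assumes the version 0 metadata response layout from the Kafka protocol guide: a 4-byte correlation id, an array of brokers (id, host, port), then an array of topics, each with a 2-byte error code, a name with a 2-byte length, and an array of partitions. `Reader` and `decode_metadata_response` are hypothetical names. The sample bytes reuse the host name and port of the first broker in the slide's dump, cut down to one broker, one topic and one partition.

```python
import struct


class Reader:
    """Cursor over a big-endian byte buffer."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def _unpack(self, fmt):
        (value,) = struct.unpack_from(fmt, self.data, self.pos)
        self.pos += struct.calcsize(fmt)
        return value

    def int16(self):
        return self._unpack(">h")

    def int32(self):
        return self._unpack(">i")

    def string(self):
        length = self.int16()
        raw = self.data[self.pos:self.pos + length]
        self.pos += length
        return raw.decode("utf-8")

    def array(self, read_item):
        """A 4-byte count followed by that many items."""
        return [read_item() for _ in range(self.int32())]


def decode_metadata_response(data):
    r = Reader(data)
    correlation_id = r.int32()
    brokers = r.array(lambda: {"id": r.int32(), "host": r.string(), "port": r.int32()})
    topics = r.array(lambda: {
        "error": r.int16(),
        "name": r.string(),
        "partitions": r.array(lambda: {
            "error": r.int16(),
            "id": r.int32(),
            "leader": r.int32(),
            "replicas": r.array(r.int32),
            "isr": r.array(r.int32),
        }),
    })
    return {"correlation_id": correlation_id, "brokers": brokers, "topics": topics}


sample = (
    bytes([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 3, 0, 25])
    + b"vagrant-ubuntu-precise-64"
    + bytes([0, 0, 35, 133])                 # port
    + bytes([0, 0, 0, 1, 0, 0, 0, 2, 97, 49])  # one topic, no error, name "a1"
    + bytes([0, 0, 0, 1, 0, 0, 0, 0, 0, 0])  # one partition, no error, id 0
    + bytes([0, 0, 0, 3])                    # leader
    + bytes([0, 0, 0, 1, 0, 0, 0, 3])        # replicas
    + bytes([0, 0, 0, 1, 0, 0, 0, 3])        # isr
)
decoded = decode_metadata_response(sample)
```

A producer would use the `leader` of each partition, looked up in `brokers`, to decide which host and port to push to.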
  • 45. similarly, all other operations encode the request as a packet and send it over a tcp socket. sync produce: packet sent, response packet handled. async produce: packet sent, no response.
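To make "encode the request as a packet" concrete, here is a small Python sketch that frames a metadata request. It assumes the Kafka 0.8 request header (api key, api version, correlation id, client id) behind a 4-byte size prefix; the function names and the client id "ekaf" are illustrative choices, not taken from the ekaf source.

```python
import struct

METADATA_API_KEY = 3  # api key of the metadata request in the Kafka protocol


def encode_string(text):
    """Kafka string: int16 length followed by the UTF-8 bytes."""
    raw = text.encode("utf-8")
    return struct.pack(">h", len(raw)) + raw


def encode_metadata_request(correlation_id, client_id, topics):
    # Header: api key, api version (0), correlation id, client id.
    header = struct.pack(">hhi", METADATA_API_KEY, 0, correlation_id) + encode_string(client_id)
    # Body: array of topic names, prefixed with an int32 count.
    body = struct.pack(">i", len(topics)) + b"".join(encode_string(t) for t in topics)
    payload = header + body
    # Every request on the socket is prefixed with its size as an int32.
    return struct.pack(">i", len(payload)) + payload


packet = encode_metadata_request(1, "ekaf", ["events"])
print(list(packet))
```

The resulting bytes can be written to a TCP socket as they are. A sync caller then reads the 4-byte size and the response body and matches it by correlation id; a produce request sent with required acks set to 0 gets no response from the broker at all, which is the "async produce, no response" case.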
  • 46. Messages: MessageSet => [Offset MessageSize Message]; Offset => int64; MessageSize => int32; Message => Crc MagicByte Attributes Key Value; Crc => int32; MagicByte => int8; Attributes => int8; Key => bytes; Value => bytes
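The grammar above can be sketched in a few lines of Python. This assumes the Kafka 0.8 conventions that `bytes` fields carry an int32 length (with -1 for null) and that the Crc is a CRC32 over everything after the Crc field; the function names are hypothetical.

```python
import struct
import zlib


def encode_bytes(data):
    """Kafka bytes: int32 length followed by the data, or -1 for null."""
    if data is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(data)) + data


def encode_message(value, key=None, magic=0, attributes=0):
    # Message => Crc MagicByte Attributes Key Value
    body = struct.pack(">bb", magic, attributes) + encode_bytes(key) + encode_bytes(value)
    crc = zlib.crc32(body) & 0xFFFFFFFF
    # Packed as 4 unsigned bytes; the bits are the same as the signed int32 in the grammar.
    return struct.pack(">I", crc) + body


def encode_message_set(messages, offset=0):
    # MessageSet => [Offset MessageSize Message]
    # Assumption: a producer may send any offset, since the broker assigns the real one.
    return b"".join(struct.pack(">qi", offset, len(m)) + m for m in messages)


message = encode_message(b"hi")
message_set = encode_message_set([message])
print(len(message), len(message_set))
```

Knowing the layout byte for byte is what makes it easy to write a test broker or an auditing middleware that inspects packets in flight.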
  • 47. the joy of knowing how your data is encoded and sent over a tcp socket: keeps things simple, lets you sleep better; easier to debug, test, add middlewares to audit, etc. examples shown as raw packets: MQTT encoded messages; crypto encoded; thrift/avro encoded messages; AOL/YAHOO/* packets
  • 48. brokers focus on the commit logs, etc. and delegate several opinionated areas to the client; the onus is on the client to make smart decisions: compression, queueing, detecting downtime, load balancing to partitions
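As one example of a decision left to the client, load balancing to partitions can be as small as the Python sketch below. The strategy shown (round robin for unkeyed messages, a CRC32 hash for keyed ones) is an assumption chosen for illustration; the slides do not say which strategies ekaf implements, and `make_partitioner` is a hypothetical name.

```python
import itertools
import zlib


def make_partitioner(partition_count):
    """Return a function that picks a partition for each message."""
    counter = itertools.count()

    def pick(key=None):
        if key is None:
            # No key: spread messages evenly across partitions.
            return next(counter) % partition_count
        # Keyed: the same key always lands on the same partition.
        return zlib.crc32(key) % partition_count

    return pick


pick = make_partitioner(3)
unkeyed = [pick() for _ in range(4)]
print(unkeyed, pick(b"user-1"))
```

Keyed routing preserves per-key ordering within a partition, while round robin gives the most even load.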
  • 49. clients keep things simple; a subset of what clients need to do includes: 1. bootstrap 2. open sockets 3. encode packets 4. decode packets 5. send over tcp 6. route responses 7. handle failures, events 8. state machines
  • 50. distributed systems need to be built as state machines: don't trust a client that blocks until an operation is complete
  • 51. distributed systems need to be built as state machines. BAD: // send // wait for response. even if this thread/process does nothing else until the timeout, the idling is not efficient
  • 52. distributed systems need to be built as state machines. GOOD: // listening for socket states // listening for responses // send, and continue // on_recv, route response // after_timeout, route timeout
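The "send, and continue" pattern can be sketched without any real sockets. In this Python illustration the caller never blocks: each request is remembered under a correlation id, and a later receive event or timer tick routes the outcome to a callback. `ResponseRouter` and its method names are hypothetical; ekaf itself is built on Erlang state machines rather than on code like this.

```python
class ResponseRouter:
    """Tracks in-flight requests and routes responses or timeouts to callbacks."""

    def __init__(self):
        self.pending = {}  # correlation id -> (deadline, callback)
        self.next_id = 0

    def send(self, write, payload, callback, now, timeout=5.0):
        # send, and continue: register the request and return immediately
        correlation_id = self.next_id
        self.next_id += 1
        self.pending[correlation_id] = (now + timeout, callback)
        write(correlation_id, payload)
        return correlation_id

    def on_recv(self, correlation_id, response):
        # on_recv, route response (late or unknown responses are dropped)
        entry = self.pending.pop(correlation_id, None)
        if entry is not None:
            entry[1]("ok", response)

    def on_tick(self, now):
        # after_timeout, route timeout
        expired = [cid for cid, (deadline, _) in self.pending.items() if deadline <= now]
        for cid in expired:
            _, callback = self.pending.pop(cid)
            callback("timeout", None)


events = []
wire = []
router = ResponseRouter()
first = router.send(lambda cid, p: wire.append((cid, p)), b"m1",
                    lambda status, resp: events.append(("first", status, resp)), now=0.0)
second = router.send(lambda cid, p: wire.append((cid, p)), b"m2",
                     lambda status, resp: events.append(("second", status, resp)), now=1.0)
router.on_recv(first, b"ack")   # a response arrives for the first request
router.on_tick(now=6.5)         # the second request has passed its deadline
print(events)
```

Passing `now` in explicitly keeps the sketch deterministic and easy to test; a real client would drive `on_tick` from a timer.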
  • 53. distributed systems need to be built as state machines: set concurrency options for ekaf; set the hostname of a load balancer over your brokers http://github.com/helpshift/ekaf
  • 54. distributed systems need to be built as state machines: ekaf hits the ground running with 1 call // publish(topic, message). if no state machine exists for the topic yet, it flows from request metadata -> worker pool creation -> socket connecting -> ready state http://github.com/helpshift/ekaf
  • 55. distributed systems need to be built as state machines: if the topic state machine has metadata, it knows which broker to use for each partition; if the state already has a socket, queue the message http://github.com/helpshift/ekaf
  • 56. distributed systems need to be built as state machines: all messages in states before ready are queued. if the queue hits its size limit OR hits the flush timeout, send it http://github.com/helpshift/ekaf
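The flow described on these slides (queue while not ready, then flush on size or timeout) can be sketched as a small Python state machine. The state names follow the slide text; `TopicWorker`, `max_buffer_size` and `flush_timeout` are hypothetical names and defaults, not ekaf's actual options.

```python
class TopicWorker:
    """Queues messages until ready, then flushes on size or timeout."""

    STATES = ["requesting_metadata", "creating_workers", "connecting", "ready"]

    def __init__(self, send_batch, max_buffer_size=100, flush_timeout=1.0):
        self.send_batch = send_batch
        self.max_buffer_size = max_buffer_size
        self.flush_timeout = flush_timeout
        self.state = self.STATES[0]
        self.buffer = []
        self.buffered_since = None

    def advance(self):
        # Move to the next state, e.g. when metadata arrives or a socket connects.
        index = self.STATES.index(self.state)
        if index < len(self.STATES) - 1:
            self.state = self.STATES[index + 1]

    def publish(self, message, now):
        if not self.buffer:
            self.buffered_since = now
        self.buffer.append(message)
        self._maybe_flush(now)

    def tick(self, now):
        self._maybe_flush(now)

    def _maybe_flush(self, now):
        # Messages in any state before ready stay queued.
        if self.state != "ready" or not self.buffer:
            return
        full = len(self.buffer) >= self.max_buffer_size
        stale = now - self.buffered_since >= self.flush_timeout
        if full or stale:
            self.send_batch(self.buffer)
            self.buffer = []


sent = []
worker = TopicWorker(sent.append, max_buffer_size=2, flush_timeout=1.0)
for i, msg in enumerate([b"a", b"b", b"c"]):
    worker.publish(msg, now=i * 0.1)       # not ready yet: everything is queued
queued_before_ready = list(worker.buffer)
for _ in range(3):
    worker.advance()                        # metadata -> workers -> connecting -> ready
worker.tick(now=0.3)                        # size limit already exceeded: flush
worker.publish(b"d", now=0.4)               # below size limit, not yet stale
worker.tick(now=1.5)                        # flush timeout reached: flush
print(sent)
```

Batching this way trades a bounded delay (the flush timeout) for fewer, larger writes to the broker.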
  • 58. tests include broker downtime, adding a broker, etc. & a mini kafka broker for tests http://github.com/helpshift/kafkamocker
  • 59. Callbacks used for Metrics: flushed queue, worker up, worker down, downtime saved, downtime replayed, time to connect, max downtime q
  • 60. last year 6k/min, now 6k/sec; only scaled up servers
  • 61. hello world driven development: response time against an endpoint that just echoes
  • 62. ekaf @Layer (ex-Apple engineers) "The art of powering the Internet's next messaging system" https://www.youtube.com/watch?v=mv2MBYU8Yls#t=33m5s
  • 63. ekaf @ a Chinese social network: 1 pull request about to be merged; and elsewhere
  • 64. back to pipelines involving a kafka producer and consumer
  • 65. active user analytics pipeline @helpshift (diagram labels): SDK, ~1 billion devices; HA; nginx; auth/api #erlang; kafka http producer #kafboy (uses ekaf); kafka consumer #clj-kafka; to disk, hyperloglog counts #clojure; PG; S3; EMR (internal jobs) (dashboards)
  • 67. [WIP] ES indexing @helpshift actually indexed #clj-kafka ES bulk index docs
  • 68. audit/action trails @helpshift (diagram labels): PG #clj-kafka; kafka consumer #clj-kafka; old object, new object, diff; emit/ignore rows. few rules: objects are namespaced, must have id
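A rough Python sketch of the "old object, new object, diff, emit/ignore" idea. The slide gives only these labels, so everything here is an assumption made for illustration: objects are treated as flat dictionaries, and `diff_objects` and `audit_row` are hypothetical names (the talk's consumers are written in Clojure with clj-kafka).

```python
def diff_objects(old, new):
    """Return {field: (before, after)} for every field that changed."""
    changes = {}
    for field in sorted(set(old) | set(new)):
        before, after = old.get(field), new.get(field)
        if before != after:
            changes[field] = (before, after)
    return changes


def audit_row(namespace, old, new):
    # Rule from the slide: audited objects must have an id.
    if "id" not in new:
        raise ValueError("audited objects must have an id")
    changes = diff_objects(old, new)
    if not changes:
        return None  # ignore: nothing changed, so no row is emitted
    return {"namespace": namespace, "id": new["id"], "changes": changes}


old_issue = {"id": 7, "status": "open", "assignee": "amy"}
new_issue = {"id": 7, "status": "resolved", "assignee": "amy"}
emitted = audit_row("issue", old_issue, new_issue)
ignored = audit_row("issue", new_issue, new_issue)
print(emitted, ignored)
```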
  • 71. the reviews storm pipeline @helpshift (diagram labels): iTunes, Play reviews; distributed crawler #go (#master, #worker farm, #controller); kafka producer #shopify/sarama; storm kafka spout; deduplication, tokenization, topic extraction, sentiment analysis; PG
  • 72. example storm topology: read up more on spouts and bolts (any Q's?)
  • 73. instrumenting tips: PG and multiple bolts tip; metrics at every bolt; local statsite -> grafana; avoid metric explosion
  • 74. [WIP] segment population (diagram labels): query ES; job tracking; #clj-kafka kafka consumer / storm; elasticsearch query representing segment; scheduler; S3; counts in PG. example segment: "find users who match level 2, who did not find the easter egg"
  • 75. fit your use case: population count, moving average; a note on samza's state
  • 76. Kafka is your glue to have different teams break down complex end-to-end problems into smaller, more manageable ones, like a "checkpoint" in a long hike/game/drive. separation of concerns helps you design distributed systems
  • 77. numbers your job good sleep your product kafka + storm
  • 78. Small Snapshot of Helpshift: Hay Day, Boom Beach, Clash of Clans, Deer Hunter, High School Story, Family Guy, Flipboard, Circa, Wordpress, Misfit, Microsoft Outlook. APP + API, DB, MONITORING, OTHER, ROUTING: HAProxy. Our SDK is being embedded in a growing list of popular apps
  • 79. #kafka #storm Thanks! I'm @bhaskerkode aka Bosky, Product Engg @Helpshift bosky@helpshift.com Find this talk at http://bit.ly/fifthel15-kafka-storm