1. © 2016 24/7 CUSTOMER, INC.
BIG DATA BANGALORE JAN MEETUP - 24/7 CUSTOMER, INC.
Recipes for building resilient cross-DC data pipeline with Kafka
Suneet Grover
Sr. Engineering Manager - Big Data Platform
3. Today’s engagement is not driving successful moments
[Diagram labels: Q&A, IVR]
4. Smart Customer Engagement
• Data-Driven: Reflecting All Available Data
• Predictive: Real-time Decisions
• Omni-channel: Across Digital & Voice
• Personalized: User Experience
Video of [24]7 in action available at http://player.vimeo.com/video/85280070
5. Move from channel-centric engagement to intent-driven engagement
Channel-centric engagement
• Reacting to consumer behavior
• Disconnected, fragmented channels
• Too many failed experiences
Intent-driven engagement
• Anticipate consumer intent
• Holistic experience across channels
• Delivering the right moments
6. [24]7 by the numbers
• 1.2b smart speech calls/year
• 127m virtual agent inquiries/year
• 30m agent chats/year
• 341m web visitors/month
• 5000+ digital chat agents (#1 WW)
• 70+ data scientists (most in industry)
• 100+ patents
• 300+ software engineers & designers
7. Agenda
• Introduction to Kafka
• Kafka at [24]7
• From problems to solutions
• Transparency and Resiliency
• Metrics Demo
• Design for multiple data centers
9. Apache Kafka
• Distributed
• High performance and throughput
• Streaming platform, pub/sub system
12. Kafka setup across DCs
[Diagram: Region 1 and Region 2, each with Brokers, Mirrormakers and Zookeepers]
14. [Diagram labels: Intent Prediction, Data Analytics, Business Intelligence]
15. From problems to solutions
16. Challenges with Kafka 0.8.0
• Broker partition stickiness does not allow the cluster to scale
• ZK load and latencies keep increasing
• Range-based mirror-maker algorithm is not optimal
• Stale topics cannot be deleted
• Controller can get into a stuck state
• Conflict errors in mirror-makers
• Socket leaks leading to open file descriptors
17. Learnings from Kafka 0.8.0
• If the controller gets into a stuck state, delete the “/controller” node from zookeeper
• Always do a clean shutdown and restart of brokers
• Some issues are not always visible as errors or warnings
• Run ZK on SSD
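The "/controller" recovery step above could be scripted roughly as follows. This is a minimal sketch, not the team's actual tooling: `clear_stuck_controller` is a hypothetical helper, `zk` is assumed to be any client exposing `exists`, `get` and `delete` (the shape of kazoo's `KazooClient`), and the znode payload is assumed to be JSON carrying a `brokerid` field.

```python
import json

CONTROLLER_PATH = "/controller"


def clear_stuck_controller(zk, path=CONTROLLER_PATH):
    """Delete the controller znode so the brokers elect a new controller.

    Returns the broker id that held the controller role, or None if the
    znode was already absent.
    """
    if zk.exists(path) is None:
        return None  # nothing to delete; an election may already be running
    data, _stat = zk.get(path)
    # Assumption: the payload is JSON with a "brokerid" field.
    broker_id = json.loads(data.decode("utf-8")).get("brokerid")
    zk.delete(path)
    return broker_id
```

Recording the returned broker id before the delete makes it easier to tell afterwards whether the controller role actually moved.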
18. Kafka 0.10
• Very stable release
• Easy to do an in-place upgrade from 0.8.2 onwards
• Better client APIs
• Richer admin operations
19. Broker configurations that worked for us
• default.replication.factor = 3
• num.partitions = 2
• delete.topic.enable = true
• auto.leader.rebalance.enable = true
• controlled.shutdown.enable = true
• queued.max.requests = 1000
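One way to keep brokers aligned with the settings above is to compare each broker's properties file against them. The sketch below is illustrative only: `parse_properties` and `config_drift` are hypothetical helpers, and the file is assumed to use plain `key=value` lines with `#` comments.

```python
RECOMMENDED = {
    "default.replication.factor": "3",
    "num.partitions": "2",
    "delete.topic.enable": "true",
    "auto.leader.rebalance.enable": "true",
    "controlled.shutdown.enable": "true",
    "queued.max.requests": "1000",
}


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def config_drift(text, recommended=RECOMMENDED):
    """Return {key: (actual, wanted)} for settings that differ or are missing."""
    actual = parse_properties(text)
    return {
        key: (actual.get(key), wanted)
        for key, wanted in recommended.items()
        if actual.get(key) != wanted
    }
```

An empty result means the file already carries all six values; a `None` in the first slot of a tuple means the key is missing and the broker default applies.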
20. Transparency and Resiliency
21. Metrics flow
[Diagram components: Grafana, Graphite, Kafka Broker, Metrics Reporter, Kafka MM, JMXTrans, Zookeeper, Host level, Metrics & Alerts, Lag monitor, ELK]
22. Essential Broker Metrics
• Disk, CPU and throughput utilization
• Ingress, egress volume per broker and topic
• Active controller count
• Offline partitions
• Under replicated partitions
• Partitions per broker
• Log flush rate
23. Basic Alerts
• Disk, CPU utilization
• Open file handles
• Controller count
• Controller re-elections
• Under replicated partitions
• Offline partitions
• Stuck pending commands in zookeeper
• Conflicts in mirror-makers
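A few of the alerts listed above reduce to simple checks on a metrics snapshot. The following is a rough sketch under stated assumptions: the snapshot shape, the metric key names and the 80% disk limit are all made up for illustration, and the real alerts would presumably be defined in the monitoring stack rather than in a script.

```python
def evaluate_alerts(brokers, disk_pct_limit=80.0):
    """Return alert strings for a snapshot shaped {broker: {metric: value}}."""
    alerts = []
    # Exactly one broker in the cluster should report itself as controller.
    controllers = sum(m.get("active_controller_count", 0) for m in brokers.values())
    if controllers != 1:
        alerts.append(f"cluster: {controllers} active controllers (expected 1)")
    for name in sorted(brokers):
        m = brokers[name]
        if m.get("offline_partitions", 0) > 0:
            alerts.append(f"{name}: {m['offline_partitions']} offline partitions")
        if m.get("under_replicated_partitions", 0) > 0:
            alerts.append(
                f"{name}: {m['under_replicated_partitions']} under replicated partitions"
            )
        if m.get("disk_used_pct", 0.0) > disk_pct_limit:
            alerts.append(f"{name}: disk at {m['disk_used_pct']}% (limit {disk_pct_limit}%)")
    return alerts
```

The controller check is cluster-wide on purpose: a count of zero and a count of two are both failure modes, and neither is visible when looking at a single broker.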
24. JMXTrans
• Push mirror-maker metrics to graphite
• Throughput per topic, per thread, per instance etc.
• WaitOnTake, WaitOnPut
• Push zookeeper metrics to graphite
• Latency, quorum, connections etc.
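For readers unfamiliar with what "push to graphite" means on the wire: Graphite's plaintext listener (port 2003 by default) accepts one `path value timestamp` line per metric. The sketch below shows that format in Python. It is not JMXTrans configuration, and the metric prefix used in the example is hypothetical.

```python
import socket
import time


def graphite_lines(prefix, metrics, timestamp=None):
    """Format metrics as Graphite plaintext lines: '<path> <value> <epoch seconds>'."""
    ts = int(time.time() if timestamp is None else timestamp)
    return [f"{prefix}.{name} {value} {ts}" for name, value in sorted(metrics.items())]


def send_to_graphite(lines, host, port=2003, timeout=5.0):
    """Send the formatted lines to a Graphite plaintext listener over TCP."""
    payload = ("\n".join(lines) + "\n").encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
```

For example, `graphite_lines("kafka.mm.instance1", {"WaitOnPut": 12, "WaitOnTake": 3.5})` yields two lines that can be handed to `send_to_graphite`.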
25. Data Lag Monitoring
• Measures the event level time delay
• Monitors data latencies per cluster, per topic, per partition
• Latencies between multiple steps in Kafka pipeline
• Optimize and configure sampling ratio
• Supports multiple message formats (JSON, Avro, etc.)
• Alerts based on pre-defined thresholds
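The idea behind the lag monitor can be sketched in a few lines: sample a fraction of messages, read the event timestamp embedded in each one, and track the worst delay per topic and partition. Everything below is an assumption made for illustration, including the JSON field name `event_time_ms`, the function names and the record shape. The deck does not describe the real implementation.

```python
import json
import random
from collections import defaultdict


def event_lag_ms(raw_message, now_ms, ts_field="event_time_ms"):
    """Delay between the event's own timestamp and the time it was observed."""
    event = json.loads(raw_message)
    return now_ms - int(event[ts_field])


def sample_lags(records, now_ms, sample_ratio=0.1, rng=random.random):
    """records: iterable of (topic, partition, raw_json).

    Returns {(topic, partition): worst lag in ms} over the sampled messages.
    Negative lags (clock skew) are reported as 0.
    """
    worst = defaultdict(int)
    for topic, partition, raw in records:
        if rng() >= sample_ratio:
            continue  # not sampled
        key = (topic, partition)
        worst[key] = max(worst[key], event_lag_ms(raw, now_ms))
    return dict(worst)


def lag_alerts(worst, threshold_ms):
    """Topic-partitions whose worst sampled lag exceeds the threshold."""
    return sorted(key for key, lag in worst.items() if lag > threshold_ms)
```

Running the same measurement at each hop of the pipeline (source cluster, mirror-maker output, downstream consumer) gives the per-step latencies mentioned above.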
26. Indicative Broker Metrics
• Request Metrics
• Local Time
• Remote Time
• Queue Time
• Request Handler Idle Percent
• Network Processor Idle Percent
28. Design for Multiple Data Centers
29. Range Based Mirror Makers
[Chart: Skewed Partition Assignment. Num Partitions per consumer on a log scale (1 to 1000); bar values 1000, 181, 14, 5 across Consumer 1, Consumer 2, Consumer 3, Consumer 4]
30. Round Robin Mirror Makers
[Chart: Uniform Partition Assignment. Num Partitions for Consumer 1, Consumer 2, Consumer 3, Consumer 4 on a linear scale (0 to 350)]
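The difference between the two charts can be reproduced with a small simulation. This is a simplified model, not Kafka's assignor code: range assignment splits each topic's partitions separately, so the first consumers in sort order collect the remainder of every topic, while round robin deals all topic-partitions out one at a time. Topic and consumer names are made up.

```python
def range_assign(topics, consumers):
    """Per-topic split: the first consumers get the extra partitions of every topic."""
    ordered = sorted(consumers)
    counts = {c: 0 for c in ordered}
    for _topic, num_partitions in sorted(topics.items()):
        base, extra = divmod(num_partitions, len(ordered))
        for i, consumer in enumerate(ordered):
            counts[consumer] += base + (1 if i < extra else 0)
    return counts


def round_robin_assign(topics, consumers):
    """Deal every topic-partition across consumers in turn."""
    ordered = sorted(consumers)
    counts = {c: 0 for c in ordered}
    all_partitions = [(t, p) for t, n in sorted(topics.items()) for p in range(n)]
    for i, _topic_partition in enumerate(all_partitions):
        counts[ordered[i % len(ordered)]] += 1
    return counts
```

With many small topics (for instance 100 topics of one partition each and four consumers), the range model gives every partition to the first consumer, while round robin gives each consumer 25.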
31. Mirror-maker fine tuning
• Round Robin works better than Range based in most cases
• Spread out the topics in multiple MM consumer groups if you have a few large volume topics
• Negative regex works with the whitelist parameter
• Having too many MM consumer threads doesn’t help
• Tune socket buffer size (takes effect only if the OS allows it)
• MM - socket.receive.buffer.bytes = 1048576
• Broker - socket.send.buffer.bytes = 1048576
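The "only if the OS allows" caveat on the socket buffer settings above can be checked directly: request a buffer size on a socket and read back what the kernel granted. On Linux the ceiling comes from `net.core.rmem_max` (and `net.core.wmem_max` for send buffers), and the kernel reports roughly double the stored value for bookkeeping. A minimal sketch, with a hypothetical helper name:

```python
import socket


def effective_receive_buffer(requested_bytes=1048576):
    """Request a receive buffer size and return what the OS actually granted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

If the returned value is well below the requested 1048576, the OS limit is capping the buffer and the Kafka setting alone will not change throughput.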
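Splitting topics across mirror-maker consumer groups with a negative regex might look like the sketch below: one group gets a whitelist naming the large topics, and a second group gets a negative-lookahead whitelist that matches everything else. Python's `re` is used here only to check the patterns, and Java regular expressions accept the same lookahead syntax. The topic names are hypothetical and assumed to contain only letters, digits and underscores.

```python
import re


def mm_group_whitelists(large_topics):
    """Return (whitelist for the large topics, whitelist for all other topics)."""
    dedicated = "|".join(re.escape(t) for t in large_topics)
    # Negative lookahead: reject a topic name only when it is exactly a large topic.
    everything_else = "^(?!(?:" + dedicated + ")$).*"
    return dedicated, everything_else


def matches(whitelist, topic):
    """True if the whole topic name matches the whitelist pattern."""
    return re.fullmatch(whitelist, topic) is not None
```

Anchoring the lookahead with `$` matters: without it, a topic such as `clickstream_v2` would be excluded from the catch-all group merely for sharing a prefix with a large topic.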
32. We are hiring!!!
For current open positions, please visit our careers web page
http://www.247-inc.com/
Company>Careers>Location
For further details, please reach out to:
Achappa C B - achappa.cb@247-inc.com, M: +91-7338458247
Editor's notes
Introduce yourself
Credit to the team
Click and let this do the slow build. The key points are:
Consumers find it frustrating to cross channels (web, phone, IVR, etc.) because their context is not preserved.
So they have to do things like authenticate (user ID, password) multiple times in the same interaction.
These types of experiences turn potential Brand Advocates into Detractors who will move to other brands.
This is a KEY slide. Emphasize that today’s leading-edge companies, those that consumers love to engage with and that have strong brands, are moving to Intent Driven Engagement.
Let viewers read the slide, then focus on the bottom row:
300+ software engineers and designers
Most data scientists in the industry
100+ patents (point made earlier)
We are the #1 provider of digital chat agents in the world.
Other features which we haven’t tried: Security, Streams, etc.