In this meetup, Kobi Salant (Data Platform Technical Lead) and Vladi Feigin (Data System Architect), both from LivePerson, will talk about making scale a non-issue for real-time data apps.
Have you ever tried to build a system that processes hundreds of thousands of events per second in real time and serves more than 1M concurrent visitors?
We're going to talk about the LivePerson real-time stream processing solution that does exactly that. Learn how we empower digital call centers with insights for their critical decision-making processes and never-ending efficiency goals.
LivePerson DLD 2015
1. DLD. Tel-Aviv. 2015
Making Scale a Non-Issue
for Real-Time Data Apps
Vladi Feigin, LivePerson
Kobi Salant, LivePerson
2. Agenda
Intro
About LivePerson
Digital Engagements
Call Center Use Case
Architecture
Zoom-In
3. Bio
Vladi Feigin
System Architect at LivePerson
18 years in software development
Interests: distributed computing, data, analytics, and martial arts
4. Bio
Kobi Salant
Data Platform Tech Lead at LivePerson
25 years in software development
Interests: application performance, traveling, and coffee
5. LivePerson
We do Digital Engagements
Agile and highly technical
A real Big Data and analytics company
A really cool place to work
One of the SaaS pioneers
6 Data Centers across the world
Founded in 1995, a public company since 2000 (NASDAQ: LPSN)
More than 18,000 customers worldwide
More than 1,000 employees
7. We are Big Data
1.4 million concurrent visits
1 million events per second
2 billion site visits per month
27 million live engagements per month
Data freshness SLA (RT flow): up to 5 seconds
12. Call Center Operating
Digital engagement requires operating a call center as efficiently as possible
How do you do that? Provide operational metrics… in real time
What are the challenges?
Huge scale, load peaks, real-time calculations, and a tight data-freshness SLA
16. Data Producers. Requirements
Real time
“Five nines” persistence
Small footprint
No interference with service
Multiple producers & platforms
Monolithic to service-oriented: many more services
17. Data Producers. Lessons learned
Hundreds of services
Complex rollouts
Keep logic minimal to avoid painful fixes
Auditing the stream? Split it into buckets
Real time and “five nines” persistence are incompatible
18. Data Producers. Flow
[Diagram: each message takes one of two paths. Fast path: the message is sent straight to Kafka on the Fast topic, serving real-time customers. Consistent path: the message is first persisted to a local file on disk, and a Kafka Bridge then sends it to Kafka on the Consistent topic, serving offline customers and providing resilience to Kafka outages.]
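A minimal sketch of the dual-path producer this flow describes, assuming the modern org.apache.kafka.clients producer API (the 2015-era client differed); the class, topic, and directory names are illustrative, not LivePerson's actual code:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.io.IOException;
import java.nio.file.*;

public class DualPathProducer {
    private final KafkaProducer<String, byte[]> producer;
    private final Path spoolDir; // local spool read by a separate Kafka Bridge process

    public DualPathProducer(KafkaProducer<String, byte[]> producer, Path spoolDir) {
        this.producer = producer;
        this.spoolDir = spoolDir;
    }

    // Fast path: best-effort, low-latency send straight to the real-time topic.
    public void sendFast(String key, byte[] event) {
        producer.send(new ProducerRecord<>("events-fast", key, event));
    }

    // Consistent path: persist to local disk first ("five nines" persistence);
    // the bridge ships the file to the consistent topic later.
    public void sendConsistent(String key, byte[] event) throws IOException {
        Path file = spoolDir.resolve(key + "-" + System.nanoTime() + ".evt");
        Files.write(file, event, StandardOpenOption.CREATE_NEW, StandardOpenOption.SYNC);
    }
}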
19. Data Model Framework
Why Avro:
Schema-based evolution
Performance: untagged bytes
HDFS ecosystem support
Lessons learned:
Schema evolution breaks
Big schemas (ours is over 65 KB) are not recommended
Avoid deep nesting and multiple unions
You need a framework
[Diagram: chaos (non-schema, space-delimited data) vs. order (an Avro schema)]
20. Framework Flow
1. Event is created according to Avro
Schema version 3.5
2. Schema is registered into the
repository (once)
3. Value 3.5 is written to header
4. Event is encoded with schema
version 3.5 and added to message
5. Message is sent to Kafka
6. Message is read by consumer
7. Header is read from message
8. Schema is retrieved from the repository according to the schema version
9. Event is decoded using the proper Avro schema
10. Decoded event is processed
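The framing in the flow above can be sketched roughly as follows; the two-byte version header and the SchemaRepository interface are assumptions for illustration, while the Avro calls are the standard org.apache.avro API:

import org.apache.avro.Schema;
import org.apache.avro.generic.*;
import org.apache.avro.io.*;
import java.io.*;

public class AvroFraming {
    // Encode: [major version, minor version, Avro binary payload].
    public static byte[] encode(GenericRecord event, Schema schema,
                                byte major, byte minor) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(major);                                   // step 3: version in header
        out.write(minor);
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(event, encoder); // step 4
        encoder.flush();
        return out.toByteArray();                           // step 5: bytes sent to Kafka
    }

    // Decode: read the header, look the schema up, then decode the payload.
    public static GenericRecord decode(byte[] message, SchemaRepository repo)
            throws IOException {
        byte major = message[0], minor = message[1];        // step 7: read the header
        Schema schema = repo.get(major, minor);             // step 8: repository lookup
        BinaryDecoder decoder = DecoderFactory.get()
                .binaryDecoder(message, 2, message.length - 2, null);
        return new GenericDatumReader<GenericRecord>(schema).read(null, decoder); // step 9
    }

    // Placeholder for the schema repository the slides describe.
    interface SchemaRepository { Schema get(byte major, byte minor); }
}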
21. Apache Kafka
More than 15 billion events a day
More than 1 million events per second
Hundreds of producers & consumers
Why Kafka?
Scale where traditional MQs fail
Industry standard for big data log messaging
Reliable, flexible and easy to use
Deployment:
We have 15 clusters across the world
Our biggest cluster has 8 nodes holding more than 6 TB (Avro + Kafka compression)
Maximum retention of 72 hours
22. Apache Kafka. Lessons Learned
Scale horizontally for hardware resources and vertically for
throughput
Watch trends in network, I/O, and Kafka's JMX statistics (see the polling sketch below)
[Chart: bytes in vs. number of partitions and servers]
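As an illustration of polling one such statistic, a sketch using the standard javax.management API; the MBean and attribute names match recent Kafka brokers but can differ by version, and broker-host:9999 is a placeholder:

import javax.management.*;
import javax.management.remote.*;

public class KafkaJmxPoll {
    public static void main(String[] args) throws Exception {
        // The broker must expose JMX (e.g. started with JMX_PORT=9999).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi");
        try (JMXConnector jmxc = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = jmxc.getMBeanServerConnection();
            ObjectName bytesIn = new ObjectName(
                    "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec");
            // The one-minute rate is the kind of trend worth graphing over time.
            Object rate = conn.getAttribute(bytesIn, "OneMinuteRate");
            System.out.println("BytesInPerSec (1m rate): " + rate);
        }
    }
}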
23. Apache Kafka. Lessons Learned cont.
Know your data and message sizes:
Large messages can break you
Data growth can exhaust your capacity
Set the right configuration (see the sketch after this list)
Adding or removing a broker is not trivial
Decide on single or multiple topics
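A sketch of the size- and retention-related knobs implied by this list; the property names are standard Kafka configuration, while the values and class name are illustrative:

import java.util.Properties;

public class KafkaSizing {
    static Properties producerProps() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "broker-host:9092");
        p.put("max.request.size", "1048576");  // refuse to build requests over 1 MB
        p.put("compression.type", "snappy");   // shrink large Avro payloads on the wire
        return p;
    }
    // Broker side (server.properties), illustrative values:
    //   message.max.bytes=1000012        largest message the broker will accept
    //   replica.fetch.max.bytes=1048576  keep >= message.max.bytes or replication stalls
    //   log.retention.hours=72           the 72-hour retention mentioned earlier
}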
24. Apache Storm
Why Storm?
Growing community with good integration to Kafka
At the time, it was the leading product
Easy development and customization
The POC was successful
Deployment:
We have 6 clusters across the world
Our biggest cluster has more than 30 nodes
We have 20 topologies on a single cluster
Uptime of months for a single topology
26. Apache Storm. Lessons learned
Develop SDK and educate R&D
Where did my topology run last week? What is my capacity over time?
Know your bolts: they must return a timely answer
Coding is easy, performance is hard
Use isolation
[Screenshot: Storm UI bolt capacity metric]
27. Apache Storm. Lessons learned cont.
Use local shuffling
Use Ack
[Diagram: two workers, A and B, each running KAFKA SPOUT → FILTER BOLT → WRITER BOLT chained by local emits within the worker, with a COMM BOLT and ACKER BOLT handling cross-worker traffic and acknowledgments]
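A minimal sketch of both practices using the Storm API (org.apache.storm.*; the 2015-era package was backtype.storm); the bolt names and filter logic are placeholders:

import org.apache.storm.topology.*;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.*;

public class LocalShuffleTopology {
    public static TopologyBuilder build(IRichSpout kafkaSpout) {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", kafkaSpout, 4);
        // localOrShuffleGrouping keeps tuples inside the same worker whenever a
        // local target task exists, avoiding the cross-worker network hop.
        builder.setBolt("filter-bolt", new FilterBolt(), 8)
               .localOrShuffleGrouping("kafka-spout");
        builder.setBolt("writer-bolt", new WriterBolt(), 8)
               .localOrShuffleGrouping("filter-bolt");
        return builder;
    }

    // BaseBasicBolt anchors emitted tuples and acks automatically on return,
    // which gives the at-least-once guarantee the "Use Ack" advice is about.
    static class FilterBolt extends BaseBasicBolt {
        public void execute(Tuple input, BasicOutputCollector collector) {
            if (input.getValue(0) != null) {               // placeholder filter logic
                collector.emit(new Values(input.getValue(0)));
            }
        }
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("event"));
        }
    }

    static class WriterBolt extends BaseBasicBolt {
        public void execute(Tuple input, BasicOutputCollector collector) {
            // write the event downstream; acked automatically on success
        }
        public void declareOutputFields(OutputFieldsDeclarer declarer) { }
    }
}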
28. Summary
No one-size-fits-all solution
Ask product for a clearly defined SLA
Separate fast and consistent data flows - they don't mix!
Use a schema for your data model - keep it flat and small
Kafka rules! It's reliable and fast - use it
Storm takes its toll. For some use cases we would use Spark Streaming today
29. THANK YOU!
We are hiring
http://www.liveperson.com/company/careers
Q/A