Apache Avro in LivePerson
Collecting and saving data is easy;
keeping it consistent is tough
DevCon Tlv, June 2014
Amihay Zer-Kavod, Software Architect
Who am I?
Amihay Zer-Kavod
Software Architect
In software since 1989
LivePerson Ecosystem
[Architecture diagram not recoverable; surviving label: M/R]
Communication & Meaning
● Consistent but decoupled communication between services, such as:
o Monitoring, Interaction
o Predictive, Sentiment
o Reporting & Analysis
o History
● Consistent meaning over time
o BigData Store (Hadoop)
o Reporting
event, evento, 事件, घटना, حدث, ארוע, событие
What can’t we use?
Don’t use Direct APIs!
They are the wrong tool for this problem, since:
• They produce too much coupling between services
• Direct API calls are synchronous by nature
• They add irrelevant complexity to the called service
So what is needed?
The Message is the API!
● A unified event model (schema) for all reported events
● Management tools for the unified schema
● Tools for sending events over the wire
● Tools for reading/writing events in the big data store
● Backward and forward compatibility
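A minimal sketch of "the message is the API", assuming an in-process queue as a stand-in for the real message bus. The names `bus`, `report_event` and `drain`, and the event type `chat_started`, are hypothetical and not from the talk. The producer publishes an event and returns immediately; it never learns which services consume it.

```python
import queue

# Hypothetical in-process stand-in for a message bus (assumption: the real
# system uses a broker between services, which this sketch does not model).
bus = queue.Queue()

def report_event(event_type, body):
    """Producer side: publish and return, with no knowledge of consumers."""
    bus.put({"type": event_type, "body": body})

def drain(handlers):
    """Consumer side: hand each queued event to every interested handler."""
    handled = []
    while not bus.empty():
        event = bus.get()
        for handler in handlers.get(event["type"], []):
            handled.append(handler(event))
    return handled

report_event("chat_started", "visitor 42 opened a chat")

# Two independent consumers of the same event; adding a third would not
# require any change to report_event.
results = drain({
    "chat_started": [
        lambda e: "monitoring saw " + e["type"],
        lambda e: "reporting stored " + e["body"],
    ]
})
```

The only contract between the two sides is the shape of the event itself, which is why the remaining requirements in the list are all about the schema.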
The Event model
From generic to specific structure with:
• Common header - data common to all events
• Logical Entities - a common header for all logical entities
(such as Visitor)
• Dynamic Specific headers
• Specific Event body
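The generic-to-specific layering can be sketched with plain Python dataclasses. This is an illustration only: every field name below (`event_id`, `timestamp_ms`, `kind`, and so on) is an assumption, not LivePerson's actual model. The point is that a generic consumer can work from the common header and participants alone, without understanding the specific body.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class CommonHeader:
    """Data common to all events (field names are hypothetical)."""
    event_id: str
    timestamp_ms: int

@dataclass
class Participant:
    """A logical entity taking part in the event, such as a Visitor."""
    kind: str
    participant_id: str

@dataclass
class Event:
    header: CommonHeader
    participants: List[Participant] = field(default_factory=list)
    # Dynamic specific headers, keyed by name
    specific_headers: Dict[str, Any] = field(default_factory=dict)
    # Specific event body; its shape depends on the event type
    body: Optional[Dict[str, Any]] = None

def generic_view(event):
    """What a generic consumer can read without knowing the body's type."""
    return event.header.event_id, [p.kind for p in event.participants]

event = Event(
    header=CommonHeader(event_id="e-1", timestamp_ms=0),
    participants=[Participant(kind="Visitor", participant_id="v-42")],
    specific_headers={"platform": {"node": "node-1"}},
    body={"kind": "ChatStarted"},
)
```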
Apache Avro to the rescue
● Avro - a schema-based serialization/deserialization framework
● Avro IDL - schema definition language
● Avro file - Hadoop integration
● Avro schema resolution
● Apache Avro was created by Doug Cutting
Avro JSON schema sample
{
  "type": "record",
  "name": "Event",
  "namespace": "com.liveperson.example",
  "doc": "Example event",
  "fields": [
    { "name": "version", "type": "string", "default": "1" },
    { "name": "id", "type": "string", "default": "Unknown" },
    { "name": "time", "type": "long", "default": -1 },
    { "name": "body", "type": "string", "default": "no body" },
    { "name": "color",
      "type": { "type": "enum", "name": "Color",
                "symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"] },
      "default": "NO_COLOR" }
  ]
}
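To show what the `default` entries in this schema buy you, here is a small sketch in plain Python using only the standard `json` module; it does not use the Avro library. The helper `record_with_defaults` is hypothetical: it builds a complete record from a partial one by filling each missing field from the schema's default.

```python
import json

# The sample schema from the slide, unchanged.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Event",
  "namespace": "com.liveperson.example",
  "doc": "Example event",
  "fields": [
    { "name": "version", "type": "string", "default": "1" },
    { "name": "id", "type": "string", "default": "Unknown" },
    { "name": "time", "type": "long", "default": -1 },
    { "name": "body", "type": "string", "default": "no body" },
    { "name": "color",
      "type": { "type": "enum", "name": "Color",
                "symbols": ["NO_COLOR", "BLUE", "BLACK", "WHITE", "PINK"] },
      "default": "NO_COLOR" }
  ]
}
"""

schema = json.loads(SCHEMA_JSON)

def record_with_defaults(schema, values):
    """Hypothetical helper: fill every field missing from `values`
    with the default declared in the schema."""
    record = {}
    for f in schema["fields"]:
        name = f["name"]
        if name in values:
            record[name] = values[name]
        elif "default" in f:
            record[name] = f["default"]
        else:
            raise KeyError("no value and no default for field " + name)
    return record

partial = {"id": "abc", "time": 1402000000000}
full = record_with_defaults(schema, partial)
```

Because every field carries a default, a record that supplies only `id` and `time` still becomes a complete `Event`.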
Avro IDL - LivePerson Event
/** Base for all LivePerson Events */
@namespace("com.liveperson.global")
record LPEvent {
    /** Common Header of the event */
    CommonHeader header = null;

    /** Logical entity details participating in this event - Visitor, Agent, etc... */
    array<Participant> participants = null;

    /** Holding specific platform info as node name (machine) cluster Id etc... */
    PlatformHeader platformSpecificHeader = null;

    /** Auditing Header, Optional - adds data for auditing of the events flow in the platform */
    union { null, AuditingHeader } auditingHeader = null;

    /** The event body */
    EventBody eventBody = null;
}
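For readers more used to the JSON form of Avro schemas, the optional `auditingHeader` field above corresponds to a union with `null` as its first branch and a `null` default. The sketch below builds that JSON with Python's standard `json` module; the helper name `optional_field` is hypothetical, and the snippet illustrates the mapping rather than reproducing output from the Avro tools.

```python
import json

def optional_field(name, type_name, doc):
    """Hypothetical helper: JSON-schema form of an optional record field."""
    return {
        "name": name,
        # null is listed first so that the null default matches the
        # first branch of the union
        "type": ["null", type_name],
        "default": None,  # serialized as JSON null
        "doc": doc,
    }

auditing = optional_field(
    "auditingHeader", "AuditingHeader", "Auditing Header, Optional"
)
as_json = json.dumps(auditing)
```

Only `auditingHeader` is declared as a union with `null` in the IDL; the other fields use a `null` default on a non-nullable type, which stricter default validation in some Avro versions may reject.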
Backward & Forward Compatibility
Avro schema evolution
● Avro schema resolution reconciles two schemas: the one the data was written with and the one the reader expects
● For this to work, follow a set of rules:
o Every field must have a default value
o A field can be added (make sure to give it a default value)
o Field types cannot be changed (add a new field instead)
o Enum symbols can be added but never removed
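The rules above are mechanical enough to check automatically. The following is a sketch in plain Python over schemas loaded as dictionaries. It is not Avro's own compatibility checker, and `check_evolution`, `resolve` and the sample schemas are hypothetical. `check_evolution` reports rule violations between two versions of a record schema. `resolve` imitates the effect of schema resolution on a decoded record: fields the reader does not know are dropped, and fields the writer did not have are filled from defaults.

```python
def is_enum(avro_type):
    return isinstance(avro_type, dict) and avro_type.get("type") == "enum"

def check_evolution(old, new):
    """List violations of the deck's rules when evolving `old` into `new`."""
    problems = []
    old_fields = {f["name"]: f for f in old["fields"]}
    for f in new["fields"]:
        name = f["name"]
        if "default" not in f:
            problems.append(f"field '{name}' has no default")
        prev = old_fields.get(name)
        if prev is None:
            continue  # newly added field: only the default rule applies
        if is_enum(prev["type"]) and is_enum(f["type"]):
            removed = set(prev["type"]["symbols"]) - set(f["type"]["symbols"])
            if removed:
                problems.append(
                    f"field '{name}' removed enum symbols {sorted(removed)}")
        elif prev["type"] != f["type"]:
            problems.append(f"field '{name}' changed type")
    return problems

def resolve(record, reader_schema):
    """Drop fields unknown to the reader; fill missing ones from defaults."""
    return {
        f["name"]: record[f["name"]] if f["name"] in record else f["default"]
        for f in reader_schema["fields"]
    }

def color(symbols):
    return {"type": "enum", "name": "Color", "symbols": symbols}

v1 = {"type": "record", "name": "Event", "fields": [
    {"name": "id", "type": "string", "default": "Unknown"},
    {"name": "color", "type": color(["NO_COLOR", "BLUE"]), "default": "NO_COLOR"},
]}
v2_good = {"type": "record", "name": "Event", "fields": [
    {"name": "id", "type": "string", "default": "Unknown"},
    {"name": "color", "type": color(["NO_COLOR", "BLUE", "PINK"]), "default": "NO_COLOR"},
    {"name": "time", "type": "long", "default": -1},
]}
v2_bad = {"type": "record", "name": "Event", "fields": [
    {"name": "id", "type": "long", "default": 0},
    {"name": "color", "type": color(["NO_COLOR"]), "default": "NO_COLOR"},
    {"name": "time", "type": "long"},
]}

good_report = check_evolution(v1, v2_good)
bad_report = check_evolution(v1, v2_bad)
backward = resolve({"id": "a", "color": "BLUE"}, v2_good)            # old data, new reader
forward = resolve({"id": "b", "color": "BLUE", "time": 5}, v1)       # new data, old reader
```

One caveat the sketch does not model: an old reader that receives an enum symbol it has never seen (here, `PINK` read with `v1`) cannot map it to anything, so adding symbols is safe only once the readers have been upgraded or have a fallback for unknown symbols.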
Is that enough?
[Diagram not recoverable; surviving labels: M/R, Migdalor]
How well does it work?
● Cyber Monday 2013 (one day)
o More than 320,000 events per second
o 7 Storm topologies consuming the events within seconds of real time
o 2TB of data saved to Hadoop
● 2014 preparation:
o Double the number of events per second, to ~640,000
So how did we do it?
1. Use an event-driven system, don’t use direct APIs
2. Create a unified schema for all events
3. Use Avro to implement the schema
4. Add some supporting infrastructure
Questions?
Amihay Zer-Kavod
You can contact me at:
amihayz@liveperson.com
LivePerson is hiring!
Thank You

Apache Avro and Messaging at Scale in LivePerson
