Leveraging Data:
Building a Stable Platform
Ophir Cohen, Data Platform Lead, ophirc@liveperson.com
Amit Fainer, Data QA Lead, amitfa@liveperson.com
May, 2013
Connection before content…
• Who was the commander of whom in the army?
• Who met his wife in India?
Agenda
• Connection before content
• LivePerson Is…
• Data platform requirements
• Quality challenges
• Architecture
• Development and production processes
• Case study: LivePerson BI Reports
LivePerson Is…
Company
• Cloud-computing, SaaS pioneer since 1998
• IPO April 2000 (Nasdaq: LPSN); debt free
• 700+ employees
• LivePerson offers an extensive and rapidly growing partner network
Customers
• 8,500 customers around the globe have chosen LivePerson to create secure, reliable connections with their customers. LivePerson clients include:
  • 8 of the top 10 Fortune 500 companies
  • 10 of the top 15 commercial banks (Fortune 500)
  • 4 of the top 5 telecommunication companies (Fortune 500)
  • 4 of the top 7 of the Forbes Global 2000
  • 5 of the top 6 software and services companies (Forbes 2000)
  • 8 of the top 10 of Interbrand's Best Global Brands
Service Delivery
• 1.8 billion visitors monitored per month
• 20 million connections per month
• Analyzes over 1.2 million documents and chat transcripts per month
Mission
Creating Meaningful Customer Connections
Live Chat and Click-to-Call Vendor 2012
Enterprise Customer Success & Domain Expertise
Finance
High-Tech
Retail
Telecom
Travel
Requirements
• Massive data flow (a few TB a day)
• Different data types, different producers
• Never lose data!
• Varied latency needs, from near real-time to offline
• Data is accessible to everyone for processing, in a standardized, common paradigm adopted by all consumers and producers
Quality Challenges
• Large volumes of data: automate or die
• Bugs yield corrupted data
• Produced data stays forever
• Consumers need a standardized form to assure data integrity
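One way to picture the "standardized form" is a shared event envelope that every producer fills in and every consumer can check. The sketch below is only an illustration under assumed names: the fields `event_type`, `producer`, `timestamp_ms`, and `payload` are hypothetical and do not come from the slides.

```python
from typing import Dict, List

# Hypothetical envelope: field names and types are assumptions for illustration.
REQUIRED_FIELDS: Dict[str, type] = {
    "event_type": str,
    "producer": str,
    "timestamp_ms": int,
    "payload": dict,
}


def validate_event(event: dict) -> List[str]:
    """Return a list of problems; an empty list means the event conforms."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in event:
            errors.append(f"missing field: {name}")
        elif not isinstance(event[name], expected):
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    return errors
```

Running such a check automatically at the producer boundary is one way to catch a bug before it writes corrupted data that then "stays forever".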
Architecture
[Architecture diagram. Components: Kafka, Data Tier, Application Tier, Storm, Hadoop, Pig, Java MR, Hive]
Architecture – Persistency Layer
Kafka (by LinkedIn):
• Queuing mechanism
• Persistency layer
• High availability layer
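The combination of queuing and persistence can be sketched as an append-only log in which each consumer tracks its own read offset. This is a toy in-memory model, not Kafka's API: the class `AppendOnlyLog` and its methods are hypothetical names chosen for illustration, and nothing here is replicated or written to disk.

```python
class AppendOnlyLog:
    """Toy model of a persistent queue: records are never removed on read."""

    def __init__(self):
        self._records = []   # the log itself, in arrival order
        self._offsets = {}   # consumer name -> next offset to read

    def append(self, record) -> int:
        """Store a record and return the offset it was written at."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer: str, max_records: int = 10) -> list:
        """Return the next batch for this consumer and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch

    def rewind(self, consumer: str, offset: int = 0) -> None:
        """Move a consumer back so it can re-read already consumed records."""
        self._offsets[consumer] = offset
```

Because reading does not delete anything, a near-real-time consumer and an offline consumer can read the same records at their own pace, and a consumer that processed data with a bug can rewind and replay.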
Architecture – Streaming Processing Layer
Storm (by Twitter)
• Stream processing
• Pluggable framework
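Storm describes a stream job as a topology of spouts (sources) and bolts (processing steps). The sketch below imitates that shape with plain Python generators; it does not use Storm itself, and the function names and event fields are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator


def event_spout(raw_events: Iterable[dict]) -> Iterator[dict]:
    """Source step: emit events one at a time, as they arrive."""
    for event in raw_events:
        yield event


def chat_filter_bolt(events: Iterable[dict]) -> Iterator[dict]:
    """Processing step: pass through only chat events."""
    for event in events:
        if event.get("event_type") == "chat":
            yield event


def count_by_producer_bolt(events: Iterable[dict]) -> Iterator[Counter]:
    """Processing step: emit a running count per producer after each event."""
    counts = Counter()
    for event in events:
        counts[event["producer"]] += 1
        yield counts.copy()  # snapshot, so later updates do not alter it
```

The steps are pluggable in the sense that each one only consumes and produces a stream, so a filter or counter can be swapped without touching the others.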
Architecture – Batch Processing Layer
Hadoop (an Apache Project)
• Reliable, scalable, distributed computing framework
• Rich ecosystem
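The MapReduce model behind the batch layer can be shown on a single machine as three phases: map, shuffle, reduce. The sketch below is a minimal single-process illustration, not Hadoop code; the record field `day` and the "chats per day" job are hypothetical examples.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def run_job(records: Iterable[dict], mapper: Callable, reducer: Callable) -> Dict:
    """Run map, shuffle, and reduce in one process."""
    # Map: each record may emit any number of (key, value) pairs.
    pairs = (pair for record in records for pair in mapper(record))

    # Shuffle: group all values by key.
    groups: Dict[object, List] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    # Reduce: collapse each key's values into one result.
    return {key: reducer(key, values) for key, values in groups.items()}


def chats_per_day_mapper(record: dict):
    """Emit one count per chat record, keyed by day (hypothetical field)."""
    yield record["day"], 1


def sum_reducer(key, values: List[int]) -> int:
    """Add up the counts collected for one key."""
    return sum(values)
```

In a real cluster the map and reduce phases run in parallel across machines and the shuffle moves data between them; the programming model stays the same.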
Develop, Test and Deploy at Scale
• Automated, continuously integrated, with built-in performance testing
• Satisfying monitoring and auditing needs of Tiers 1 through 5
• Ongoing production tests
• Auditing mechanism
• Scrum
• Isolated, production-mirrored environment for testing
Case Study – LivePerson BI Reports
• Source to target
• Auditing tool as part of data integrity tests
• Load tests in a real-data environment
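A source-to-target audit can be as simple as counting records per key on both sides and reporting every key where the counts differ. The sketch below assumes this count-based approach; the slides do not describe how the actual auditing tool works, and the function name and the `day` key are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Iterable


def audit_source_to_target(
    source_rows: Iterable[dict],
    target_rows: Iterable[dict],
    key: Callable[[dict], object],
) -> Dict[object, Dict[str, int]]:
    """Return the keys whose record counts differ between source and target."""
    source_counts = Counter(key(row) for row in source_rows)
    target_counts = Counter(key(row) for row in target_rows)

    report = {}
    for k in source_counts.keys() | target_counts.keys():
        # A Counter returns 0 for a key it has never seen.
        if source_counts[k] != target_counts[k]:
            report[k] = {"source": source_counts[k], "target": target_counts[k]}
    return report
```

An empty report means every key has the same number of records on both sides; a non-empty one points at lost or duplicated data, which ties back to the "never lose data" requirement.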
Thank You
LivePerson is hiring!
Feel free to reach out:
• ophirc@liveperson.com
• @ophchu
• amitfa@liveperson.com


Editor's Notes

  1. We need to update this slide
  2. The biggest in the area. All fields: finance, telecom, etc.