Yaar Reuveni & Nir Hedvat
Becoming a Proactive Data Platform
Yaar Reuveni
• 6 Years at Liveperson
• 1 Reporting & BI
• 3 Data Platform
• 2 Data Platform team lead
• I love to travel
• And
Nir Hedvat
• Software Engineer B.Sc
• 3 years as a C++ Developer at IBM Rational Rhapsody™
• 1.5 years at LivePerson
• Cloud and Parallel Computing Enthusiast
• Love Math and Powerlifting
Agenda
• Our Scale & Operation
• Evolution in becoming proactive
i. Hope & Low awareness
ii. Storming & Troubleshooting
iii. Fortifying
iv. Internalization & Comprehension
v. Being Proactive
• Showcases
• Implementation
Our Scale
• 2 M Daily chats
• 100 M Daily monitored visitor sessions
• 20 B Events per day
• 2 TB Raw data per day
• 2 PB Total in Hadoop clusters
• Hundreds of producers × event types × consumers
LivePerson technology stack
Stage 1: Hope & Low awareness
We built it and it’s awesome
[Diagram: Online producer, Offline producer, local files, Raw Data, DSPT Jobs]
* DSPT - Data single point of truth
Stage 1: Hope & Low awareness
We’ve got customers
Dashboards
Data Science
Apps
Reporting
Data Science
Data Access
Ad-Hoc
Queries
Stage 2: Storming & Troubleshooting
You’ve got NOC & SCS on speed dial
Issues arise:
• Data loss
• Data delays
• Partial data out of frame
• Missing/faulty calculations for consumers
• One producer sends no data for over a week
Stage 2: Storming & Troubleshooting
You’ve got NOC & SCS on speed dial
Common issue types and sources:
• Hadoop ops
• Production ops
• Events schema
• New data producers
• High new features rate (LE2.0)
• Data stuck in pipeline
• Bugs
Stage 3: Fortifying
Every interruption derives a new protection
• Monitors on jobs, failures, success rate
• Monitors on service status
• Simple data freshness checks, e.g. measuring the age of the newest event
• Measure latency of specific parts of the pipeline
Stage 4: Internalization & Comprehension
Auditing requirements
• Measurement principles:
– Loss
• How much?
• Which customer?
• What Type?
• Where in the pipeline?
– Freshness
• Percentiles
• Trends
– Statistics
• Event type count
• Event per LP customer
• Trends
Stage 4: Internalization & Comprehension
Auditing architecture
[Diagram: three Producers; flows labeled Events, Audit Events, Control; components Audit Loader, Audit Aggregator, Freshness, Audit DB]
Stage 4: Internalization & Comprehension
Mechanism
1. Enrich events with audit metadata: each data event carries a Common Header and an Audit Header
2. Send control events every X minutes: each Control Event (audit aggregation) carries a Common Header and an Audit Header
Stage 4: Internalization & Comprehension
Mechanism
[Diagram: Old Data Flow: data events with a Common Header only. Audited Data Flow: data events with a Common Header and an Audit Header, interleaved with Control Events (audit aggregation) that carry the same two headers.]
Stage 4: Internalization & Comprehension
How to measure loss?
• Tag all events going through our API with an auditing header:
<host_name>:<bulk_id>:<sequence_id>
Where:
• host_name - the logical identifier of the producer server
• bulk_id - an arbitrary unique number that identifies a bulk (changes every X minutes)
• sequence_id - an auto-incremented, persistent number used to identify missing bulks
• Every X minutes, send an audit control event:
{
  "eventType": "AuditControlEvent",
  "Bulks": [
    {"bulk_id": "srv-xyz:111:97", "data_tier": "shark producer", "total_count": 785},
    {"bulk_id": "srv-xyz:112:98", "data_tier": "shark producer", "total_count": 1715}
  ]
}
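A minimal Python sketch of the tagging scheme described on this slide, not LivePerson's actual producer code. The class name `AuditTagger`, the `audit_header` field name, the counter-based bulk IDs, and the in-memory sequence number are all assumptions made for illustration; a real producer would persist `sequence_id` and call `rotate_bulk()` on a timer every X minutes.

```python
import itertools
import json
from collections import Counter


class AuditTagger:
    """Tags events with <host_name>:<bulk_id>:<sequence_id> and counts them per bulk."""

    def __init__(self, host_name, data_tier, start_sequence=0):
        self.host_name = host_name
        self.data_tier = data_tier
        self._bulk_ids = itertools.count(1)   # hypothetical source of unique bulk IDs
        self._sequence_id = start_sequence    # would be persisted in a real producer
        self._counts = Counter()
        self.rotate_bulk()

    def rotate_bulk(self):
        """Start a new bulk; meant to be called every X minutes."""
        self._bulk_id = next(self._bulk_ids)
        self._sequence_id += 1
        self.header = f"{self.host_name}:{self._bulk_id}:{self._sequence_id}"

    def tag(self, event):
        """Return a copy of the event enriched with the current audit header."""
        self._counts[self.header] += 1
        return {**event, "audit_header": self.header}

    def control_event(self):
        """Build a control event for all bulks counted so far, then reset the counts."""
        bulks = [
            {"bulk_id": header, "data_tier": self.data_tier, "total_count": count}
            for header, count in self._counts.items()
        ]
        self._counts.clear()
        return {"eventType": "AuditControlEvent", "Bulks": bulks}


tagger = AuditTagger("srv-xyz", "shark producer", start_sequence=96)
first = [tagger.tag({"type": "chat"}) for _ in range(3)]
tagger.rotate_bulk()
second = [tagger.tag({"type": "visit"}) for _ in range(2)]
control = tagger.control_event()
print(json.dumps(control, indent=2))
```

Because the sequence number increases by one per bulk, a consumer that sees sequence 97 and 99 from the same host but never 98 can tell that a whole bulk is missing, even if its control event was lost too.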
Stage 4: Internalization & Comprehension
What’s next?
• Immediate gain: loss can be researched directly on the raw data
Next:
• Count events per auditing bulk
• Load into some DB for dashboarding:
Audit metadata     Data tier  Insertion time  Events count
srv-xyz:1a2b3c:25  Producer   08:34           750
srv-xyz:1a2b3c:25  HDFS       09:05           405
srv-xyz:1a2b3c:25  HDFS       10:13           250
In this example, assume we look at the table after 11:34 and treat anything that has not arrived within 3 hours as lost. Server srv-xyz created 750 events in bulk 1a2b3c, and only 405 + 250 = 655 of them arrived in HDFS within 3 hours, so we detect a loss of 95 events from this server.
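The arithmetic in this example can be sketched in a few lines of Python. The rows mirror the table above; the function name `detect_loss`, its parameters, and the assumption that all times fall on the same day are illustrative choices, not part of the original system.

```python
from datetime import datetime, timedelta

# Rows mirror the table above: (audit metadata, data tier, insertion time, events count).
ROWS = [
    ("srv-xyz:1a2b3c:25", "Producer", "08:34", 750),
    ("srv-xyz:1a2b3c:25", "HDFS", "09:05", 405),
    ("srv-xyz:1a2b3c:25", "HDFS", "10:13", 250),
]


def parse(hhmm):
    # Assumption: all timestamps belong to the same day.
    return datetime.strptime(hhmm, "%H:%M")


def detect_loss(rows, now, max_delay=timedelta(hours=3),
                source_tier="Producer", target_tier="HDFS"):
    """Return {audit metadata: lost events} for bulks whose deadline has passed."""
    produced = {}   # audit metadata -> (creation time, count reported by the producer)
    arrived = {}    # audit metadata -> list of (insertion time, count) at the target tier
    for key, tier, ts, count in rows:
        if tier == source_tier:
            produced[key] = (parse(ts), count)
        elif tier == target_tier:
            arrived.setdefault(key, []).append((parse(ts), count))

    losses = {}
    for key, (created, expected) in produced.items():
        deadline = created + max_delay
        if parse(now) <= deadline:
            continue  # too early to call anything lost
        on_time = sum(c for ts, c in arrived.get(key, []) if ts <= deadline)
        if on_time < expected:
            losses[key] = expected - on_time
    return losses


print(detect_loss(ROWS, now="11:35"))  # {'srv-xyz:1a2b3c:25': 95}
```

Looking at the table at or before 11:34 reports no loss for this bulk, because the 3-hour window that started at 08:34 has not closed yet.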
Stage 4: Internalization & Comprehension
How to measure freshness?
• Run incrementally on the raw data
• Group events by
– Total
– Event type
– LP customer
• Per event, calculate: insertion time - creation time
• Per group:
– Total count
– Min, max & average
– Count into time buckets (0-30; 30-60; 60-120; 120-∞)
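As a rough illustration of the per-group calculation, the sketch below computes the statistics for a single group of events. The bucket edges come from the slide, but their unit is not stated there, so the timestamps here are plain numbers in one assumed unit; the function names are hypothetical. Grouping by event type or LP customer would apply the same function once per group.

```python
from statistics import mean

# Bucket edges from the slide; the unit is not given there and is left abstract here.
BUCKETS = [(0, 30), (30, 60), (60, 120), (120, float("inf"))]


def bucket_label(latency):
    """Return the label of the half-open bucket [low, high) containing the latency."""
    for low, high in BUCKETS:
        if low <= latency < high:
            return f"{low}-{'∞' if high == float('inf') else high}"
    raise ValueError(f"negative latency: {latency}")


def freshness_stats(events):
    """events: non-empty iterable of (creation_time, insertion_time) pairs."""
    latencies = [inserted - created for created, inserted in events]
    counts = {bucket_label(low): 0 for low, _ in BUCKETS}
    for latency in latencies:
        counts[bucket_label(latency)] += 1
    return {
        "count": len(latencies),
        "min": min(latencies),
        "max": max(latencies),
        "avg": mean(latencies),
        "buckets": counts,
    }


sample = [(0, 12), (5, 50), (10, 100), (20, 400)]
stats = freshness_stats(sample)
print(stats)
```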
Stage 5: Being Proactive
Tools - loss dashboard
Stage 5: Being Proactive
Tools - loss detailed dashboard
Stage 5: Being Proactive
Tools - loss trends
Stage 5: Being Proactive
Tools - freshness
Stage 5: Being Proactive
Tools - data statistics
Showcase I
Bug in a new producer
Showcase II
Deployment issue
• Constant loss
• Only in one farm
• Depends on traffic
• Only a specific producer type
• From all of its nodes
Showcase III
Consumer jobs issues
• Our auditing detected a loss in Alpha
• Data stuck in a job failure dir
• Functional monitoring missed it
• We streamed the stuck data
Showcase IV
Producer issues
• Offline producer gets stuck
• Functional monitoring misses it
Implementation
Auditing architecture
[Diagram: three Producers; flows labeled Events, Audit Events, Control; components Audit Loader, Audit Aggregator, Freshness, Audit DB]
Implementation
Audit Loader
• Storm topology
• Loads audit events from Kafka into MySQL
[Diagram: Audit Events → Audit Loader → Audit DB]
Bulk     Tier  TS     Count
xyz:123  WRPA  08:34  750
xyz:123  DSPT  09:05  405
xyz:123  DSPT  10:13  250
Implementation
Audit Aggregator
• Load data from HDFS
• Aggregate events according to audit metadata
• Save aggregated audit data to MySQL
• Spark implementation
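The aggregation step amounts to counting events per audit header. The slides state that the real job is implemented in Spark; the sketch below uses plain Python instead, purely to show the shape of the computation. The `audit_header` field and the output row layout are assumptions.

```python
from collections import Counter


def aggregate_by_audit_header(events, data_tier):
    """Count events per audit header and shape them as rows for the audit DB."""
    counts = Counter(event["audit_header"] for event in events)
    return [
        {"audit_metadata": header, "data_tier": data_tier, "events_count": count}
        for header, count in sorted(counts.items())
    ]


events = [
    {"audit_header": "srv-xyz:111:97", "payload": "a"},
    {"audit_header": "srv-xyz:112:98", "payload": "b"},
    {"audit_header": "srv-xyz:111:97", "payload": "c"},
]
rows = aggregate_by_audit_header(events, data_tier="HDFS")
for row in rows:
    print(row)
```

Comparing these counts with the `total_count` reported in the producer's control events is what makes loss visible per bulk and per tier.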
[Diagram: Data is read from HDFS and aggregated per bulk (#1, #2, #3) into ∑ #1 = N1, ∑ #2 = N2, ∑ #3 = N3; Collect & Save writes the results to the DB; the Offset is kept in ZooKeeper]
Audit Aggregator job
First Generation
• Our jobs work incrementally or manually
• Offset management by ZooKeeper
• A failure during the saving stage leads to a lost offset
• Saving the data and the offset on the same stream
Audit Aggregator job
Overcoming Pitfalls
Audit Aggregator job
Revised Design
[Diagram: Data is read from HDFS and aggregated per bulk (#1, #2, #3) into ∑ #1 = N1, ∑ #2 = N2, ∑ #3 = N3; Collect & Save writes both the Data and the Offset to the DB]
Bulk     Tier  TS     Count
xyz:123  WRPA  08:34  750
xyz:123  DSPT  09:05  405
xyz:123  DSPT  10:13  250
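One way to read the revised design is that the aggregated rows and the job offset are written to the same database together, so a failure cannot leave them out of step. The sketch below illustrates that idea with a single transaction. It is an interpretation, not the documented implementation: it uses Python's built-in `sqlite3` as a stand-in for MySQL, and the table names, column names, and the `audit_aggregator` job key are hypothetical.

```python
import sqlite3


def save_batch(conn, rows, new_offset):
    """Write aggregated rows and the new offset in a single transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.executemany(
            "INSERT INTO audit (bulk, tier, ts, events_count) VALUES (?, ?, ?, ?)",
            rows,
        )
        conn.execute(
            "INSERT OR REPLACE INTO job_offset (job, value) "
            "VALUES ('audit_aggregator', ?)",
            (new_offset,),
        )


def read_offset(conn):
    row = conn.execute(
        "SELECT value FROM job_offset WHERE job = 'audit_aggregator'"
    ).fetchone()
    return row[0] if row else 0


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audit (bulk TEXT, tier TEXT, ts TEXT, events_count INTEGER NOT NULL)"
)
conn.execute("CREATE TABLE job_offset (job TEXT PRIMARY KEY, value INTEGER)")
conn.commit()

save_batch(conn, [("xyz:123", "DSPT", "09:05", 405)], new_offset=1)

try:
    # The second row violates NOT NULL, so the whole batch, offset included, is rolled back.
    save_batch(
        conn,
        [("xyz:123", "DSPT", "10:13", 250), ("xyz:124", "DSPT", "10:14", None)],
        new_offset=2,
    )
except sqlite3.IntegrityError:
    pass

saved_offset = read_offset(conn)
saved_rows = conn.execute("SELECT COUNT(*) FROM audit").fetchone()[0]
print(saved_offset, saved_rows)  # 1 1
```

After the failed batch the offset still points at the last fully saved batch, so an incremental rerun picks up exactly where the data ends.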
Audit Aggregator job
Why Spark
• Precedent - Spark Streaming for online auditing
• We see our future with Spark
• Cluster utilization
• Performance
– In-memory computation
– Supports multiple shuffles
– Unified data processing: batch/streaming
Implementation
Data Freshness
• End-to-end latency assessment
• Freshness per criteria
• Output - various stats
Freshness job
Design
[Diagram: Events from HDFS → Map (keys: Total, LP Customer, Event Type) → Reduce (Min, Max, Avg, Buckets, Count)]
Freshness job
Mechanism
• Driver
– Collects LP events from HDFS
• Map
– Compute freshness latencies
– Segment events per criterion by generating a composite key
• Reduce
– Compute count, min, max, avg and buckets
– Write stats to HDFS
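The composite-key idea can be sketched without Hadoop: the map step emits one (key, latency) pair per grouping criterion, and the reduce step summarizes each key. This is an in-memory illustration in plain Python, not the actual MapReduce job; the field names and the `("criterion", value)` key layout are assumptions, and bucket counting is left out for brevity.

```python
from collections import defaultdict


def map_event(event):
    """Emit (composite key, latency) once per grouping criterion."""
    latency = event["insertion_time"] - event["creation_time"]
    yield ("total", "all"), latency
    yield ("event_type", event["event_type"]), latency
    yield ("lp_customer", event["lp_customer"]), latency


def reduce_latencies(latencies):
    """Summarize all latencies that share one composite key."""
    return {
        "count": len(latencies),
        "min": min(latencies),
        "max": max(latencies),
        "avg": sum(latencies) / len(latencies),
    }


def run_job(events):
    grouped = defaultdict(list)  # stands in for the shuffle phase
    for event in events:
        for key, latency in map_event(event):
            grouped[key].append(latency)
    return {key: reduce_latencies(values) for key, values in grouped.items()}


events = [
    {"event_type": "chat", "lp_customer": "acme", "creation_time": 0, "insertion_time": 20},
    {"event_type": "chat", "lp_customer": "globex", "creation_time": 5, "insertion_time": 65},
    {"event_type": "visit", "lp_customer": "acme", "creation_time": 10, "insertion_time": 50},
]
stats = run_job(events)
print(stats[("event_type", "chat")])
```

Emitting every event under several keys lets one pass over the data produce the total, per-event-type, and per-customer statistics at once, at the cost of a larger shuffle.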
Freshness job
Output usage
Hadoop Platform
Overcoming Pitfalls
• Our data model is built over Avro
• Avro comes with schema evolution
• Avro data is stored along with its schema
• High model-modification rate
• LOBs schema changes are synchronized
Producer → Consumer
Hadoop Platform
Overcoming Pitfalls
• An MR/Spark job is compiled against a specific schema revision when using SpecificRecord
• Using GenericRecord removes the burden of recompiling each time the schema changes
THANK YOU!
We are hiring
YouTube.com/LivePersonDev
Twitter.com/LivePersonDev
Facebook.com/LivePersonDev
Slideshare.net/LivePersonDev

 
Reduce SRE Stress: Minimizing Service Downtime with Grafana, InfluxDB and Tel...
Reduce SRE Stress: Minimizing Service Downtime with Grafana, InfluxDB and Tel...Reduce SRE Stress: Minimizing Service Downtime with Grafana, InfluxDB and Tel...
Reduce SRE Stress: Minimizing Service Downtime with Grafana, InfluxDB and Tel...
 
Never Stop Exploring - Pushing the Limits of Solr: Presented by Anirudha Jadh...
Never Stop Exploring - Pushing the Limits of Solr: Presented by Anirudha Jadh...Never Stop Exploring - Pushing the Limits of Solr: Presented by Anirudha Jadh...
Never Stop Exploring - Pushing the Limits of Solr: Presented by Anirudha Jadh...
 
Advanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applicationsAdvanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applications
 
How Totango uses Apache Spark
How Totango uses Apache SparkHow Totango uses Apache Spark
How Totango uses Apache Spark
 
Sybase BAM Overview
Sybase BAM OverviewSybase BAM Overview
Sybase BAM Overview
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
 
Sql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.pptSql azure cluster dashboard public.ppt
Sql azure cluster dashboard public.ppt
 
Assessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use CasesAssessing New Databases– Translytical Use Cases
Assessing New Databases– Translytical Use Cases
 
Using Data Science for Cybersecurity
Using Data Science for CybersecurityUsing Data Science for Cybersecurity
Using Data Science for Cybersecurity
 

Growing into a proactive Data Platform

  • 2. Yaar Reuveni & Nir Hedvat Becoming a Proactive Data Platform
  • 3. Yaar Reuveni • 6 Years at LivePerson • 1 Reporting & BI • 3 Data Platform • 2 Data Platform team lead • I love to travel • And
  • 4. Nir Hedvat • Software Engineer B.Sc • 3 years as a C++ Developer at IBM Rational Rhapsody™ • 1.5 years at LivePerson • Cloud and Parallel Computing Enthusiast • Love Math and Powerlifting
  • 5. Agenda • Our Scale & Operation • Evolution in becoming proactive i. Hope & Low awareness ii. Storming & Troubleshooting iii. Fortifying iv. Internalization & Comprehension v. Being Proactive • Showcases • Implementation
  • 6. Our Scale • 2 M Daily chats • 100 M Daily monitored visitor sessions • 20 B Events per day • 2 TB Raw data per day • 2 PB Total in Hadoop clusters • Hundreds of producers * event types * consumers
  • 8. Stage 1: Hope & Low awareness We built it and it’s awesome Online producer Offline producer local files DSPT Jobs Raw Data * DSPT - Data single point of truth
  • 9. Stage 1: Hope & Low awareness We’ve got customers Dashboards Data Science Apps Reporting Data Science Data Access Ad-Hoc Queries
  • 10. Stage 2: Storming & Troubleshooting You’ve got NOC & SCS on speed dial Issues arise: • Data loss • Data delays • Partial data out of frame • Missing/faulty calculations for consumers • One producer does not send for over a week
  • 11. Stage 2: Storming & Troubleshooting You’ve got NOC & SCS on speed dial Common issue types and generators: • Hadoop ops • Production ops • Events schema • New data producers • High new features rate (LE2.0) • Data stuck in pipeline • Bugs
  • 12. Stage 3: Fortifying Every interruption derives a new protection
  • 13. Stage 3: Fortifying Every interruption derives a new protection
  • 14. Stage 3: Fortifying Every interruption derives a new protection • Monitors on jobs, failures, success rate • Monitors on service status • Simple data freshness checks e.g. measure the newest event • Measure latency of specific parts of the pipeline
  • 15. Stage 4: Internalization & Comprehension Auditing requirements • Measure principles: – Loss • How much? • Which customer? • What Type? • Where in the pipeline? – Freshness • Percentiles • Trends – Statistics • Event type count • Event per LP customer • Trends
  • 16. Stage 4: Internalization & Comprehension Auditing architecture [Diagram labels: Producer (x3), Events, Audit Events, Control, Freshness, Audit Loader, Audit Aggregator, Audit DB]
  • 17. Stage 4: Internalization & Comprehension Mechanism 1. Enrich events with audit metadata (Data + Common Header + Audit Header) 2. Send control events every X minutes (Control Event - Audit aggregation + Common Header + Audit Header)
  • 18. Stage 4: Internalization & Comprehension Mechanism [Diagram: Old Data Flow - events made of Data + Common Header. Audited Data Flow - events made of Data + Common Header + Audit Header, with Control Events (Audit aggregation + Common Header + Audit Header) sent in the same flow]
  • 19. Stage 4: Internalization & Comprehension How to measure loss? • Tag all events going through our API with an auditing header: <host_name>:<bulk_id>:<sequence_id> Where: • host_name - the logical identification of the producer server • bulk_id - an arbitrary unique number that identifies a bulk (changes every X minutes) • sequence_id - an auto-incremented persistent number used to identify missing bulks • Every X minutes send an audit control event: { eventType: AuditControlEvent, Bulks: [{bulk_id:"srv-xyz:111:97", data_tier:"shark producer", total_count:785}, {bulk_id:"srv-xyz:112:98", data_tier:"shark producer", total_count:1715}] }
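The tagging scheme above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `AuditTagger` class, its method names, the `audit_header` field and the `missing_sequence_ids` helper are hypothetical and do not come from the slides, and the sequence counter is kept in memory even though the slides call for a persistent one.

```python
from collections import Counter


class AuditTagger:
    """Hypothetical helper: tags events with <host_name>:<bulk_id>:<sequence_id>."""

    def __init__(self, host_name, data_tier, first_sequence_id=0):
        self.host_name = host_name
        self.data_tier = data_tier
        # The slides ask for a persistent counter; persistence is omitted here.
        self._next_sequence_id = first_sequence_id
        self._counts = Counter()
        self._header = None

    def start_bulk(self, bulk_id):
        """Open a new bulk; meant to be called every X minutes."""
        self._header = f"{self.host_name}:{bulk_id}:{self._next_sequence_id}"
        self._next_sequence_id += 1

    def tag(self, event):
        """Return a copy of the event carrying the current audit header."""
        if self._header is None:
            raise RuntimeError("start_bulk() must be called before tag()")
        self._counts[self._header] += 1
        return {**event, "audit_header": self._header}

    def control_event(self):
        """Build a control event for all bulks counted so far, then reset."""
        bulks = [
            {"bulk_id": header, "data_tier": self.data_tier, "total_count": count}
            for header, count in self._counts.items()
        ]
        self._counts.clear()
        return {"eventType": "AuditControlEvent", "Bulks": bulks}


def missing_sequence_ids(seen_ids):
    """Gaps in the sequence ids point at bulks that never arrived."""
    seen = set(seen_ids)
    if not seen:
        return []
    return [i for i in range(min(seen), max(seen) + 1) if i not in seen]


tagger = AuditTagger("srv-xyz", "shark producer", first_sequence_id=97)
tagger.start_bulk(111)
tagged = [tagger.tag({"type": "chat"}) for _ in range(3)]
tagger.start_bulk(112)
tagged.append(tagger.tag({"type": "visit"}))
control = tagger.control_event()
```

Counting per header on the producer side is what lets a downstream tier compare its own per-bulk counts against `total_count`.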
  • 20. Stage 4: Internalization & Comprehension What’s next? • Immediate gain: enables researching loss directly on the raw data Next: • Count events per auditing bulk • Load into a DB for dashboarding:
      Audit metadata      Data Tier   Insertion time   Events count
      srv-xyz:1a2b3c:25   Producer    08:34            750
      srv-xyz:1a2b3c:25   HDFS        09:05            405
      srv-xyz:1a2b3c:25   HDFS        10:13            250
    In this example, assume you look at the table after 11:34 and that anything taking more than 3 hours counts as loss. Server srv-xyz created 750 events in bulk 1a2b3c, and only 405 + 250 = 655 of them arrived within 3 hours, so we detect a loss of 95 events from this server.
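The arithmetic in this example can be reproduced with a short Python sketch. The rows are the ones from the table; the function name `detect_loss`, its parameters and the tuple layout are assumptions made for illustration, and times are treated as same-day `HH:MM` values.

```python
from datetime import datetime, timedelta

# Rows from the example table:
# (audit metadata, data tier, insertion time, events count)
AUDIT_ROWS = [
    ("srv-xyz:1a2b3c:25", "Producer", "08:34", 750),
    ("srv-xyz:1a2b3c:25", "HDFS", "09:05", 405),
    ("srv-xyz:1a2b3c:25", "HDFS", "10:13", 250),
]


def parse_time(hhmm):
    # Same-day times only; a real table would carry full timestamps.
    return datetime.strptime(hhmm, "%H:%M")


def detect_loss(rows, source_tier="Producer", target_tier="HDFS",
                max_delay=timedelta(hours=3)):
    """Return {audit metadata: events created but not arrived within max_delay}."""
    produced = {}  # audit metadata -> (insertion time, count)
    for meta, tier, ts, count in rows:
        if tier == source_tier:
            produced[meta] = (parse_time(ts), count)

    arrived = {meta: 0 for meta in produced}
    for meta, tier, ts, count in rows:
        if tier == target_tier and meta in produced:
            # Only arrivals inside the allowed window count as delivered.
            if parse_time(ts) - produced[meta][0] <= max_delay:
                arrived[meta] += count

    return {meta: produced[meta][1] - arrived[meta] for meta in produced}


loss = detect_loss(AUDIT_ROWS)
# A row arriving after the 3-hour window does not reduce the reported loss.
late_rows = AUDIT_ROWS + [("srv-xyz:1a2b3c:25", "HDFS", "12:00", 95)]
loss_with_late_row = detect_loss(late_rows)
```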
  • 21. Stage 4: Internalization & Comprehension How to measure freshness? • Run incrementally on the raw data • Group events by – Total – Event type – LP customer • Per event, calculate insertion time - creation time • Per group: – Total count – Min, max & average – Count into time buckets (0-30; 30-60; 60-120; 120-∞)
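A minimal sketch of the per-group statistics listed above, for one group of events. The slides do not state the unit of the time buckets, so the sketch just uses whatever unit the timestamps are in; the field names `creation_time` and `insertion_time` and the function names are assumptions.

```python
from collections import Counter

# Bucket boundaries from the slide: 0-30; 30-60; 60-120; 120-∞ (unit not stated).
BUCKET_UPPER_BOUNDS = [30, 60, 120]
BUCKET_LABELS = ["0-30", "30-60", "60-120", "120-inf"]


def bucket_for(latency):
    """Return the label of the first bucket whose upper bound exceeds the latency."""
    for upper, label in zip(BUCKET_UPPER_BOUNDS, BUCKET_LABELS):
        if latency < upper:
            return label
    return BUCKET_LABELS[-1]


def freshness_stats(events):
    """Count, min, max, average and bucket counts for one group of events."""
    latencies = [e["insertion_time"] - e["creation_time"] for e in events]
    if not latencies:
        return None
    buckets = Counter(bucket_for(latency) for latency in latencies)
    return {
        "count": len(latencies),
        "min": min(latencies),
        "max": max(latencies),
        "avg": sum(latencies) / len(latencies),
        "buckets": {label: buckets.get(label, 0) for label in BUCKET_LABELS},
    }


events = [
    {"creation_time": 0, "insertion_time": 10},
    {"creation_time": 5, "insertion_time": 50},
    {"creation_time": 20, "insertion_time": 150},
]
stats = freshness_stats(events)
```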
  • 22. Stage 5: Being Proactive Tools - loss dashboard
  • 23. Stage 5: Being Proactive Tools - loss detailed dashboard
  • 24. Stage 5: Being Proactive Tools - loss trends
  • 25. Stage 5: Being Proactive Tools - freshness
  • 26. Stage 5: Being Proactive Tools - freshness
  • 27. Stage 5: Being Proactive Tools - data statistics
  • 28. Showcase I Bug in a new producer
  • 29. Showcase II Deployment issue • Constant loss • Only in one farm • Depends on traffic • Only a specific producer type • From all of its nodes
  • 30. Showcase III Consumer jobs issues • Our auditing detected a loss in Alpha • Data stuck in a job failure dir • Functional monitoring missed it • We streamed the stuck data
  • 31. Showcase IV Producer issues • Offline producer gets stuck • Functional monitoring misses it
  • 34. Implementation Audit Loader • Storm topology • Load audit events from Kafka to MySQL [Diagram: Audit Events, Audit Loader, Audit DB]
      Bulk      Tier   TS      Count
      xyz:123   WRPA   08:34   750
      xyz:123   DSPT   09:05   405
      xyz:123   DSPT   10:13   250
  • 36. Implementation Audit Aggregator • Load data from HDFS • Aggregate events according to audit metadata • Save aggregated audit data to MySQL • Spark implementation
  • 37. Audit Aggregator job First Generation [Diagram: data from HDFS is aggregated per bulk (∑ #1 = N1, ∑ #2 = N2, ∑ #3 = N3), then Collect & Save; labels: Data, DB, Offset, ZooKeeper]
  • 38. Audit Aggregator job Overcoming Pitfalls • Our jobs work incrementally or manually • Offset management by ZooKeeper • A failure during the saving stage leads to a lost offset • Saving data and offset on the same stream
  • 39. Audit Aggregator job Revised Design [Diagram: data from HDFS is aggregated per bulk (∑ #1 = N1, ∑ #2 = N2, ∑ #3 = N3), then Collect & Save; labels: Data, Offset, DB]
      Bulk      Tier   TS      Count
      xyz:123   WRPA   08:34   750
      xyz:123   DSPT   09:05   405
      xyz:123   DSPT   10:13   250
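One way to read "saving data and offset on the same stream" is that both are written in a single transaction, so a failure during saving cannot leave them out of step. The sketch below illustrates that reading only. It uses Python's built-in `sqlite3` as a stand-in for MySQL, and the table names, column names and `fail_before_commit` switch are hypothetical.

```python
import sqlite3


def init_db(conn):
    # Hypothetical schema: one table for audit counts, one for the job offset.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit "
        "(bulk TEXT, tier TEXT, ts TEXT, event_count INTEGER)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_offset (job TEXT PRIMARY KEY, value INTEGER)"
    )


def save_batch(conn, job, rows, new_offset, fail_before_commit=False):
    """Write aggregated rows and the new offset in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany("INSERT INTO audit VALUES (?, ?, ?, ?)", rows)
        if fail_before_commit:
            raise RuntimeError("simulated failure during the saving stage")
        conn.execute(
            "INSERT OR REPLACE INTO job_offset VALUES (?, ?)", (job, new_offset)
        )


def read_offset(conn, job):
    row = conn.execute(
        "SELECT value FROM job_offset WHERE job = ?", (job,)
    ).fetchone()
    return row[0] if row else 0


conn = sqlite3.connect(":memory:")
init_db(conn)
save_batch(conn, "audit-aggregator", [("xyz:123", "DSPT", "09:05", 405)], new_offset=1)
try:
    # The second batch fails mid-save: neither its rows nor its offset persist.
    save_batch(conn, "audit-aggregator", [("xyz:123", "DSPT", "10:13", 250)],
               new_offset=2, fail_before_commit=True)
except RuntimeError:
    pass
row_count = conn.execute("SELECT COUNT(*) FROM audit").fetchone()[0]
offset = read_offset(conn, "audit-aggregator")
```

After the simulated failure the job can rerun from the stored offset without double-counting or skipping the failed batch.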
  • 40. Audit Aggregator job Why Spark • Precedent - Spark Streaming for online auditing • We see our future with Spark • Cluster utilization • Performance – In-memory computation – Supports multiple shuffles – Unified data processing: batch/streaming
  • 42. Implementation Data Freshness • End-to-end latency assessment • Freshness per criteria • Output - various stats
  • 43. Freshness job Design [Diagram: Events from HDFS go through Map and Reduce; grouped by Total, LP Customer and Event Type; outputs Count, Min, Max, Avg and Buckets]
  • 44. Freshness job Mechanism • Driver – Collects LP events from HDFS • Map – Compute freshness latencies – Segment events per criteria by generating a composite key • Reduce – Compute count, min, max, avg and buckets – Write stats to HDFS
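The composite-key step can be sketched in plain Python, with an in-memory dictionary standing in for the MapReduce shuffle. This is a sketch under assumptions: the key layout `(criterion, value)`, the field names and the function names are illustrative, and bucket counting and the HDFS write are left out for brevity.

```python
from collections import defaultdict
from functools import reduce


def map_event(event):
    """Emit one (composite key, partial aggregate) pair per grouping criterion."""
    latency = event["insertion_time"] - event["creation_time"]
    partial = (1, latency, latency, latency)  # count, min, max, sum
    yield ("total", "*"), partial
    yield ("event_type", event["event_type"]), partial
    yield ("lp_customer", event["lp_customer"]), partial


def merge(a, b):
    """Combine two partial aggregates; associative, so usable as a combiner too."""
    return (a[0] + b[0], min(a[1], b[1]), max(a[2], b[2]), a[3] + b[3])


def run_job(events):
    shuffled = defaultdict(list)  # stands in for the shuffle phase
    for event in events:
        for key, partial in map_event(event):
            shuffled[key].append(partial)

    result = {}
    for key, partials in shuffled.items():
        count, low, high, total = reduce(merge, partials)
        result[key] = {"count": count, "min": low, "max": high, "avg": total / count}
    return result


events = [
    {"event_type": "chat", "lp_customer": "A", "creation_time": 0, "insertion_time": 10},
    {"event_type": "chat", "lp_customer": "B", "creation_time": 0, "insertion_time": 30},
    {"event_type": "visit", "lp_customer": "A", "creation_time": 5, "insertion_time": 25},
]
stats = run_job(events)
```

Emitting the same latency under several composite keys lets a single pass over the data produce the total, per-event-type and per-customer statistics.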
  • 46. Hadoop Platform Overcoming Pitfalls • Our data model is built over Avro • Avro comes with schema evolution • Avro data is stored along with its schema • High model-modification rate • LOBs schema changes are synchronized Producer → Consumer
  • 47. Hadoop Platform Overcoming Pitfalls • An MR/Spark job is compiled against a specific schema revision when using SpecificRecord • Using GenericRecord removes the burden of recompiling each time the schema changes