NRT Event Processing
Outline
•  Introduction
•  Our Snowplow Setup
•  Example NRT Use Cases
•  Radio Campaign
•  Telephony System
Simply Business
•  Largest UK business insurance provider
•  More than 400,000 policy holders
•  Using BML, tech and data to disrupt the
business insurance market
Data ’n’ Analytics
•  5 Data Engineers
•  3 Business Intelligence Developers
•  3 Data Analysts
•  1 Data Scientist
•  1 Director of Data Science
•  And hiring! :-)
Our Snowplow Setup
Snowplow Setup
Trackers → Collector → Enrichment → Modeling → Storage
•  Trackers, collectors and storage are 100% upstream Snowplow
•  Enrichment:
•  Spark apps that use scala-common-enrich as a library
•  We add our own enrichments after the default ones
•  We perform NRT identity stitching and sessionization
•  Modeling: mix of Spark and SQL jobs
•  Storage: Spark apps that use scala-hadoop-shred as a library
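A minimal sketch of what sessionization can look like, written in plain Python rather than as a Spark app. The 30-minute inactivity gap, the `Event` fields and the `sessionize` helper are assumptions for illustration only; the slides do not describe the actual rules used in the enrichment step.

```python
from dataclasses import dataclass
from typing import Dict, List

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout, not taken from the slides


@dataclass
class Event:
    user_id: str      # hypothetical identifier produced by identity stitching
    timestamp: float  # seconds since epoch
    name: str


def sessionize(events: List[Event]) -> Dict[str, List[List[Event]]]:
    """Group each user's events into sessions, splitting on inactivity gaps."""
    sessions: Dict[str, List[List[Event]]] = {}
    for event in sorted(events, key=lambda e: (e.user_id, e.timestamp)):
        user_sessions = sessions.setdefault(event.user_id, [])
        last_seen = user_sessions[-1][-1].timestamp if user_sessions else None
        if last_seen is not None and event.timestamp - last_seen <= SESSION_GAP_SECONDS:
            user_sessions[-1].append(event)   # still within the current session
        else:
            user_sessions.append([event])     # first event, or gap too long: new session
    return sessions
```

In a streaming job the same rule would be applied incrementally per micro-batch, with the last-seen timestamp per user kept as state.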
Why?
•  We wanted a near real-time pipeline, but KCL was too rigid:
•  Provision, set up and monitor the machines
•  Configuration is difficult for complex DAGs
•  In contrast, Spark:
•  Once set up, the cluster is a PaaS
•  Allows streaming, batch, ML and graph workloads
•  Allows analysts and data scientists to use Python
Radio Campaign
The Radio Campaign
•  We’re running a radio campaign in Birmingham, Manchester and
London
•  People who get a quote starting from our radio landing pages get a
£25 discount
The Banner
•  The questionnaire to get quotes can be quite long to complete
•  We wanted to reassure our customers that they would get the
discount
•  We wanted to display a banner at the top through all the pages of
the questionnaire
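The banner decision can be expressed as a query over the visitor's current session events. A sketch, assuming page-view events carry `event`, `timestamp` (ISO 8601, same format throughout) and `page_url` fields, and that radio landing pages share a hypothetical `/radio/` path prefix; none of these details come from the slides.

```python
from typing import Dict, List
from urllib.parse import urlparse

RADIO_LANDING_PREFIX = "/radio/"  # hypothetical path prefix for radio landing pages


def started_on_radio_landing_page(session_events: List[Dict[str, str]]) -> bool:
    """Return True if the first page view of the session was a radio landing page."""
    page_views = [e for e in session_events if e.get("event") == "page_view"]
    if not page_views:
        return False
    # ISO 8601 timestamps in a single format sort correctly as strings.
    first = min(page_views, key=lambda e: e["timestamp"])
    return urlparse(first["page_url"]).path.startswith(RADIO_LANDING_PREFIX)
```

The questionnaire pages would then show the banner whenever this check is true for the current session.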
The Banner
Our Infrastructure
[Diagram components: Spark Stream NRT Enrichment, Scala Stream Collector, Kinesis, MongoDB, Visitor API, Quoting App, HTTP]
On average, it takes 2.5s for an event to be available in the Visitor API
Benefits of NRT Snowplow
•  Our quoting app does not need to know about marketing, user
landing pages, etc.
•  Our Mongo table with active sessions’ events becomes a view of our
event log
•  Can be reused for many other use cases: analytics on read!
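The "view of our event log" idea can be sketched with an in-memory stand-in for the store of active sessions' events. The class name, the `session_id` field and the `read` helper are hypothetical; the point is only that the store keeps raw events and each consumer decides at read time what to compute from them.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, object]


class ActiveSessionsView:
    """In-memory stand-in for the store holding active sessions' events."""

    def __init__(self) -> None:
        self._events_by_session: Dict[str, List[Event]] = defaultdict(list)

    def apply(self, event: Event) -> None:
        """Append an enriched event to its session; the view is derived only from the log."""
        self._events_by_session[str(event["session_id"])].append(event)

    def read(self, session_id: str, question: Callable[[List[Event]], object]) -> object:
        """Answer a question at read time by running it over the session's raw events."""
        return question(list(self._events_by_session.get(session_id, [])))


view = ActiveSessionsView()
view.apply({"session_id": "s1", "event": "page_view"})
view.apply({"session_id": "s1", "event": "quote_started"})
events_so_far = view.read("s1", len)  # one possible question: how many events so far?
```

Because nothing is pre-aggregated, a new use case only needs a new `question` function, not a change to the pipeline.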
Telephony System
Telephony System
•  We have a call center in Northampton with around 200 consultants
•  We used an off-the-shelf telephony system
•  It worked well for a long time, but:
•  Was not very well integrated with our systems
•  Quite rigid, we couldn’t adapt it to all our needs
•  We only had daily reports, and they contained only aggregated data
Telephony System
•  We decided to replace it with a home-grown, Twilio-based solution
•  Components:
•  Contact Strategy Manager
•  Voice Channel Manager
•  Communication is event-based
•  We transform those events into Snowplow’s unstructured events
•  Spark Streaming app to insert the events into Redshift every 2 min
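A sketch of what the translation step might produce. Snowplow unstructured events are self-describing JSON: an outer `unstruct_event` envelope wrapping an inner schema reference and payload. The vendor `com.example.telephony`, the event name and the payload fields below are placeholders, not the real schemas used by the telephony system.

```python
import json
from typing import Any, Dict

UNSTRUCT_EVENT_SCHEMA = "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0"
HYPOTHETICAL_VENDOR = "com.example.telephony"  # placeholder vendor, not from the slides


def to_unstructured_event(name: str, payload: Dict[str, Any], version: str = "1-0-0") -> str:
    """Wrap a domain event in Snowplow's self-describing JSON envelope."""
    inner = {
        "schema": f"iglu:{HYPOTHETICAL_VENDOR}/{name}/jsonschema/{version}",
        "data": payload,
    }
    return json.dumps({"schema": UNSTRUCT_EVENT_SCHEMA, "data": inner}, sort_keys=True)


# Example: a hypothetical "call_answered" event emitted by one of the components.
message = to_unstructured_event("call_answered", {"call_id": "c-1", "consultant_id": "42"})
```

Once wrapped this way, telephony events can flow through the same collector, enrichment and shredding steps as web events.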
The Infrastructure
[Diagram components: Contact Strategy Manager, Voice Channel Manager, Event Translator, Scala Stream Collector, Kinesis, Spark Stream NRT Enrichment, Kinesis, Spark Stream Shredder, Redshift, Looker]
Events
[Figure: example call viewed as a sequence of events]
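To illustrate the idea, a call can be represented as an ordered list of events from which durations are derived. The event names and timestamps below are hypothetical and are not taken from the original figure.

```python
from typing import Dict, List, Tuple

# Hypothetical events for one call: (seconds since the call started, event name).
call_events: List[Tuple[float, str]] = [
    (0.0, "call_queued"),
    (4.0, "call_ringing"),
    (12.0, "call_answered"),
    (192.0, "call_ended"),
]


def durations_between(events: List[Tuple[float, str]]) -> Dict[str, float]:
    """Return, for each event except the last, the time until the next event."""
    ordered = sorted(events)
    return {name: nxt[0] - ts for (ts, name), nxt in zip(ordered, ordered[1:])}


durations = durations_between(call_events)
```

Here `durations["call_answered"]` is the talk time, computed from the raw events rather than stored as a pre-aggregated figure.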
Benefits of NRT Snowplow
•  Event Sourcing is great for reporting and analytics: ensures that
data quality remains high
•  Team managers now have an NRT view of what teams are doing
•  You can aggregate and drill down on the data as appropriate
•  Leveraging our data platform: Snowplow pipeline, Redshift & Looker
•  Leveraging our existing skills: everyone knows how to use Looker
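Aggregating and drilling down on event-level data can be sketched as the same counting function applied at different granularities. The `team` and `consultant` fields are hypothetical; in the setup described above this kind of query would run in Looker over Redshift rather than in Python.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

CallEvent = Dict[str, str]


def count_events(events: Iterable[CallEvent], by: str, team: Optional[str] = None) -> Counter:
    """Count events grouped by the `by` field, optionally restricted to one team."""
    return Counter(e[by] for e in events if team is None or e["team"] == team)


sample = [
    {"team": "A", "consultant": "ann", "event": "call_answered"},
    {"team": "A", "consultant": "bob", "event": "call_answered"},
    {"team": "A", "consultant": "ann", "event": "call_answered"},
    {"team": "B", "consultant": "cat", "event": "call_answered"},
]
per_team = count_events(sample, by="team")                        # aggregate view
team_a_detail = count_events(sample, by="consultant", team="A")   # drill down into team A
```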
Sum Up
The Infrastructure
[Diagram components: Applications, Scala Stream Collector, Kinesis, Spark Stream NRT Enrichment, MongoDB, Visitor API, Kinesis, Spark Stream Shredder, Redshift, Looker]
NRT Benefits
•  We can dynamically alter the website while the user is still using it
•  We can provide insights on live processes
•  Multiple uses to improve conversion:
•  Instant inclusion/exclusion from remarketing lists
•  Abandoned cart emails/calls
•  Social proofing (3 more people are also watching…)
•  …
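As one example of these uses, a social-proofing counter can be computed from recent page-view events. A sketch under assumptions: views arrive as `(visitor_id, page, timestamp)` tuples and the five-minute default window is arbitrary; neither comes from the slides.

```python
from typing import Iterable, Tuple

PageView = Tuple[str, str, float]  # (visitor_id, page, timestamp in seconds)


def other_recent_viewers(
    views: Iterable[PageView],
    page: str,
    visitor_id: str,
    now: float,
    window_seconds: float = 300.0,  # assumed window, chosen for illustration
) -> int:
    """Count distinct other visitors who viewed `page` within the last `window_seconds`."""
    recent = {
        vid
        for vid, viewed_page, ts in views
        if viewed_page == page and vid != visitor_id and 0 <= now - ts <= window_seconds
    }
    return len(recent)
```

The result would feed a message such as "N more people are also watching", which only makes sense if the underlying events are available within seconds.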
Questions?
@dani_sola
dani.sola@simplybusiness.co.uk
