Simply Business - Near Real Time Event Processing

A presentation explaining the Simply Business analytics stack and how the team uses Snowplow Analytics to track events in near real time

Published in: Technology

  1. NRT Event Processing
  2. Outline
     • Introduction
     • Our Snowplow Setup
     • Example NRT Use Cases
         • Radio Campaign
         • Telephony System
  3. Simply Business
     • Largest UK business insurance provider
     • More than 400,000 policy holders
     • Using BML, tech and data to disrupt the business insurance market
  4. Data ’n’ Analytics
     • 5 Data Engineers
     • 3 Business Intelligence Developers
     • 3 Data Analysts
     • 1 Data Scientist
     • 1 Director of Data Science
     • And hiring! :-)
  5. Our Snowplow Setup
  6. Snowplow Setup
     Pipeline stages: Trackers, Collector, Enrichment, Modeling, Storage
     • Trackers, collectors and storage are 100% upstream Snowplow
     • Enrichment:
         • Spark apps that use scala-common-enrich as a library
         • We add our own enrichments after the default ones
         • We perform NRT identity stitching and sessionization
     • Modeling: mix of Spark and SQL jobs
     • Storage: Spark apps that use scala-hadoop-shred as a library
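
The slide above mentions NRT sessionization but does not show how it is done. As an illustration only, here is a minimal plain-Python sketch of one common approach, splitting each user's events on inactivity gaps. The 30-minute timeout, the `Event` fields and the `sessionize` function are assumptions made for this example, not details from the deck, and a production version would run inside the Spark enrichment job rather than on an in-memory list.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Assumed inactivity gap; the slides do not state the timeout actually used.
SESSION_TIMEOUT_S = 30 * 60


@dataclass(frozen=True)
class Event:
    user_id: str      # hypothetical identity produced by identity stitching
    timestamp: float  # seconds since epoch
    page: str


def sessionize(events: Iterable[Event]) -> Dict[str, List[List[Event]]]:
    """Group each user's events into sessions split on inactivity gaps."""
    sessions: Dict[str, List[List[Event]]] = {}
    for ev in sorted(events, key=lambda e: (e.user_id, e.timestamp)):
        user_sessions = sessions.setdefault(ev.user_id, [])
        last_session = user_sessions[-1] if user_sessions else None
        if last_session and ev.timestamp - last_session[-1].timestamp <= SESSION_TIMEOUT_S:
            # Still within the timeout: extend the current session.
            last_session.append(ev)
        else:
            # First event for this user, or the gap was too long: start a new session.
            user_sessions.append([ev])
    return sessions
```

Sorting by user and timestamp keeps the sketch simple; a streaming job would instead keep the last-seen timestamp per user as state.
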
  7. Why Spark?
     • We wanted a near real-time pipeline, but KCL was too rigid:
         • Provision, set up and monitor the machines
         • Configuration is difficult for complex DAGs
     • In contrast, Spark:
         • Once set up, the cluster is a PaaS
         • Allows streaming, batch, ML and graph workloads
         • Allows analysts and data scientists to use Python
  8. Radio Campaign
  9. The Radio Campaign
     • We’re running a radio campaign in Birmingham, Manchester and London
     • People who get a quote starting from our radio landing pages get a £25 discount
  10. The Banner
     • The questionnaire to get quotes can be quite long to complete
     • We wanted to reassure our customers that they would get the discount
     • We wanted to display a banner at the top of every page of the questionnaire
  11. The Banner
  12. Our Infrastructure
     Diagram components: Quoting App, Scala Stream Collector, Kinesis, Spark Stream NRT Enrichment, MongoDB, Visitor API (HTTP)
     On average, it takes 2.5 s for an event to be available in the Visitor API
  13. Benefits of NRT Snowplow
     • Our quoting app does not need to know about marketing, user landing pages, etc.
     • Our Mongo table with active sessions’ events becomes a view of our event log
     • Can be reused for many other use cases: analytics on read!
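
To make the "analytics on read" idea concrete, the sketch below keeps a per-session list of enriched events and answers the banner question only when asked. It is a hedged illustration: the in-memory `ActiveSessionView` stands in for the MongoDB collection, and the landing-page paths, field names and `should_show_discount_banner` function are all hypothetical, since the slides do not describe the Visitor API's actual interface.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical landing-page paths; the real URLs are not given in the slides.
RADIO_LANDING_PAGES = {"/radio/birmingham", "/radio/manchester", "/radio/london"}


class ActiveSessionView:
    """In-memory stand-in for the MongoDB collection of active sessions' events."""

    def __init__(self) -> None:
        self._events_by_session: Dict[str, List[dict]] = defaultdict(list)

    def apply(self, event: dict) -> None:
        # Appending every enriched event keeps the view a projection of the event log.
        self._events_by_session[event["session_id"]].append(event)

    def events(self, session_id: str) -> List[dict]:
        # .get avoids creating an empty entry for unknown sessions.
        return list(self._events_by_session.get(session_id, []))


def should_show_discount_banner(view: ActiveSessionView, session_id: str) -> bool:
    """Decide at read time whether the session started on a radio landing page."""
    events = sorted(view.events(session_id), key=lambda e: e["timestamp"])
    return bool(events) and events[0]["page"] in RADIO_LANDING_PAGES
```

Because the rule lives in the read path, the quoting app only asks a yes/no question, and new rules can be added over the same stored events without changing what is written.
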
  14. Telephony System
  15. Telephony System
     • We have a call center in Northampton with around 200 consultants
     • We used an off-the-shelf telephony system
     • It worked well for a long time, but:
         • Was not very well integrated with our systems
         • Quite rigid, we couldn’t adapt it to all our needs
         • We had daily reports and they contained aggregated data
  16. Telephony System
     • We decided to replace it with a home-grown, Twilio-based solution
     • Components:
         • Contact Strategy Manager
         • Voice Channel Manager
     • Communication is event-based
     • We transform those events into Snowplow’s unstructured events
     • A Spark Streaming app inserts the events into Redshift every 2 minutes
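
As a rough illustration of the translation step, the sketch below wraps a telephony payload in the self-describing JSON envelope that Snowplow uses for unstructured events. The envelope schema URI is the standard Snowplow one; the vendor, event name, payload fields and the `to_unstruct_event` helper are hypothetical, because the deck does not show the team's actual schemas or translator code.

```python
import json
from typing import Any, Dict

# Envelope schema Snowplow uses for unstructured (self-describing) events.
UNSTRUCT_EVENT_SCHEMA = (
    "iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0"
)


def to_unstruct_event(vendor: str, name: str, version: str,
                      payload: Dict[str, Any]) -> str:
    """Wrap a telephony payload as a Snowplow unstructured-event JSON string."""
    # Inner self-describing JSON: names the event's own schema (hypothetical here).
    inner = {"schema": f"iglu:{vendor}/{name}/jsonschema/{version}", "data": payload}
    # Outer envelope: marks the whole thing as an unstructured event.
    return json.dumps({"schema": UNSTRUCT_EVENT_SCHEMA, "data": inner}, sort_keys=True)


# Example with made-up vendor, event name and fields.
example = to_unstruct_event(
    "com.example", "call_answered", "1-0-0",
    {"call_id": "c1", "consultant_id": "42"},
)
```

Once events share this envelope, the same collector, enrichment and shredding stages can handle them alongside web events.
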
  17. The Infrastructure
     Diagram components: Contact Strategy Manager, Voice Channel Manager, Event Translator, Scala Stream Collector, Kinesis, Spark Stream NRT Enrichment, Kinesis, Spark Stream Shredder, Redshift, Looker
  18. Events
     Example call when viewed as a sequence of events (figure)
  19. Benefits of NRT Snowplow
     • Event Sourcing is great for reporting and analytics: ensures that data quality remains high
     • Team managers now have an NRT view of what teams are doing
     • You can aggregate and drill down on the data as appropriate
     • Leveraging our data platform: Snowplow pipeline, Redshift & Looker
     • Leveraging our existing skills: everyone knows how to use Looker
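
The event-sourcing benefit can be sketched as a fold over the raw call events: any aggregate is derived from the event sequence, so the same data supports both team-level totals and per-call drill-down. The event types (`call_answered`, `call_ended`), field names and the `talk_time_by_team` function below are assumptions for illustration; in the deck this kind of aggregation happens in Redshift and Looker rather than in Python.

```python
from collections import defaultdict
from typing import Dict, Iterable


def talk_time_by_team(events: Iterable[dict]) -> Dict[str, float]:
    """Sum talk time per team by pairing answered/ended events for each call."""
    answered = {}                 # call_id -> the event that opened the call
    totals = defaultdict(float)   # team -> seconds of talk time
    for ev in sorted(events, key=lambda e: e["timestamp"]):
        if ev["type"] == "call_answered":
            answered[ev["call_id"]] = ev
        elif ev["type"] == "call_ended" and ev["call_id"] in answered:
            start = answered.pop(ev["call_id"])
            totals[start["team"]] += ev["timestamp"] - start["timestamp"]
        # Ended events with no matching answer are ignored in this sketch.
    return dict(totals)
```
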
  20. Sum Up
  21. The Infrastructure
     Diagram components: Scala Stream Collector, Kinesis, Spark Stream NRT Enrichment, MongoDB, Visitor API, Applications, Kinesis, Spark Stream Shredder, Redshift, Looker
  22. NRT Benefits
     • We can dynamically alter the website while the user is still using it
     • We can provide insights on live processes
     • Multiple uses to improve conversion:
         • Instant inclusion/exclusion from remarketing lists
         • Abandoned cart emails/calls
         • Social proofing (3 more people are also watching…)
         • …
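
The social-proofing use case reduces to counting distinct other visitors seen on the same page within a recent time window. The sketch below shows that calculation over an in-memory list of page views; the tuple layout, the 5-minute default window and the `other_active_viewers` function are assumptions for this example, not something described in the slides.

```python
from typing import Iterable, Tuple

# (visitor_id, page, timestamp in seconds); a hypothetical shape for page-view events.
PageView = Tuple[str, str, float]


def other_active_viewers(page_views: Iterable[PageView], page: str,
                         visitor_id: str, now: float,
                         window_s: float = 300.0) -> int:
    """Count distinct other visitors who viewed `page` within the last `window_s` seconds."""
    viewers = {
        vid
        for vid, viewed_page, ts in page_views
        if viewed_page == page
        and vid != visitor_id           # do not count the visitor asking
        and 0 <= now - ts <= window_s   # only recent, non-future views
    }
    return len(viewers)
```
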
  23. Questions?
     @dani_sola
     dani.sola@simplybusiness.co.uk