Capturing online customer data to create better insights and targeted actions using Snowplow - SDU presentation
Presentation Snowplow
Meetup
19-05-2016 Page 1
Snowplow Meetup
Sander Knol & Tamara de Heij
19th May, 2016
Who are we?
SDU is a publisher that supplies current information on law and regulations to
lawyers, tax experts, policy makers and other legal professionals
Traditional company in transition
300+ employees
We believe in creating content and products tailored to the wishes of our customers, because
progress is different for everybody
Both off- and online content/products
Why did we want this?
IT
• Ownership of data
• Open generic tools (no vendor lock-in)
• Ability to give support internally and not be reliant on external suppliers
Marketing
• Improving the customer journey
• Insights into product use
• Future wish: reacting in real time to triggers in the market
Marketing intelligence
• Insights into acquisition – development – retention – winback
• Ask and answer business questions
• Integration of customer behavior in the marketing database
E-commerce
• Integration of offline and online
• In-depth analytical possibilities on top of Google Analytics
• Optimal mix of advertising budget
What steps did we take?
Develop Powerpitch
Longlist
Shortlist
Choice
Management decision based on PAP
Implementation in POC
Transfer to organisation
Proof in use cases
Learning
Delivering the Intelligence Platform: Snowplow + Spark
• Implementing Snowplow in the cloud
• Implementing Apache Spark in the cloud
• In-cloud database with all the captured data
• Alignment with Google Universal Analytics
The Delivered Intelligence Platform Using Snowplow and Spark
[Diagram labels: Behavioral data; Click data; Capture and store data; Analyse the data]
The Delivered Intelligence Platform – Alignment with Google Universal Analytics
Intelligence platform - Snowplow / Spark
• Unlimited external data
• Advanced reporting through tools
• Advanced Machine Learning options
• Customer id + fingerprint + IP
• Full export options
Universal Analytics
• Limited external data
• Slice and dice in frontend user system
• No machine learning options
• Upload a customer id in a dimension
• Limited export options
Planning: 6-week Proof of Concept (POC)
Week 1
• Security certificates
• First (generic) tags and triggers in GTM
Week 2
• Second batch of tags and triggers in GTM
• Test of the Snowplow data and first exploratory data analysis (EDA)
Week 3
• Implementation of Databricks / Spark
• Setting up the connection to Snowplow S3 and Redshift
Week 4
• Start of use cases
Week 5
• Finalization of use cases
• Budget calculations for future tools (not so straightforward with cloud computing)
Week 6
• Wrap up project
• End presentation
What were our Technical learnings / findings
• Security certifications in AWS: IT expertise with experience in network and AWS
• Account structure in AWS: using multiple accounts is good for governance, but more complex in use (whitelisting IP)
• Completeness of the tracking: data collection through GTM (= browser side) is not 100% complete. Neither is GA.
• Combining off- and online data: implement a key in the data layer. You need web developers.
• Complex Google Analytics implementation: either start with a clean implementation, or plan accordingly
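The point about implementing a key in the data layer can be illustrated with a minimal sketch of how such a key would let web events be combined with offline records. The field name `customer_key`, the `offline_` prefix and the sample data are assumptions made for illustration; they are not taken from the Sdu implementation.

```python
def attach_offline_data(web_events, offline_records, key="customer_key"):
    """Join web events to offline customer records on a shared key.

    `customer_key` is a hypothetical field name standing in for whatever
    key is pushed into the data layer; it is not taken from the slides.
    """
    offline_by_key = {record[key]: record for record in offline_records}
    matched, unmatched = [], []
    for event in web_events:
        record = offline_by_key.get(event.get(key))
        if record is None:
            # No key on the event, or no offline match: keep it for inspection.
            unmatched.append(event)
            continue
        enriched = dict(event)
        for name, value in record.items():
            if name != key:
                enriched["offline_" + name] = value
        matched.append(enriched)
    return matched, unmatched


# Hypothetical sample data, for illustration only.
web_events = [
    {"customer_key": "C1", "page": "/product/a"},
    {"customer_key": None, "page": "/product/b"},
    {"customer_key": "C9", "page": "/service"},
]
offline_records = [{"customer_key": "C1", "segment": "tax"}]

matched, unmatched = attach_offline_data(web_events, offline_records)
print(matched)
print(len(unmatched))
```

Keeping the unmatched events separate makes it easy to see how much of the browser-side tracking arrives without a usable key.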
Use Case 1: The Correlation Between Site Visits and Products Put in the Basket
• Products in the lower right are visited frequently, but are not often added to the basket.
• Products in the upper left are not frequently visited, but are often added to the basket.
• Is the price of some products too high or too low?
• Are pages difficult to find?
• Is there a difference between our high-value and low-value customers?
[Slide panels: Insights, Implications, Information]
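As a rough sketch of the comparison behind this use case, the snippet below counts views and add-to-basket events per product and flags the two groups described above. The event names (`view`, `add_to_basket`), the product ids and the thresholds are all hypothetical; the original analysis was done on Snowplow data in Redshift and Databricks, and its exact method is not shown in the slides.

```python
from collections import Counter


def product_stats(events):
    """Return {product_id: (views, add_to_basket_rate)} from (product_id, event_type) pairs."""
    views, adds = Counter(), Counter()
    for product_id, event_type in events:
        if event_type == "view":
            views[product_id] += 1
        elif event_type == "add_to_basket":
            adds[product_id] += 1
    # Only products with at least one view appear, so the division is safe.
    return {pid: (n, adds[pid] / n) for pid, n in views.items()}


def flag_outliers(stats, min_views, low_rate, high_rate):
    """Split products into the two groups of interest; thresholds are arbitrary."""
    viewed_not_added = sorted(
        p for p, (v, r) in stats.items() if v >= min_views and r <= low_rate
    )
    added_not_viewed = sorted(
        p for p, (v, r) in stats.items() if v < min_views and r >= high_rate
    )
    return viewed_not_added, added_not_viewed


# Hypothetical events, for illustration only.
events = (
    [("book-a", "view")] * 10 + [("book-a", "add_to_basket")] * 1
    + [("book-b", "view")] * 2 + [("book-b", "add_to_basket")] * 2
    + [("book-c", "view")] * 10 + [("book-c", "add_to_basket")] * 5
)
stats = product_stats(events)
popular_but_unsold, niche_but_sold = flag_outliers(
    stats, min_views=5, low_rate=0.2, high_rate=0.8
)
print(popular_but_unsold, niche_but_sold)
```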
Use Case 2: Most Frequently Visited Service Pages
• Top 10 of webpages related to service
• The top (detailed) service webpage is ‘abonnement-opzeggen’ (cancel subscription)
• In 75% (57% + 19%) of the sessions that visit this page, the customer continues to the cancellation form.
• In 25% of the sessions the customer uses another form, i.e. the general contact form (instead of or on top of the cancellation form)
• Cancellations reach Sdu in different ways. Are the forms processed similarly?
[Slide panels: Insights, Implications, Information]
                     Cancellation form
                     No     Yes
Contact form   No    19%    57%
               Yes    5%    19%
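A two-by-two breakdown like the one above can be computed from per-session flags. The sketch below is an illustration only: the input shape (one pair of booleans per session) and the sample sessions are assumptions, not the Sdu data.

```python
from collections import Counter


def form_usage_shares(sessions):
    """Return the share of sessions per (used_contact_form, used_cancellation_form) pair."""
    counts = Counter(sessions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        (contact, cancel): counts[(contact, cancel)] / total
        for contact in (False, True)
        for cancel in (False, True)
    }


# Hypothetical sessions that visited the cancel-subscription page.
# Each tuple is (used_contact_form, used_cancellation_form).
sessions = (
    [(False, False)] * 4
    + [(False, True)] * 11
    + [(True, False)] * 1
    + [(True, True)] * 4
)
shares = form_usage_shares(sessions)

# Column and row totals: sessions reaching each form, alone or combined.
cancellation_share = shares[(False, True)] + shares[(True, True)]
contact_share = shares[(True, False)] + shares[(True, True)]
print(shares)
print(cancellation_share, contact_share)
```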
Use Case 3: Search Pages
• 6 distinct clusters, of which ‘zoekers’ (searchers) is a small group with relatively high revenue
• What can we do to leverage the relatively large group of visitors with no revenue who visit predominantly in the evening? Are these private individuals visiting our site?
• Hypothesis: the searchers have a need for a specific product. Further research and A/B testing are advised, specifically on search.
[Slide panels: Insights, Implications, Information]
How are we organized for Snowplow?
Sdu
Marketing & Sales
Marketing Intelligence
- Analyses using SQL (Redshift) and R and Python (Databricks)
E-commerce
- Google Tag Manager implementation
IT
Architecture and infrastructure
- Alignment with current and future business architecture
- Technical support
Business Analyst
- Translating business needs into technical design
Which are the next steps for Sdu?
Technical next steps
• Duplicates: create a script to deduplicate current and future records.
• Implement a server-side tracker as a solution to prevent missing web shop transactions.
• Assess a low-cost alternative to the use of the Redshift database (AWS) for the long term.
• Structural solution for securing the Redshift database (whitelisting the IP address of the Databricks cluster)
Supporting lean startup
• Determining KPIs
• Measuring product use
• Analysing data and determining the next action
Learning
• Answer business questions on customer behaviour
• Answer questions not asked
• Tracking product use
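The deduplication step among the technical next steps could look roughly like the sketch below. It assumes each record carries an `event_id` and a `collector_tstamp` (both are columns in Snowplow's event table) and that timestamps share one sortable string format. Keeping the earliest copy is a design choice made here for illustration, not the script Sdu planned.

```python
def deduplicate_events(events):
    """Keep one row per event_id, preferring the earliest collector_tstamp."""
    kept = {}
    for event in events:
        event_id = event["event_id"]
        current = kept.get(event_id)
        if current is None or event["collector_tstamp"] < current["collector_tstamp"]:
            kept[event_id] = event
    return list(kept.values())


# Hypothetical rows, for illustration only.
events = [
    {"event_id": "e1", "collector_tstamp": "2016-05-01 10:00:03"},
    {"event_id": "e2", "collector_tstamp": "2016-05-01 10:00:05"},
    {"event_id": "e1", "collector_tstamp": "2016-05-01 10:00:00"},
]
unique_events = deduplicate_events(events)
print(unique_events)
```

The same function can be run on a historical export and on each new batch, which covers both current and future records.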
Key take-aways and recommendations
• Involve senior management from the start
• A POC of 6 weeks is realistic
• Share quick wins / successes for acceptance of the project