H2O World - What you need before doing predictive analysis - Keen.io
1. HOW TO SET UP FOR SUCCESS
WITH PREDICTIVE ANALYTICS
https://keen.io
@keen_io
2. WHO WE ARE
Maria Dumanis - @zelashoe
Data Modeling Architect, Keen IO
Helps enterprise customers model data
that enables them to answer
specific questions
Peter Nachbaur - @peternachbaur
Analytics Platform Architect, Keen IO
Early Vikeen, helped design and build
the platform.
3. KEEN
: having or showing an ability to think clearly and to
understand what is not obvious or simple about
something. (ref: Merriam-Webster)
4. WHAT IS KEEN IO?
[Diagram: smart devices, mobile apps, websites, teams, customers, anywhere; cloud database + analytics APIs]
events: page loaded, ad viewed, link clicked, purchase completed, article shared, error returned, video played
insights: count, sum, min, max, average, median, percentile, funnel, extraction, streaming
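To make a few of the listed analyses concrete, here is a minimal sketch in plain Python. It is not Keen IO's API. The event fields (`type`, `user`, `amount`) and the helper names `summarize` and `funnel` are assumptions made for illustration, and the funnel ignores event ordering in time.

```python
from statistics import mean, median

# Hypothetical events; the field names are illustrative only.
events = [
    {"type": "purchase completed", "user": "a", "amount": 20.0},
    {"type": "purchase completed", "user": "b", "amount": 35.0},
    {"type": "purchase completed", "user": "a", "amount": 5.0},
    {"type": "page loaded", "user": "c"},
]


def summarize(events, event_type, field):
    """Count, sum, min, max, average and median of `field` for one event type."""
    values = [e[field] for e in events if e["type"] == event_type and field in e]
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "average": mean(values),
        "median": median(values),
    }


def funnel(events, steps):
    """Number of distinct users who performed every step up to each position."""
    remaining = None
    counts = []
    for step in steps:
        users = {e["user"] for e in events if e["type"] == step}
        remaining = users if remaining is None else remaining & users
        counts.append(len(remaining))
    return counts


summary = summarize(events, "purchase completed", "amount")
steps_reached = funnel(events, ["page loaded", "purchase completed"])
```

A hosted analytics API would run these aggregations server-side over a timeframe; the sketch only shows what each named analysis computes.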
6. Understand How
How will you acquire and collect the right data?
How will you analyze and transform it to be used with predictive models?
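One common way to transform event data for predictive models is to aggregate raw events into one numeric row per user. The sketch below assumes hypothetical fields (`user_id`, `type`, `amount`) and a hypothetical helper `build_features`; the choice of features is illustrative, not something the talk prescribes.

```python
from collections import defaultdict


def build_features(events):
    """Aggregate raw events into one numeric feature row per user."""
    rows = defaultdict(
        lambda: {"page_loads": 0, "purchases": 0, "total_spent": 0.0}
    )
    for e in events:
        row = rows[e["user_id"]]
        if e["type"] == "page loaded":
            row["page_loads"] += 1
        elif e["type"] == "purchase completed":
            row["purchases"] += 1
            row["total_spent"] += e.get("amount", 0.0)
    return dict(rows)


# Hypothetical raw events.
raw_events = [
    {"user_id": "u1", "type": "page loaded"},
    {"user_id": "u1", "type": "purchase completed", "amount": 12.5},
    {"user_id": "u2", "type": "page loaded"},
    {"user_id": "u1", "type": "page loaded"},
]

features = build_features(raw_events)
```

Each resulting row can then be handed to a modeling tool as one training example.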
7. Creating a Team
Champion - Have someone on the team who understands why you are
doing this work and who can clear the way for you to do it
Stakeholders - Know who your stakeholders are so that they can say
what information they need to make informed decisions
Diverse Team - Bring together varied domain expertise so that the
many components involved in analytics work with each other
8. Performance is Important
Know what data can be collected
Architect the model to answer your questions
Optimize query performance
9. Data Modeling
Tracking Signups
• User IDs, Geolocation, User Referral, Acquisition Cost, Cohort
Tracking Page Shares
• Cookie IDs, Shared URL, Topic Tag
Video Plays
• Duration watched, Time of Day
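The three event types above could be modeled as follows. This is a sketch under assumptions: the property names are derived from the slide, while the `timestamp` field, the data types, the `now_iso` helper, and the sample values are hypothetical additions.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class SignupEvent:
    user_id: str
    geolocation: str
    user_referral: str
    acquisition_cost: float
    cohort: str
    timestamp: str  # assumed field: when the event happened


@dataclass
class PageShareEvent:
    cookie_id: str
    shared_url: str
    topic_tag: str
    timestamp: str


@dataclass
class VideoPlayEvent:
    duration_watched_seconds: float
    time_of_day: str
    timestamp: str


def now_iso():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


# Hypothetical values, for illustration only.
signup = SignupEvent("u1", "US-CA", "newsletter", 4.25, "2015-W45", now_iso())
payload = asdict(signup)  # plain dict, ready to serialize as JSON
```

Keeping each event type in its own flat record makes it straightforward to filter and group by any property later.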
10. Know Your Tools
Keen is great at modeling event data over time
Relational DBs are great for capturing the present state of
entities
Hadoop is great for joining and crunching Keen and Relational
DBs together
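The division of labor above can be illustrated with a small in-memory join. Plain Python stands in here for what a Hadoop job would do at scale. The tables, field names, and the `plays_by_plan` helper are hypothetical.

```python
from collections import Counter

# Present state of entities, as a relational table might hold it.
users = {
    "u1": {"plan": "pro"},
    "u2": {"plan": "free"},
}

# Events over time, as an event store might hold them.
play_events = [
    {"user_id": "u1", "type": "video played"},
    {"user_id": "u2", "type": "video played"},
    {"user_id": "u1", "type": "video played"},
    {"user_id": "u3", "type": "video played"},  # no matching user row
]


def plays_by_plan(events, users):
    """Join each event to its user's current plan and count events per plan."""
    counts = Counter()
    for e in events:
        user = users.get(e["user_id"])
        plan = user["plan"] if user else "unknown"
        counts[plan] += 1
    return dict(counts)


result = plays_by_plan(play_events, users)
```

Events with no matching entity are kept under "unknown" rather than dropped, so gaps between the two sources stay visible.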
11. Collection Challenges
Deciding which analytics tool/API to use
Combining data from various data sources
Maintaining data integrity and minimizing duplication
Sending data as soon as it’s available
Understanding data privacy and who is allowed to access it
Enriching data if needed/possible
Scaling to accommodate business/product growth
Enabling integration with previously collected data
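Two of these challenges, minimizing duplication and enriching data, can be sketched as a small collector. Everything here is an assumption made for illustration: the `Collector` class, the `event_id` and `url` fields, and the choice of a derived `domain` property as the enrichment.

```python
import uuid
from urllib.parse import urlparse


class Collector:
    """Drops duplicate events by id and enriches the rest before storing."""

    def __init__(self):
        self.seen_ids = set()
        self.stored = []

    def collect(self, event):
        """Store the event unless its id was already seen. Returns True if stored."""
        event = dict(event)
        # Assign an id when the sender did not provide one.
        event.setdefault("event_id", str(uuid.uuid4()))
        if event["event_id"] in self.seen_ids:
            return False
        self.seen_ids.add(event["event_id"])
        self.stored.append(self.enrich(event))
        return True

    @staticmethod
    def enrich(event):
        """Add a derived `domain` property when the event carries a URL."""
        enriched = dict(event)
        if "url" in enriched:
            enriched["domain"] = urlparse(enriched["url"]).netloc
        return enriched


collector = Collector()
click = {"event_id": "e1", "type": "link clicked", "url": "https://keen.io/docs"}
first = collector.collect(click)
second = collector.collect(click)  # a retry of the same event
```

Deduplicating on a sender-supplied id lets clients retry safely when sending data as soon as it is available.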