2. 5/22/2013Dataiku 2
Collocation
Big Apple
Big Mama
Big Data
Games Analytics
Current Life:
CEO, Dataiku
Tweet about this
@dataiku
@capital_games
Past Life:
Criteo
IsCool Entertainment
Exalead
Hello, My Name is
Florian Douetteau
Available on:
http://www.slideshare.net/Dataiku
3. The Stakes - Summary
Million Events
Billion $
Billion Events
Million $
Classic Business
Social Gaming
4. Meet Hal Alowne
Big Guys
• 100M$+ Revenue
• 10M+ games
• 10+ Data Scientists
Hal Alowne
BI Manager
Dim’s Private Showroom
“Hey Hal! We need
a big data platform
like the big guys.
Let’s just do as they do!”
European Online Game Leader
• 10M$ Revenue
• 1 Million monthly games
• 1 Data Analyst (Hal Himself)
Wave Pox
CEO & Founder
W’ave G’ames
Big Data
Copy Cat
Project
5. MERIT = TIME + ROI
Targeted
Newsletter
For Newcomers
Facebook
Campaign
Optimization
Adapted Product
/ Promotions
TIME : 6 MONTHS ROI : APPS
Build a lab in 6 months
(rather than 18 months)
Find the right
people
(6 months?)
Choose the
technology
(6 months?)
Make it work
(6 months?)
Build the lab
(6 months)
Deploy apps
that actually deliver value
2013 2014
2013
• Train People
• Reuse working patterns
7. Our Goal
It’s utterly complex and
unreasonable
Our Goal:
Change his perspective
on data science projects
(sorry, we couldn’t
find a picture of Hal
Smiling)
8. Do the Basics
Understand Analytics
What to expect out of analytics
Quick Agenda
10. Do you track ?
◦ Customer Goals For
most important
features
◦ Time Spent
Level Progression
Money Spent
◦ Campaigns and
generated campaign
Value
Suggestion #1
Check The Basics
11. Do A/B Tests
◦ Use Proven Solutions
◦ Start small (button size
and color)
◦ Check Impacts
◦ Treat new and existing
users differently
◦ Don’t give up after the
first A/B Test
Suggestion #2
DO A/B Tests (and not yourself)
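To "check impacts" of a test, one common approach is a two-proportion z-test on the click counts of both variants. The sketch below is only an illustration: the function name and the counts are hypothetical, and the slides do not prescribe a specific statistical test.

```python
import math

def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (relative_lift, z, two_sided_p_value) for variant B versus A."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    lift = (rate_b - rate_a) / rate_a
    return lift, z, p_value

# Hypothetical counts: variant A versus variant B of the same button.
lift, z, p_value = two_proportion_z_test(200, 10_000, 242, 10_000)
print(f"lift={lift:.1%} z={z:.2f} p={p_value:.3f}")
```

A small p-value suggests the difference is unlikely to be noise; with small samples, keep the test running rather than giving up after the first result.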
12. Register Now / Give
Email Graphics:
From 25% to 2X More
Clicks
http://bit.ly/VOruXt
Changing button
from green to red:
Up to 21%
http://bit.ly/qFEBdK
Some Results
A/B Tests
14. Can be Built on top
of your production
systems
Do you have
◦ Cohorts
◦ Daily $$ Reports
◦ Basic $$ Segments
Suggestion #3
Have the Basic BI
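A cohort report can be computed directly from install and payment records. The following is a minimal sketch, assuming monthly cohorts keyed by install month; the record layout and the function name are hypothetical.

```python
from collections import defaultdict
from datetime import date

def cohort_revenue(installs, payments):
    """Sum revenue per (install month, months since install).

    installs: dict user_id -> install date
    payments: iterable of (user_id, payment date, amount)
    """
    table = defaultdict(float)
    for user_id, paid_on, amount in payments:
        installed = installs[user_id]
        cohort = (installed.year, installed.month)
        # Age of the payment in whole calendar months since the install month.
        age = (paid_on.year - installed.year) * 12 + (paid_on.month - installed.month)
        table[(cohort, age)] += amount
    return dict(table)

# Hypothetical sample data.
installs = {"u1": date(2013, 1, 5), "u2": date(2013, 1, 20), "u3": date(2013, 2, 2)}
payments = [
    ("u1", date(2013, 1, 6), 5.0),
    ("u1", date(2013, 3, 1), 2.0),
    ("u2", date(2013, 2, 10), 3.0),
    ("u3", date(2013, 2, 3), 4.0),
]
report = cohort_revenue(installs, payments)
```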
15. Defined Customer Segments
◦ New Installs
◦ Engaged Users
◦ Engaged Paying Users
◦ …?
Defined Customer Sources
◦ Social Ads / Social Posts / .. Top Charts
/ …
◦ Country Segments
Do you have, for each segment, every
day
◦ Rolling last 30 days ARPU ?
◦ Rolling last 30 days DAU ?
Do you follow every week
◦ The Segment Conversion Rate per
source ?
Sample Check list
(Gaming)
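The rolling 30-day figures in this checklist can be sketched as below. This assumes revenue per user is defined as revenue in the window divided by distinct active users in the window; the event layout and segment names are hypothetical.

```python
from datetime import date, timedelta

def rolling_arpu(events, segment, as_of, window_days=30):
    """Revenue per distinct active user for one segment over the trailing window.

    events: iterable of (day, user_id, segment, revenue) rows, one per user-day.
    Returns (revenue_per_user, distinct_active_users).
    """
    start = as_of - timedelta(days=window_days - 1)
    users = set()
    revenue = 0.0
    for day, user_id, seg, amount in events:
        if seg == segment and start <= day <= as_of:
            users.add(user_id)
            revenue += amount
    active = len(users)
    return (revenue / active if active else 0.0), active

# Hypothetical sample data.
events = [
    (date(2013, 5, 1), "u1", "engaged_paying", 4.0),
    (date(2013, 5, 10), "u2", "engaged_paying", 0.0),
    (date(2013, 5, 20), "u1", "engaged_paying", 2.0),
    (date(2013, 3, 1), "u3", "engaged_paying", 9.0),   # outside the window
    (date(2013, 5, 15), "u4", "new_installs", 0.0),    # other segment
]
arpu, active_users = rolling_arpu(events, "engaged_paying", date(2013, 5, 22))
```

Running this once per segment and per day gives the daily series the checklist asks for.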
17. Product Success
driven by Quality
Margin / Customer
Value / Traffic /
Acquisition
At the Beginning
18. Margin for new
customers might
decline …
Margin for new
features might
decline …
Is your business
really scalable ?
But when you continue growing
19. Existing Customers
Existing Product Assets
Existing Specific
Business Model
And your KNOWLEDGE
of it
Where is your core business
advantage ?
20. Data Driven Business
What is your value ?
Number of
Customers
Customer Knowledge
Increase over time with:
- Time spent in your app
- User relationship (network effect)
- Partner / Other Apps Interactions
Your Value
21. To Apply It ?
Product Optimization
Customer Acquisition
Optimization
Recommender/
Targeting for
newsletters
22. Dark Side
◦ Technology
Bright Side
◦ Business
Apply It !!
24. Technology is complex
Hadoop
Ceph
Sphere
Cassandra
Spark
Scikit-Learn
Mahout
WEKA
MLBase
RapidMiner
Pandas
D3
Crossfilter
InfiniDB
LucidDB
Impala
Elastic Search
SOLR
MongoDB
Riak
Membase
Pig
Hive
Cascading
Talend
Machine Learning
Mystery Land
Scalability Central
NoSQL-Slavia
SQL Columnar Republic
Visualization County
Data Clean Wasteland
Statistician Old
House
R
25. Machine learning is complex
Find People that understand machine learning
and all this stuff
Try to understand
myself
26. Plumbing is not complex
(but difficult)
Implicit User Data
(Views, Searches…)
Content Data
(Title, Categories, Price, …)
Explicit User Data
(Click, Buy, …)
User Information
(Location, Graph…)
500TB
50TB
1TB
200GB
Transformation
Matrix
Transformation
Predictor
Per User Stats
Per Content Stats
User Similarity
Rank Predictor
Content Similarity
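The last stages of this pipeline (content similarity feeding a rank predictor) can be illustrated with a small item-to-item cosine similarity. This is only a sketch under stated assumptions: the slide does not say which similarity measure or ranking rule is used, and all names and data below are hypothetical.

```python
import math
from collections import defaultdict

def content_similarity(interactions):
    """Cosine similarity between items from (user_id, item_id, weight) rows."""
    vectors = defaultdict(dict)          # item_id -> {user_id: weight}
    for user_id, item_id, weight in interactions:
        vectors[item_id][user_id] = vectors[item_id].get(user_id, 0.0) + weight
    norms = {item: math.sqrt(sum(w * w for w in vec.values()))
             for item, vec in vectors.items()}
    sims = {}
    items = sorted(vectors)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            dot = sum(w * vectors[b].get(u, 0.0) for u, w in vectors[a].items())
            if dot:
                sims[(a, b)] = dot / (norms[a] * norms[b])
    return sims

def rank_for_user(user_items, sims):
    """Order unseen items by summed similarity to items the user already has."""
    scores = defaultdict(float)
    for (a, b), s in sims.items():
        if a in user_items and b not in user_items:
            scores[b] += s
        elif b in user_items and a not in user_items:
            scores[a] += s
    return sorted(scores, key=lambda item: (-scores[item], item))

# Hypothetical explicit user data (user, item, weight).
interactions = [
    ("u1", "sword", 1), ("u1", "shield", 1),
    ("u2", "sword", 1), ("u2", "shield", 1), ("u2", "potion", 1),
    ("u3", "potion", 1),
]
sims = content_similarity(interactions)
ranking = rank_for_user({"sword"}, sims)
```

At the volumes shown on the slide, the pairwise loop would have to be replaced by a distributed computation; the logic stays the same.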
28. People Microsoft Excel
How did you build your great
product ?
29. Data Team Data Tools
How will you continue growing your
great product(s) ?
The Business Guy
who knows maths
The Crazy Analyst
that reveals patterns
The Coding Guy That
is enthusiastic
30. data lab, (n. m): a small group
with all the expertise, including
business minded people,
machine learning knowledge and
the right technology
A proven organization used by
successful data-driven
companies over the past few
years (eBay, LinkedIn, Walmart…)
TEAM + TOOLS = LAB
31. It’s just a new team
Team | Short Term Focus | Long Term Drive
Business People | Optimize margin, … | Create new business revenue streams
Marketing People | Optimize click ratio | Brand awareness and impact
IT People | Make IT work | Clean and efficient architecture
Data People | Get stats right, make predictions | Create data-driven features
33. You can’t
« design »
insights, you
explore and
discover them…
Iterate quickly
with constant
feedback
Try a lot, don’t
be afraid to fail!
Free
but not as “free beer”
Function
Form
Experience
Emotion
Surprise
Culture
Explore
and Refine
Experiment
Generate
Ideas
Select &
Develop
Enhance
or
Discard
Gather
Feedback
36. Classic Columnar Architecture
Lots of data Some Place To
Pour It In
Some Tool To
Do Some Maths And Graphs
Web Tracking Logs
Raw Server Logs
Order / Product / Customer
Facebook Info
Open Data (Weather, Currency …)
37. The Corinthian Architecture
Lots of data
Some Place
To Perform
Rapid Calculations
Some Tools To
Do Some Maths
And Charts
Some Place To
Pour It In And
Clean / Prepare It
38. The Corinthian Architecture
Statistics
Cohorts
Regressions
Bar Charts For Marketing
Nice Infographics for your Company Board
39. The Corinthian Architecture
Lots of data
Some Database
To Perform
Rapid Calculations
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts
Some Place To
Pour It In And
Clean / Prepare It
40. The “one database won’t
do it all” problem
JOIN / Aggregate
Rapid Group By Computations
Direct Access to the Computed Results
for Production, etc.
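The JOIN / Group By workload named here looks like the query below. SQLite is used purely as a stand-in so the example runs anywhere; the slides do not name a database for this role, and the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id TEXT PRIMARY KEY, country TEXT);
    CREATE TABLE orders (customer_id TEXT, amount REAL);
    INSERT INTO customers VALUES ('c1', 'FR'), ('c2', 'FR'), ('c3', 'DE');
    INSERT INTO orders VALUES ('c1', 5.0), ('c1', 2.5), ('c2', 1.0), ('c3', 4.0);
""")

# Join orders to customers, then aggregate buyers and revenue per country.
rows = conn.execute("""
    SELECT c.country,
           COUNT(DISTINCT o.customer_id) AS buyers,
           SUM(o.amount) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.country
    ORDER BY c.country
""").fetchall()
conn.close()
print(rows)
```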
41. The Roman Social Forum
Lots of data
Some Database
To Perform
Rapid Calculations
And some database
for graphs
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts
Some Place To
Pour It In And
Clean / Prepare It
42. The Key Value Store
Lots of data
Some Database
To Perform
Rapid Calculations
And some database
for graphs And
Some Distributed Key
Value Store
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts
Some Place To
Pour It In And
Clean / Prepare It
43. Action requires Prediction
Draw A Line For the future
What are my real users groups ?
Should I launch a discount offering or not ?
To everybody or to specific users only ?
44. The Medieval Fairy Land
Lots of data
Some Tools To
Do Some Maths
Some Other
To Do Some
Charts and some
MACHINE LEARNING
Some Place To
Pour It In And
Clean / Prepare It
Some Database
To Perform
Rapid Calculations
And some database
for graphs And
Some Distributed Key
Value Store
46. Launch A Marketing
campaign
After a few days
PREDICT based on
behaviours
◦ Total ARPU for users
after 3 months
◦ Efficiency of a campaign
◦ Continue or not ?
Example
Marketing Campaign Prediction
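One simple way to make such a prediction is a least-squares line fitted on past campaigns, mapping early ARPU to 3-month ARPU. This is a sketch only: the slides do not state which model is used, and the history values, the 7-day horizon and the cost threshold below are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical past campaigns: (ARPU after 7 days, ARPU after 3 months).
history = [(0.10, 0.45), (0.20, 0.85), (0.30, 1.25), (0.40, 1.65)]
slope, intercept = fit_line([h[0] for h in history], [h[1] for h in history])

def should_continue(early_arpu, cost_per_user):
    """Continue only if the predicted 3-month ARPU covers the acquisition cost."""
    predicted = slope * early_arpu + intercept
    return predicted, predicted > cost_per_user

predicted, keep_going = should_continue(0.25, 0.90)
```

In practice the predictor would use richer behavioural features than a single early ARPU figure, but the continue-or-stop decision keeps this shape.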
47. A very large community
Some mid-size
communities
Lots of small clusters
(mostly 2 players)
Correlation
◦ between community size
and engagement / virality
Meaningful patterns
◦ 2 players patterns
◦ Family play
◦ Group Play
◦ Open Play (language
community)
Example
Social Gaming Communities
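A first look at community sizes can be obtained from connected components of the "played together" graph. Connected components are a simplification chosen here for illustration, not necessarily the community detection method behind the slide; the player pairs are hypothetical.

```python
from collections import Counter

def community_sizes(edges):
    """Return connected-component sizes, largest first, from 'played together' pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    sizes = Counter(find(player) for player in list(parent))
    return sorted(sizes.values(), reverse=True)

# Hypothetical pairs of players who played a game together.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f"), ("g", "h")]
sizes = community_sizes(edges)
```

The resulting size per player can then be joined with engagement metrics to examine the correlation mentioned above.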
48. Two-Way Clustering
◦ Assess customer behaviours
◦ Assess item equivalence classes
Modeling + Simulation
◦ Evaluate free items / items bought
ratio per item kind
◦ Simulate future rules
◦ Evaluate price sensitivity
Enhance customer buy
recurrence
Example
Freemium Model Optimization
Business
Model
User
Profiling
Simulation
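The free-to-bought ratio per item kind mentioned in this example is a simple aggregation. The sketch below assumes each transaction records an item kind and the price paid, with a price of 0 meaning the item was obtained for free; the layout and the data are hypothetical.

```python
from collections import defaultdict

def free_to_bought_ratio(transactions):
    """Ratio of free to bought items per item kind.

    transactions: iterable of (item_kind, price_paid) rows; price 0 means free.
    """
    free = defaultdict(int)
    bought = defaultdict(int)
    for kind, price in transactions:
        if price > 0:
            bought[kind] += 1
        else:
            free[kind] += 1
    kinds = set(free) | set(bought)
    # None marks kinds that were never bought, to avoid dividing by zero.
    return {k: (free[k] / bought[k] if bought[k] else None) for k in kinds}

# Hypothetical transactions.
transactions = [
    ("boost", 0), ("boost", 0), ("boost", 0.99),
    ("skin", 1.99), ("skin", 0),
    ("life", 0),
]
ratios = free_to_bought_ratio(transactions)
```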