1. Hal’s Headache
Data Tuesday
02/25/2013
Florian Douetteau
2. Meet Hal Alowne
Dim Sum
CEO & Founder
Dim’s Private Showroom
“Hey Hal! We need a big data platform like the big guys.”
Hal Alowne
BI Manager
Dim’s Private Showroom
Let’s just do as they do!
Dim’s Private Showroom: European E-commerce Web site
• 100M$ Revenue
• 1 Million customers
• 1 Data Analyst (Hal Himself)
Big Guys
• 10B$+ Revenue
• 100M+ customers
• 100+ Data Scientists
Big Data Copy Cat Project
Dataiku - Data Tuesday 3/8/2013 2
3. CHOOSE TECHNOLOGY
[Map of the technology landscape]
Regions: NoSQL-Slavia, Scalability Central, Machine Learning Mystery Land, SQL Columnar Republic, Statistician Old House, Visualization County, Data Clean Wasteland
Technologies: Elastic Search, SOLR, MongoDB, Cassandra, Riak, Membase, Hadoop, Ceph, Sphere, Spark, Scikit-Learn, Mahout, WEKA, MLBase, RapidMiner, InfiniDB, LucidDB, Impala, Hive, Pig, Cascading, R, Pandas, D3, Crossfilter, Talend
4. LEARN MACHINE LEARNING STUFF
Try to understand myself
Find people that understand machine learning and all this stuff
5. DO IT
Data sources: Open Data (Megabytes), CRM (Gigabytes), Web Logs (Terabytes)
Tools: Storm, Hadoop, R, Elastic Search, SQL, D3
Connect things together
Pour Data in
Clean Data
Fix the leaks
Start again
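The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not something shown in the talk: the record fields (`customer_id`, `email`), the helper function names and the sample rows are all hypothetical, and a real data lab would sit on the tools listed above rather than on plain lists.

```python
# Hypothetical sketch of the "pour, clean, fix the leaks, start again" loop.
# Field names, helper names and sample records are invented for illustration.

def pour_data_in(*sources):
    """Connect things together: merge records from every source into one list."""
    merged = []
    for source in sources:
        merged.extend(source)
    return merged


def clean(records):
    """Clean data: normalise emails and drop records without a customer_id."""
    cleaned = []
    for record in records:
        if record.get("customer_id") is None:
            continue
        fixed = dict(record)
        fixed["email"] = (fixed.get("email") or "").strip().lower()
        cleaned.append(fixed)
    return cleaned


def find_leaks(records):
    """Fix the leaks, step 1: report customer_ids that appear more than once."""
    seen, leaks = set(), set()
    for record in records:
        cid = record["customer_id"]
        if cid in seen:
            leaks.add(cid)
        seen.add(cid)
    return leaks


def deduplicate(records):
    """Fix the leaks, step 2: keep the first record seen for each customer_id."""
    unique = {}
    for record in records:
        unique.setdefault(record["customer_id"], record)
    return list(unique.values())


def run_pipeline(*sources, max_rounds=3):
    """Start again: repeat clean-and-check until no leaks remain."""
    records = pour_data_in(*sources)
    for _ in range(max_rounds):
        records = clean(records)
        if not find_leaks(records):
            break
        records = deduplicate(records)
    return records


crm = [{"customer_id": 1, "email": " Hal@Example.com "}]
web_logs = [
    {"customer_id": 1, "email": "hal@example.com"},
    {"customer_id": None, "email": "bot@example.com"},
    {"customer_id": 2, "email": "DIM@example.com"},
]
result = run_pipeline(crm, web_logs)
```

Each pass either finds the data clean and stops, or repairs one kind of problem and goes around again, which is the "start again" step in miniature.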
6. MERIT = TIME + ROI
TIME : 6 MONTHS
2013 to 2014:
• Find the right people (6 months?)
• Choose the technology (6 months?)
• Make it work (6 months?)
2013:
Build the lab (6 months)
• Train People
• Reuse working patterns
Build a lab in 6 months (rather than 18 months)
ROI : APPS
• Targeted Newsletter
• Recommender System
• Dynamic Pricing
Deploy apps that actually deliver value
7. Dataiku
One Goal
“Help you build your data lab in less than six months”
One platform with an open source core
• Flow: Manage datasets and transformations
• Impact: Export Predictions
• Feedback: Continuous Loopback
• Doctor: Diagnose Data
• Shaker: Prepare Data
• D1: all-in-one data scientists distribution
One fake customer
A few real ones
Data Is Money