SlideShare une entreprise Scribd logo
1  sur  33
Télécharger pour lire hors ligne
BIG DATA

IN HYBRID WORLDS
The Story of M
I’m Florian
CEO of Dataiku
maker Data	
  Science	
  Studio,

the « Photoshop for Data Science »
COMMUNITY	
  EDITION	
  (it’s	
  FREE)	
  	
  
http://www.dataiku.com/dss/trynow/
H i !
React on twitter
@fdouetteau
#BigDataParis
B i g o r S m a l l
Startup Big Firm
H O W D O P E O P L E TA K E D E C I S I O N S
B U Y I N G D E C I S I O N S
Should I
buy it ?
S O C I A L D E C I S I O N S
Should I
talk to him ?
M
LIKE MEETING
B u s i n e s s D e c i s i o n s
B u s i n e s s I n t e l l i g e n c e
B u s i n e s s I n t e l l i g e n c e
IN 2001
man (actually Gartner)
invented
big data
Volume Variety Velocity
WHAT IF THE META GROUP
HAD CHOSEN ANOTHER LETTER?
Capacity Complexity Celerity
Size Serendipity Speed
Big Blur Blazing
Or Combine
Com….. Bu.. Sh..
BIG DATA RELIGION ?
M
LIKE METRICS
M L I K E M E T R I C S
How much does it cost to
produce and maintain a
metric ?
How many metrics do I need ?
Do I Follow the right metrics ?
Do I Have enough data ?
Do I Have enough Data?
• Self-Service

Build your own metrics
• Analytical Capabilities

Find your patterns

• Large Volume

Store it all
M o r e M e t r i c s M e a n s M o r e M e a n s
DATA
MINING
M o r e M e t r i c s M e a n s M o r e A p p l i c a t i o n
Mission
Critical
Small
Structured
Large
Diverse
Sheer
Curiosity
Reporting
for Finance
in Any Industry
Analyze
Each Tweet
Web Navigation

For E-Merchant
Ticket Data
For Discounts
in Retail
Phone Call
Logs for Security
RTB Data
For Advertising
Customer Consumption
For Anti-Churn
in Utilities
CLASSIC BI
LARGE
PRODUCTION
PLATFORM
DATA
EXPLORATION
Optimization
Filings
For Fraud
in Insurance
D
DATA
MINING
TO DAY E A C H O W N A S I T S S TO R E
Mission
Critical
Small
Structured
Large
Diverse
Sheer
Curiosity
CLASSIC BI
LARGE
PRODUCTION
PLATFORM
DATA
EXPLORATION
Optimization
DATA
WAREHOUSING
DATA MINING
REPOSITORIES DATA LAKE
GOOGLE LIKE
PLATFORM
i t ’s n o t j u s t a b o u t t h e m e t r i c s
DATA D R I V E N B U S I N E S S
P r o b l e m i s t h e h u m a n
Cannot take decisions in seconds
Limited sight (100 rows)
Limited short term memory (10k rows)?
M
LIKE MACHINE
R i s e o f A I
1997 Deep Blue 2011 Watson’s Jeopardy
2012 Google Cat
2005 Autonomous
Vehicule
1974 - 1993
AI Winters
www.dataiku.com
Churn
Volume Forecast
RecommenderSegmentation LifetimeValue
Risk Score Hot Location
Pricing Ranking FraudEvent Paths
APPLICATIONS OF
MACHINE LEARNING TO
BUSINESS PROBLEMS
P R E D I C T I V E M A I N CO N F O R T Z O N E
Mission
Critical
Small
Structured
Large
Diverse
Sheer
Curiosity
Reporting
for Finance
in Any Industry
Analyze
Each Tweet
Web Navigation

For E-Merchant
Ticket Data
For Discounts
in Retail
Phone Call
Logs for Security
RTB Data
For Advertising
Customer Consumption
For Anti-Churn
in Utilities
Optimization
Filings
For Fraud
in Insurance
Not Enough
Data To Learn
From ?
Not Enough
“Hard" Examples
So that you can learn
Welcome to Technoslavia
Hadoop
Ceph
	 Sphere
Cassandra
Kafka Flume
Spark
	
Scikit-Learn GraphLAB
prediction.io jubatus
Mahout
	 WEKA
MLBase LibSVM
RapidMiner
	 	
	 Panda
Kibana
InfiniDB Drill
	 Spark SQL
Hive
Impala
…
Elastic Search
SOLR
	 MongoDB
Riak
	 Membase
Pig
Cascading
Talend
Machine Learning
Mystery LandScalability Central
SQL Colunnar Republic
Vizualization County Data Clean Wasteland
Statistician Old
House
R
Real-time island
Storm
NOSQL Nihiland
E m b r a c e M a n y S k i l l s M a n y - S e t s
Data
Plumberer
BI
Manager
Data
Scientist
Data
Waiter
Data
Cleaner
Business
Analyst
REAL
JOB
DREAM
JOB
• Reformulation de la
recherche
• Pas de réponse
• Clic sur un pro
• Top recherche
• Clic de navigation ou filtre
COMMENT AMÉLIORER LA PERTINENCE DE NOS RÉPONSES 

VIA L’ANALYSE DU COMPORTEMENT UTILISATEUR ?
20 M
Analyse &
corrections
automatisation
>10
occurrences1,4M
requêtes
>200M
recherches
✗ ✓
0,5M requêtes
priorisées
"PREDICTIVE CONTENT MANAGEMENT”
FROM PAGES JAUNES
Machine
Gestion Exploration
pagesjaunes.fr
Annuaire
hadoop PIG+Hive
Exportindexation
Moteur
d’interprétation
crawl
Autres
référentiels
Sickit-learn
O p t i m i z i n g L a s t M i l e w i t h
D a t a S c i e n c e S t u d i o
Data Science Studio
Historical delivery
and retrieval data
Modeling of a score
for each delivery
Cleaning and temporal
enrichment of data
Data aggregation by
geographic location
Incorporation of new deliveries
to the existing model
by
E X P LO R E N E W W O R D S
Mission
Critical
Small
Structured
Large
Diverse
Sheer
Curiosity
Optimization
Optimize
Existing
BI Capabilities Build Mandatory
Large Volume Capabilities
EXPLORE POTENTIAL
NOT BEING RELEVANT
DANGER ZONE
Analytics
Predictive
Self Service
Cluster
www.dataiku.com

Contenu connexe

Tendances

Dataiku - data driven nyc - april 2016 - the solitude of the data team m...
Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team m...Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team m...
Dataiku - data driven nyc - april 2016 - the solitude of the data team m...Dataiku
 
How to Build Successful Data Team - Dataiku ?
How to Build Successful Data Team -  Dataiku ? How to Build Successful Data Team -  Dataiku ?
How to Build Successful Data Team - Dataiku ? Dataiku
 
Dataiku r users group v2
Dataiku   r users group v2Dataiku   r users group v2
Dataiku r users group v2Cdiscount
 
Back to Square One: Building a Data Science Team from Scratch
Back to Square One: Building a Data Science Team from ScratchBack to Square One: Building a Data Science Team from Scratch
Back to Square One: Building a Data Science Team from ScratchKlaas Bosteels
 
Dataiku, Pitch Data Innovation Night, Boston, Septembre 16th
Dataiku, Pitch Data Innovation Night, Boston, Septembre 16thDataiku, Pitch Data Innovation Night, Boston, Septembre 16th
Dataiku, Pitch Data Innovation Night, Boston, Septembre 16thDataiku
 
How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku) How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku) Dataiku
 
Dataiku - From Big Data To Machine Learning
Dataiku - From Big Data To Machine LearningDataiku - From Big Data To Machine Learning
Dataiku - From Big Data To Machine LearningDataiku
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products Dataiku
 
PASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureMLPASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureMLJen Stirrup
 
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectHow to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectPAPIs.io
 
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs ConnectMachine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs ConnectPAPIs.io
 
H2O World - Data Science in Action @ 6sense - Viral Bajaria
H2O World - Data Science in Action @ 6sense - Viral BajariaH2O World - Data Science in Action @ 6sense - Viral Bajaria
H2O World - Data Science in Action @ 6sense - Viral BajariaSri Ambati
 
Gartner peer forum sept 2011 orbitz
Gartner peer forum sept 2011   orbitzGartner peer forum sept 2011   orbitz
Gartner peer forum sept 2011 orbitzRaghu Kashyap
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku
 
Benchmarking Digital Readiness: Moving at the Speed of the Market
Benchmarking Digital Readiness: Moving at the Speed of the MarketBenchmarking Digital Readiness: Moving at the Speed of the Market
Benchmarking Digital Readiness: Moving at the Speed of the MarketApigee | Google Cloud
 
Applied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelApplied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelDataiku
 
Knowledge Graphs for a Connected World - AI, Deep & Machine Learning Meetup
Knowledge Graphs for a Connected World - AI, Deep & Machine Learning MeetupKnowledge Graphs for a Connected World - AI, Deep & Machine Learning Meetup
Knowledge Graphs for a Connected World - AI, Deep & Machine Learning MeetupBenjamin Nussbaum
 
Web analyticsandbigdata techweek2011
Web analyticsandbigdata techweek2011Web analyticsandbigdata techweek2011
Web analyticsandbigdata techweek2011Raghu Kashyap
 
Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016Looker
 
Better Insights from Your Master Data - Graph Database LA Meetup
Better Insights from Your Master Data - Graph Database LA MeetupBetter Insights from Your Master Data - Graph Database LA Meetup
Better Insights from Your Master Data - Graph Database LA MeetupBenjamin Nussbaum
 

Tendances (20)

Dataiku - data driven nyc - april 2016 - the solitude of the data team m...
Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team m...Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team m...
Dataiku - data driven nyc - april 2016 - the solitude of the data team m...
 
How to Build Successful Data Team - Dataiku ?
How to Build Successful Data Team -  Dataiku ? How to Build Successful Data Team -  Dataiku ?
How to Build Successful Data Team - Dataiku ?
 
Dataiku r users group v2
Dataiku   r users group v2Dataiku   r users group v2
Dataiku r users group v2
 
Back to Square One: Building a Data Science Team from Scratch
Back to Square One: Building a Data Science Team from ScratchBack to Square One: Building a Data Science Team from Scratch
Back to Square One: Building a Data Science Team from Scratch
 
Dataiku, Pitch Data Innovation Night, Boston, Septembre 16th
Dataiku, Pitch Data Innovation Night, Boston, Septembre 16thDataiku, Pitch Data Innovation Night, Boston, Septembre 16th
Dataiku, Pitch Data Innovation Night, Boston, Septembre 16th
 
How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku) How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
How to Build a Successful Data Team - Florian Douetteau (@Dataiku)
 
Dataiku - From Big Data To Machine Learning
Dataiku - From Big Data To Machine LearningDataiku - From Big Data To Machine Learning
Dataiku - From Big Data To Machine Learning
 
The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products The 3 Key Barriers Keeping Companies from Deploying Data Products
The 3 Key Barriers Keeping Companies from Deploying Data Products
 
PASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureMLPASS Summit Data Storytelling with R Power BI and AzureML
PASS Summit Data Storytelling with R Power BI and AzureML
 
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs ConnectHow to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
How to Build a Successful Data Team - Florian Douetteau @ PAPIs Connect
 
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs ConnectMachine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
Machine Learning Services Benchmark - Inês Almeida @ PAPIs Connect
 
H2O World - Data Science in Action @ 6sense - Viral Bajaria
H2O World - Data Science in Action @ 6sense - Viral BajariaH2O World - Data Science in Action @ 6sense - Viral Bajaria
H2O World - Data Science in Action @ 6sense - Viral Bajaria
 
Gartner peer forum sept 2011 orbitz
Gartner peer forum sept 2011   orbitzGartner peer forum sept 2011   orbitz
Gartner peer forum sept 2011 orbitz
 
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
Dataiku, Pitch at Data-Driven NYC, New York City, September 17th 2013
 
Benchmarking Digital Readiness: Moving at the Speed of the Market
Benchmarking Digital Readiness: Moving at the Speed of the MarketBenchmarking Digital Readiness: Moving at the Speed of the Market
Benchmarking Digital Readiness: Moving at the Speed of the Market
 
Applied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML modelApplied Data Science Course Part 1: Concepts & your first ML model
Applied Data Science Course Part 1: Concepts & your first ML model
 
Knowledge Graphs for a Connected World - AI, Deep & Machine Learning Meetup
Knowledge Graphs for a Connected World - AI, Deep & Machine Learning MeetupKnowledge Graphs for a Connected World - AI, Deep & Machine Learning Meetup
Knowledge Graphs for a Connected World - AI, Deep & Machine Learning Meetup
 
Web analyticsandbigdata techweek2011
Web analyticsandbigdata techweek2011Web analyticsandbigdata techweek2011
Web analyticsandbigdata techweek2011
 
Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016Frank Bien Opening Keynote - Join 2016
Frank Bien Opening Keynote - Join 2016
 
Better Insights from Your Master Data - Graph Database LA Meetup
Better Insights from Your Master Data - Graph Database LA MeetupBetter Insights from Your Master Data - Graph Database LA Meetup
Better Insights from Your Master Data - Graph Database LA Meetup
 

En vedette

OWF 2014 - Take back control of your Web tracking - Dataiku
OWF 2014 - Take back control of your Web tracking - DataikuOWF 2014 - Take back control of your Web tracking - Dataiku
OWF 2014 - Take back control of your Web tracking - DataikuDataiku
 
HCatalog Hadoop Summit 2011
HCatalog Hadoop Summit 2011HCatalog Hadoop Summit 2011
HCatalog Hadoop Summit 2011Hortonworks
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...Dataiku
 
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...Dataiku
 
Real-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studyReal-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studydeep.bi
 
The US Healthcare Industry
The US Healthcare IndustryThe US Healthcare Industry
The US Healthcare IndustryDataiku
 
Before Kaggle : from a business goal to a Machine Learning problem
Before Kaggle : from a business goal to a Machine Learning problem Before Kaggle : from a business goal to a Machine Learning problem
Before Kaggle : from a business goal to a Machine Learning problem Dataiku
 
[WMD 2016] Karen X LLC >> Karen X Cheng "Facebook is completely changing vira...
[WMD 2016] Karen X LLC >> Karen X Cheng "Facebook is completely changing vira...[WMD 2016] Karen X LLC >> Karen X Cheng "Facebook is completely changing vira...
[WMD 2016] Karen X LLC >> Karen X Cheng "Facebook is completely changing vira...500 Startups
 
What Makes Content Memorable?
What Makes Content Memorable?What Makes Content Memorable?
What Makes Content Memorable?Bruce Kasanoff
 
Activate Tech and Media Outlook 2016
Activate Tech and Media Outlook 2016Activate Tech and Media Outlook 2016
Activate Tech and Media Outlook 2016Activate
 
Tips, Tools and Templates To Build Your Content Marketing Strategy
Tips, Tools and Templates To Build Your Content Marketing StrategyTips, Tools and Templates To Build Your Content Marketing Strategy
Tips, Tools and Templates To Build Your Content Marketing StrategyMichael Brenner
 
How To Plan And Build A Successful Content Marketing Strategy
How To Plan And Build A Successful Content Marketing StrategyHow To Plan And Build A Successful Content Marketing Strategy
How To Plan And Build A Successful Content Marketing StrategyMichael Brenner
 
Why Social Media Chat Bots Are the Future of Communication - Deck
Why Social Media Chat Bots Are the Future of Communication - DeckWhy Social Media Chat Bots Are the Future of Communication - Deck
Why Social Media Chat Bots Are the Future of Communication - DeckJan Rezab
 

En vedette (17)

OWF 2014 - Take back control of your Web tracking - Dataiku
OWF 2014 - Take back control of your Web tracking - DataikuOWF 2014 - Take back control of your Web tracking - Dataiku
OWF 2014 - Take back control of your Web tracking - Dataiku
 
HCatalog Hadoop Summit 2011
HCatalog Hadoop Summit 2011HCatalog Hadoop Summit 2011
HCatalog Hadoop Summit 2011
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
 
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
Lambda Architecture - Storm, Trident, SummingBird ... - Architecture and Over...
 
Real-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case studyReal-time big data analytics based on product recommendations case study
Real-time big data analytics based on product recommendations case study
 
JOSA TechTalk: Metadata Management
in Big Data
JOSA TechTalk: Metadata Management
in Big DataJOSA TechTalk: Metadata Management
in Big Data
JOSA TechTalk: Metadata Management
in Big Data
 
The US Healthcare Industry
The US Healthcare IndustryThe US Healthcare Industry
The US Healthcare Industry
 
Before Kaggle : from a business goal to a Machine Learning problem
Before Kaggle : from a business goal to a Machine Learning problem Before Kaggle : from a business goal to a Machine Learning problem
Before Kaggle : from a business goal to a Machine Learning problem
 
[WMD 2016] Karen X LLC >> Karen X Cheng "Facebook is completely changing vira...
[WMD 2016] Karen X LLC >> Karen X Cheng "Facebook is completely changing vira...[WMD 2016] Karen X LLC >> Karen X Cheng "Facebook is completely changing vira...
[WMD 2016] Karen X LLC >> Karen X Cheng "Facebook is completely changing vira...
 
What Makes Content Memorable?
What Makes Content Memorable?What Makes Content Memorable?
What Makes Content Memorable?
 
Activate Tech and Media Outlook 2016
Activate Tech and Media Outlook 2016Activate Tech and Media Outlook 2016
Activate Tech and Media Outlook 2016
 
Tips, Tools and Templates To Build Your Content Marketing Strategy
Tips, Tools and Templates To Build Your Content Marketing StrategyTips, Tools and Templates To Build Your Content Marketing Strategy
Tips, Tools and Templates To Build Your Content Marketing Strategy
 
How To Plan And Build A Successful Content Marketing Strategy
How To Plan And Build A Successful Content Marketing StrategyHow To Plan And Build A Successful Content Marketing Strategy
How To Plan And Build A Successful Content Marketing Strategy
 
How to Choose the Perfect Stock Photo
How to Choose the Perfect Stock PhotoHow to Choose the Perfect Stock Photo
How to Choose the Perfect Stock Photo
 
Why Social Media Chat Bots Are the Future of Communication - Deck
Why Social Media Chat Bots Are the Future of Communication - DeckWhy Social Media Chat Bots Are the Future of Communication - Deck
Why Social Media Chat Bots Are the Future of Communication - Deck
 
Work Rules!
Work Rules!Work Rules!
Work Rules!
 

Similaire à Dataiku - Big data paris 2015 - A Hybrid Platform, a Hybrid Team

Big data, Analytics and 4th Generation Data Warehousing
Big data, Analytics and 4th Generation Data WarehousingBig data, Analytics and 4th Generation Data Warehousing
Big data, Analytics and 4th Generation Data WarehousingMartyn Richard Jones
 
Big Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolBig Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolLaurent Kinet
 
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...Denodo
 
Digital Transformation: Why Public Sector Customers are Moving to the Cloud
Digital Transformation: Why Public Sector Customers are Moving to the CloudDigital Transformation: Why Public Sector Customers are Moving to the Cloud
Digital Transformation: Why Public Sector Customers are Moving to the CloudAmazon Web Services
 
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian PloskerThe Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian PloskerJAX London
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017SingleStore
 
Perfect Memory Semantic Digital Asset Management @ Big Media Paris 2016
Perfect Memory Semantic Digital Asset Management @ Big Media Paris 2016Perfect Memory Semantic Digital Asset Management @ Big Media Paris 2016
Perfect Memory Semantic Digital Asset Management @ Big Media Paris 2016ACTUONDA
 
The Business of Big Data - IA Ventures
The Business of Big Data - IA VenturesThe Business of Big Data - IA Ventures
The Business of Big Data - IA VenturesBen Siscovick
 
Integrate Big Data into Your Organization with Informatica and Perficient
Integrate Big Data into Your Organization with Informatica and PerficientIntegrate Big Data into Your Organization with Informatica and Perficient
Integrate Big Data into Your Organization with Informatica and PerficientPerficient, Inc.
 
Slides from GraphDay Santa Clara
Slides from GraphDay Santa ClaraSlides from GraphDay Santa Clara
Slides from GraphDay Santa ClaraNeo4j
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementationSandip Tipayle Patil
 
DEEP Scott Killoh-3
DEEP Scott Killoh-3DEEP Scott Killoh-3
DEEP Scott Killoh-3jonobermeyer
 
Sebastian Amtage - Beyond Marketing Automation: DMP, CDP, CMP. Who Can Still ...
Sebastian Amtage - Beyond Marketing Automation: DMP, CDP, CMP. Who Can Still ...Sebastian Amtage - Beyond Marketing Automation: DMP, CDP, CMP. Who Can Still ...
Sebastian Amtage - Beyond Marketing Automation: DMP, CDP, CMP. Who Can Still ...Heroes of CRM Conference
 
Telling The Digital Story
Telling The Digital StoryTelling The Digital Story
Telling The Digital StoryIgnitionOne
 
Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Julien SIMON
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data Spain
 

Similaire à Dataiku - Big data paris 2015 - A Hybrid Platform, a Hybrid Team (20)

Big data, Analytics and 4th Generation Data Warehousing
Big data, Analytics and 4th Generation Data WarehousingBig data, Analytics and 4th Generation Data Warehousing
Big data, Analytics and 4th Generation Data Warehousing
 
Saas Wars
Saas WarsSaas Wars
Saas Wars
 
Big Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business SchoolBig Data for the Retail Business I Swan Insights I Solvay Business School
Big Data for the Retail Business I Swan Insights I Solvay Business School
 
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
 
Digital Transformation: Why Public Sector Customers are Moving to the Cloud
Digital Transformation: Why Public Sector Customers are Moving to the CloudDigital Transformation: Why Public Sector Customers are Moving to the Cloud
Digital Transformation: Why Public Sector Customers are Moving to the Cloud
 
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian PloskerThe Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
The Big Data Con: Why Big Data is a Problem, not a Solution - Ian Plosker
 
In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017In-Memory Computing Webcast. Market Predictions 2017
In-Memory Computing Webcast. Market Predictions 2017
 
Perfect Memory Semantic Digital Asset Management @ Big Media Paris 2016
Perfect Memory Semantic Digital Asset Management @ Big Media Paris 2016Perfect Memory Semantic Digital Asset Management @ Big Media Paris 2016
Perfect Memory Semantic Digital Asset Management @ Big Media Paris 2016
 
The Business of Big Data - IA Ventures
The Business of Big Data - IA VenturesThe Business of Big Data - IA Ventures
The Business of Big Data - IA Ventures
 
Integrate Big Data into Your Organization with Informatica and Perficient
Integrate Big Data into Your Organization with Informatica and PerficientIntegrate Big Data into Your Organization with Informatica and Perficient
Integrate Big Data into Your Organization with Informatica and Perficient
 
Slides from GraphDay Santa Clara
Slides from GraphDay Santa ClaraSlides from GraphDay Santa Clara
Slides from GraphDay Santa Clara
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementation
 
DEEP Scott Killoh-3
DEEP Scott Killoh-3DEEP Scott Killoh-3
DEEP Scott Killoh-3
 
Sebastian Amtage - Beyond Marketing Automation: DMP, CDP, CMP. Who Can Still ...
Sebastian Amtage - Beyond Marketing Automation: DMP, CDP, CMP. Who Can Still ...Sebastian Amtage - Beyond Marketing Automation: DMP, CDP, CMP. Who Can Still ...
Sebastian Amtage - Beyond Marketing Automation: DMP, CDP, CMP. Who Can Still ...
 
MapR apache drill
MapR apache drillMapR apache drill
MapR apache drill
 
2014 11 24 big data bmm aformationdigitale2014 v print guy huyberechts
2014 11 24 big data bmm aformationdigitale2014 v print guy huyberechts2014 11 24 big data bmm aformationdigitale2014 v print guy huyberechts
2014 11 24 big data bmm aformationdigitale2014 v print guy huyberechts
 
Telling The Digital Story
Telling The Digital StoryTelling The Digital Story
Telling The Digital Story
 
Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)
 
Rulex big data and analytics
Rulex big data and analyticsRulex big data and analytics
Rulex big data and analytics
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
 

Plus de Dataiku

Applied Data Science Part 3: Getting dirty; data preparation and feature crea...
Applied Data Science Part 3: Getting dirty; data preparation and feature crea...Applied Data Science Part 3: Getting dirty; data preparation and feature crea...
Applied Data Science Part 3: Getting dirty; data preparation and feature crea...Dataiku
 
Applied Data Science Course Part 2: the data science workflow and basic model...
Applied Data Science Course Part 2: the data science workflow and basic model...Applied Data Science Course Part 2: the data science workflow and basic model...
Applied Data Science Course Part 2: the data science workflow and basic model...Dataiku
 
04Juin2015_Symposium_Présentation_Coyote_Dataiku
04Juin2015_Symposium_Présentation_Coyote_Dataiku 04Juin2015_Symposium_Présentation_Coyote_Dataiku
04Juin2015_Symposium_Présentation_Coyote_Dataiku Dataiku
 
Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015
Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015
Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015Dataiku
 
Dataiku big data paris - the rise of the hadoop ecosystem
Dataiku   big data paris - the rise of the hadoop ecosystemDataiku   big data paris - the rise of the hadoop ecosystem
Dataiku big data paris - the rise of the hadoop ecosystemDataiku
 
Dataiku - for Data Geek Paris@Criteo - Close the Data Circle
Dataiku  - for Data Geek Paris@Criteo - Close the Data CircleDataiku  - for Data Geek Paris@Criteo - Close the Data Circle
Dataiku - for Data Geek Paris@Criteo - Close the Data CircleDataiku
 
Data Disruption for Insurance - Perspective from th
Data Disruption for Insurance - Perspective from thData Disruption for Insurance - Perspective from th
Data Disruption for Insurance - Perspective from thDataiku
 
Dataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin BuzzwordsDataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin BuzzwordsDataiku
 
Dataiku - Paris JUG 2013 - Hadoop is a batch
Dataiku - Paris JUG 2013 - Hadoop is a batch Dataiku - Paris JUG 2013 - Hadoop is a batch
Dataiku - Paris JUG 2013 - Hadoop is a batch Dataiku
 

Plus de Dataiku (9)

Applied Data Science Part 3: Getting dirty; data preparation and feature crea...
Applied Data Science Part 3: Getting dirty; data preparation and feature crea...Applied Data Science Part 3: Getting dirty; data preparation and feature crea...
Applied Data Science Part 3: Getting dirty; data preparation and feature crea...
 
Applied Data Science Course Part 2: the data science workflow and basic model...
Applied Data Science Course Part 2: the data science workflow and basic model...Applied Data Science Course Part 2: the data science workflow and basic model...
Applied Data Science Course Part 2: the data science workflow and basic model...
 
04Juin2015_Symposium_Présentation_Coyote_Dataiku
04Juin2015_Symposium_Présentation_Coyote_Dataiku 04Juin2015_Symposium_Présentation_Coyote_Dataiku
04Juin2015_Symposium_Présentation_Coyote_Dataiku
 
Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015
Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015
Coyote & Dataiku - Séminaire Dixit GFII du 13 04-2015
 
Dataiku big data paris - the rise of the hadoop ecosystem
Dataiku   big data paris - the rise of the hadoop ecosystemDataiku   big data paris - the rise of the hadoop ecosystem
Dataiku big data paris - the rise of the hadoop ecosystem
 
Dataiku - for Data Geek Paris@Criteo - Close the Data Circle
Dataiku  - for Data Geek Paris@Criteo - Close the Data CircleDataiku  - for Data Geek Paris@Criteo - Close the Data Circle
Dataiku - for Data Geek Paris@Criteo - Close the Data Circle
 
Data Disruption for Insurance - Perspective from th
Data Disruption for Insurance - Perspective from thData Disruption for Insurance - Perspective from th
Data Disruption for Insurance - Perspective from th
 
Dataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin BuzzwordsDataiku Flow and dctc - Berlin Buzzwords
Dataiku Flow and dctc - Berlin Buzzwords
 
Dataiku - Paris JUG 2013 - Hadoop is a batch
Dataiku - Paris JUG 2013 - Hadoop is a batch Dataiku - Paris JUG 2013 - Hadoop is a batch
Dataiku - Paris JUG 2013 - Hadoop is a batch
 

Dernier

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Dernier (20)

How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

Dataiku - Big data paris 2015 - A Hybrid Platform, a Hybrid Team

  • 1. BIG DATA
 IN HYBRID WORLDS The Story of M
  • 2. I’m Florian CEO of Dataiku maker Data  Science  Studio,
 the « Photoshop for Data Science » COMMUNITY  EDITION  (it’s  FREE)     http://www.dataiku.com/dss/trynow/ H i ! React on twitter @fdouetteau #BigDataParis
  • 3. B i g o r S m a l l Startup Big Firm
  • 4. H O W D O P E O P L E TA K E D E C I S I O N S
  • 5. B U Y I N G D E C I S I O N S Should I buy it ?
  • 6. S O C I A L D E C I S I O N S Should I talk to him ?
  • 7. M LIKE MEETING B u s i n e s s D e c i s i o n s
  • 8. B u s i n e s s I n t e l l i g e n c e
  • 9. B u s i n e s s I n t e l l i g e n c e
  • 10. IN 2001 man (actually Gartner) invented big data Volume Variety Velocity
  • 11. WHAT IF THE META GROUP HAD CHOSEN ANOTHER LETTER? Capacity Complexity Celerity Size Serendipity Speed Big Blur Blazing
  • 15. M L I K E M E T R I C S How much does it cost to produce and maintain a metric ? How many metrics do I need ? Do I Follow the right metrics ? Do I Have enough data ? Do I Have enough Data?
  • 16. • Self-Service
 Build your own metrics • Analytical Capabilities
 Find your patterns
 • Large Volume
 Store it all M o r e M e t r i c s M e a n s M o r e M e a n s
  • 17. DATA MINING M o r e M e t r i c s M e a n s M o r e A p p l i c a t i o n Mission Critical Small Structured Large Diverse Sheer Curiosity Reporting for Finance in Any Industry Analyze Each Tweet Web Navigation
 For E-Merchant Ticket Data For Discounts in Retail Phone Call Logs for Security RTB Data For Advertising Customer Consumption For Anti-Churn in Utilities CLASSIC BI LARGE PRODUCTION PLATFORM DATA EXPLORATION Optimization Filings For Fraud in Insurance
  • 18. D DATA MINING TO DAY E A C H O W N A S I T S S TO R E Mission Critical Small Structured Large Diverse Sheer Curiosity CLASSIC BI LARGE PRODUCTION PLATFORM DATA EXPLORATION Optimization DATA WAREHOUSING DATA MINING REPOSITORIES DATA LAKE GOOGLE LIKE PLATFORM
  • 19. i t ’s n o t j u s t a b o u t t h e m e t r i c s
  • 20. DATA D R I V E N B U S I N E S S
  • 21. P r o b l e m i s t h e h u m a n Cannot take decisions in seconds Limited sight (100 rows) Limited short term memory (10k rows)?
  • 23. R i s e o f A I 1997 Deep Blue 2011 Watson’s Jeopardy 2012 Google Cat 2005 Autonomous Vehicule 1974 - 1993 AI Winters
  • 24. www.dataiku.com Churn Volume Forecast RecommenderSegmentation LifetimeValue Risk Score Hot Location Pricing Ranking FraudEvent Paths APPLICATIONS OF MACHINE LEARNING TO BUSINESS PROBLEMS
  • 25. P R E D I C T I V E M A I N CO N F O R T Z O N E Mission Critical Small Structured Large Diverse Sheer Curiosity Reporting for Finance in Any Industry Analyze Each Tweet Web Navigation
 For E-Merchant Ticket Data For Discounts in Retail Phone Call Logs for Security RTB Data For Advertising Customer Consumption For Anti-Churn in Utilities Optimization Filings For Fraud in Insurance Not Enough Data To Learn From ? Not Enough “Hard" Examples So that you can learn
  • 26.
  • 27. Welcome to Technoslavia Hadoop Ceph Sphere Cassandra Kafka Flume Spark Scikit-Learn GraphLAB prediction.io jubatus Mahout WEKA MLBase LibSVM RapidMiner Panda Kibana InfiniDB Drill Spark SQL Hive Impala … Elastic Search SOLR MongoDB Riak Membase Pig Cascading Talend Machine Learning Mystery LandScalability Central SQL Colunnar Republic Vizualization County Data Clean Wasteland Statistician Old House R Real-time island Storm NOSQL Nihiland
  • 28. E m b r a c e M a n y S k i l l s M a n y - S e t s Data Plumberer BI Manager Data Scientist Data Waiter Data Cleaner Business Analyst REAL JOB DREAM JOB
  • 29. • Reformulation de la recherche • Pas de réponse • Clic sur un pro • Top recherche • Clic de navigation ou filtre COMMENT AMÉLIORER LA PERTINENCE DE NOS RÉPONSES 
 VIA L’ANALYSE DU COMPORTEMENT UTILISATEUR ? 20 M Analyse & corrections automatisation >10 occurrences1,4M requêtes >200M recherches ✗ ✓ 0,5M requêtes priorisées
  • 30. "PREDICTIVE CONTENT MANAGEMENT” FROM PAGES JAUNES Machine Gestion Exploration pagesjaunes.fr Annuaire hadoop PIG+Hive Exportindexation Moteur d’interprétation crawl Autres référentiels Sickit-learn
  • 31. O p t i m i z i n g L a s t M i l e w i t h D a t a S c i e n c e S t u d i o Data Science Studio Historical delivery and retrieval data Modeling of a score for each delivery Cleaning and temporal enrichment of data Data aggregation by geographic location Incorporation of new deliveries to the existing model by
  • 32. E X P LO R E N E W W O R D S Mission Critical Small Structured Large Diverse Sheer Curiosity Optimization Optimize Existing BI Capabilities Build Mandatory Large Volume Capabilities EXPLORE POTENTIAL NOT BEING RELEVANT DANGER ZONE Analytics Predictive Self Service Cluster