Big Data LDN 2018
London | November 14th 2018
André Balleyguier
Chief Data Scientist EMEA at DataRobot
How automation can accelerate the
delivery of Machine Learning
© DataRobot, Inc. All rights reserved.
Who am I?
Data Scientist
London
Previously... @andrebalg
Talk Outline
1. Why do so many Machine Learning projects fail?
2. Automated ML: a new trend to
address the bottlenecks?
3. How can Automated ML help?
Beyond the buzz...
“Machine Learning is a core, transformative way by
which we’re rethinking how we’re doing everything.”
Sundar Pichai, CEO Google, 2015
… the Reality!
“80% of Data Science projects never
go to production!”
Some (a lot) of my customers
Most people in the field?
“We mainly build prototypes”
So many tools to make it easier!...
● Large communities
● Tools more accessible than ever
● Open-source driven abstraction of ML
Keras
R Caret
CatBoost
PaddlePaddle
LightGBM
FastText
Released in the last
year
… So why is it so hard?
The Machine Learning Life Cycle
In theory, a Machine Learning project is a simple iterative flow:
1. Discovery & Problem definition: Business problem, Value statement, Formulation into ML, Define usage
2. Data stuff: Source Identification, Munging, Wrangling, Ninja-ing...
3. Model development: Pre-processing, Feature engineering, Parameter tuning, Diagnostics, Validation, Model selection
4. Socialisation: Derive insights, Communicate results, Quantify ROI
5. Operationalise: Deploy into production, Documentation, Monitoring, Maintenance
1) Data munging, our favorite worst nightmare
[Life-cycle diagram: Discovery & Problem definition, Data stuff, Model development, Socialisation, Operationalise]
Data cleansing first, then modelling
No iterative loop?
“Blind Data Prep”
2) Deployment into “production”
[Life-cycle diagram: Discovery & Problem definition, Data stuff, Model development, Socialisation, Operationalise]
Initially on board, then... runs away…
➔ Cost of deployment is too high, low budget or low ROI
“At Microsoft, 8 of the 16 data scientists interviewed work or manage the
operationalization of predictive models”
Microsoft Research, 2015
3) The Data Science Hype
“Machine Learning development is like the raisins in a raisin bread: 1. You need
the bread first 2. It’s just a few tiny raisins but without it you would just have
plain bread.” Peter Norvig
Red flags to be careful about:
1. Deceitful hype
“Let’s analyse all our images with Deep Semantic Context-Aware Neural Nets to
refine our customer journey!!!”
2. No clear and measurable business objective
“This dataset looks interesting, let’s analyse it!”
“Most of the Economic value generated today utilizes supervised learning on
structured data, but unstructured stories make for better PR.”
Andrew Ng, Oct 6th 2017
4) Lack of business buy-in after prototype
[Life-cycle diagram: Discovery & Problem definition, Data stuff, Model development, Socialisation, Operationalise]
Initially on board. Hype? Then... runs away…
➔ No buy-in, low understanding
5) Wanted: Experienced Data Scientist
➔ Hard to find
➔ Hard to retain
➔ Highly-paid
Not scalable for multiple
end-to-end projects
To summarise...
Common bottlenecks for scaling Machine Learning:
1. Data cleansing is time-consuming and inefficient
2. Cost of model deployment is high
3. Deciphering the hype is not easy
4. Lack of business buy-in after the prototype stage
5. Skilled data scientists are necessary, but hard to
find/retain
How to address those issues?
Automation
AutoML: A growing trend
● May 2017: announcement of Google AutoML at Google I/O
What is AutoML?
“AutoML is the automated process of algorithm selection,
hyperparameter tuning, iterative modeling, and model assessment.”
● Model selection & tuning: often based on search or genetic algorithms to efficiently select
● Workflow automation + Model Selection: also automates model analysis, feature engineering, deployment...
● Task specific: Google AutoML, Deep Learning structures
Auto-Weka
Will be based more and more on metalearning: using ML to build ML!
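To make the definition above concrete, here is a minimal sketch of automated algorithm selection and hyperparameter tuning. It is only an illustration: the toy data, the three model families and the names `SEARCH_SPACE`, `cv_mse` and `auto_select` are assumptions made for this example, and it uses a plain exhaustive search with cross-validation rather than the genetic or metalearning approaches mentioned on the slide.

```python
import random
from statistics import mean


def make_data(n=200, seed=0):
    """Toy regression data (hypothetical): y = x^2 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3, 3) for _ in range(n)]
    return [(x, x * x + rng.gauss(0, 0.3)) for x in xs]


def fit_mean(train, _param):
    """Baseline: always predict the training mean."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_linear(train, _param):
    """Ordinary least squares with a single feature."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def fit_knn(train, k):
    """k-nearest neighbours: average the targets of the k closest points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict


# Each entry: (model family, fitting function, hyperparameter values to try).
SEARCH_SPACE = [
    ("mean", fit_mean, [None]),
    ("linear", fit_linear, [None]),
    ("knn", fit_knn, [1, 3, 5, 10, 25]),
]


def cv_mse(fit, param, data, folds=5):
    """Model assessment: mean squared error averaged over k folds."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [p for j, p in enumerate(data) if j % folds != i]
        model = fit(train, param)
        scores.append(mean((model(x) - y) ** 2 for x, y in test))
    return mean(scores)


def auto_select(data):
    """Try every family and hyperparameter; return a leaderboard, best first."""
    results = []
    for family, fit, params in SEARCH_SPACE:
        for param in params:
            results.append({"family": family, "param": param,
                            "mse": cv_mse(fit, param, data)})
    return sorted(results, key=lambda r: r["mse"])


leaderboard = auto_select(make_data())
best = leaderboard[0]
print(best["family"], best["param"], round(best["mse"], 3))
```

The person running this still chooses the data, the target and the error metric; only the repetitive search over models is automated.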
Where can you automate Machine Learning?
[Life-cycle diagram: Discovery & Problem definition, Data stuff, Model development, Socialisation, Operationalise]
The power of iterations
[Life-cycle diagram: Discovery & Problem definition, Data stuff, Model, Socialisation, Operationalise]
Spend less time on repetitive tasks:
➔ More iterations = better exploration of the problem, more useful models
➔ Faster iterations = more regular feedback from business
How does automation help with data prep?
“Blind Data Prep” “Informed Data Prep”
Automation can reduce the time to build models, diagnose a
model and derive insights:
Proposed approach:
1. Build a first model quickly with only a few key data columns
2. Use the insights, diagnostics and learning to rectify the model and pre-process the data accordingly
3. Iterate efficiently, learn from your mistakes rapidly
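One possible reading of "informed data prep" can be sketched as a cheap first pass that ranks raw columns by how much signal they carry, so cleaning effort goes where it matters. Everything here is an assumption for illustration: the synthetic customer data, the column names, and the choice of absolute Pearson correlation as a stand-in for the richer diagnostics an automated tool would give.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either side is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0


def column_report(rows, target):
    """For each column: share of missing values and signal against the target."""
    report = []
    for col in rows[0]:
        if col == target:
            continue
        pairs = [(r[col], r[target]) for r in rows if r[col] is not None]
        missing = 1 - len(pairs) / len(rows)
        signal = 0.0
        if len(pairs) > 1:
            signal = abs(pearson([p[0] for p in pairs], [p[1] for p in pairs]))
        report.append({"column": col,
                       "missing": round(missing, 2),
                       "signal": round(signal, 2)})
    # Strongest signal first: these columns deserve cleaning effort first.
    return sorted(report, key=lambda r: r["signal"], reverse=True)


# Hypothetical raw data: "income" drives "spend" but is often missing.
rng = random.Random(1)
rows = []
for _ in range(300):
    income = rng.uniform(20, 100)
    age = rng.uniform(18, 70)
    noise_col = rng.random()
    spend = 0.5 * income + rng.gauss(0, 5)
    rows.append({
        "income": income if rng.random() > 0.3 else None,
        "age": age,
        "noise_col": noise_col if rng.random() > 0.5 else None,
        "spend": spend,
    })

report = column_report(rows, "spend")
for line in report:
    print(line)
```

In this toy case the report suggests spending time on the missing "income" values and none on "noise_col", which is the opposite of cleaning every column blindly before the first model exists.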
Model deployment and maintenance
Workflow automation + Microservices
Automating model development allows:
1. Model refresh automation
2. More efficient monitoring of dataset
shift (model drift)
3. Creation of end-to-end workflows
with ETL pipelines
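As an illustration of automated dataset-shift monitoring, the sketch below computes the Population Stability Index (PSI) between the data a model was trained on and the data it now scores. PSI is a technique swapped in for this example, not something the slides name; the function `psi`, the synthetic samples and the 0.1 / 0.25 alert levels (a common rule of thumb) are all assumptions.

```python
import math
import random


def psi(expected, actual, bins=10):
    """Population Stability Index of `actual` against `expected`.

    Bins are equal-width over the range of `expected`; values outside
    that range fall into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


rng = random.Random(42)
train = [rng.gauss(0, 1) for _ in range(5000)]          # data seen at training time
live_same = [rng.gauss(0, 1) for _ in range(5000)]      # production data, no shift
live_shifted = [rng.gauss(1, 1) for _ in range(5000)]   # production data, mean moved

psi_same = psi(train, live_same)
psi_shifted = psi(train, live_shifted)

# Rule of thumb (assumption): < 0.1 stable, > 0.25 significant shift.
print("no shift:", round(psi_same, 4))
print("shifted: ", round(psi_shifted, 4))
```

A scheduled job could run this check per feature and trigger the automated model refresh when the index crosses the chosen threshold.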
Generic abstraction of every model:
each model can be controlled and
deployed as a prediction microservice.
➔ REST APIs
➔ Generic applications (e.g. Spark)
➔ Portable code generation
“Prediction Store”
“Model Factory”
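The idea of exposing a model as a prediction microservice behind a REST API can be sketched with the Python standard library alone. This is not DataRobot's API: the `/predict` route, the fixed coefficients standing in for a trained model, and the names `predict`, `PredictionHandler` and `make_server` are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "model": fixed linear coefficients standing in for any trained model.
COEFFICIENTS = {"income": 0.5, "age": -0.25}
INTERCEPT = 2.0


def predict(features):
    """Score one record; features that are absent default to 0."""
    return INTERCEPT + sum(weight * float(features.get(name, 0.0))
                           for name, weight in COEFFICIENTS.items())


class PredictionHandler(BaseHTTPRequestHandler):
    """Accepts POST /predict with a JSON object of numeric features."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            features = json.loads(self.rfile.read(length))
            body = json.dumps({"prediction": predict(features)}).encode()
        except (ValueError, TypeError, AttributeError):
            self.send_error(400, "expected a JSON object of numeric features")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8080):
    """Build (but do not start) the server; call .serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), PredictionHandler)


# The scoring logic can be exercised without any network:
print(predict({"income": 40, "age": 30}))
```

Because callers only see the HTTP contract, the model behind `predict` can be refreshed or replaced without changing the applications that consume it.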
AutoML helping with the hype
➔ Fail fast to increase opportunities and focus on the projects with
measurable impact
➔ Enable more business users to use ML to identify the right use cases
Model socialisation
Faster iteration cycle
=
More regular feedback from the stakeholders
=
More interventions from the business
=
Better stickiness and buy-in
This approach fits perfectly into an “Agile Data Science”
methodology.
The Data Science experience problem
With automation: 100s of models deployed
Without automation: 10 models deployed
Will AutoML take my job?
NO!
● Machine Learning is task-oriented: an individual is still needed to craft the
task and understand the data, the domain...
● Reduces the time spent on “grunt work”
● Paradigm shift: closer to the business, closer to the actual value.
Let’s embrace automation!
Thanks!

Contenu connexe

Tendances

H2O World - Learning How Humans and Non-Humans Interact with Digital Ads
H2O World - Learning How Humans and Non-Humans Interact with Digital AdsH2O World - Learning How Humans and Non-Humans Interact with Digital Ads
H2O World - Learning How Humans and Non-Humans Interact with Digital AdsSri Ambati
 
The Machine Learning Audit. MIS ITAC 2017 Keynote
The Machine Learning Audit. MIS ITAC 2017 KeynoteThe Machine Learning Audit. MIS ITAC 2017 Keynote
The Machine Learning Audit. MIS ITAC 2017 KeynoteAndrew Clark
 
Intelligent Big Data analytics for the future.
Intelligent Big Data analytics for the future.Intelligent Big Data analytics for the future.
Intelligent Big Data analytics for the future.Shashank Garg
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
 
20151016 Data Science For Project Managers
20151016 Data Science For Project Managers20151016 Data Science For Project Managers
20151016 Data Science For Project ManagersTze-Yiu Yong
 
Towards Human-Centered Machine Learning
Towards Human-Centered Machine LearningTowards Human-Centered Machine Learning
Towards Human-Centered Machine LearningSri Ambati
 
2019 CDM CIO Summit AI Driven Development
2019 CDM CIO Summit AI Driven Development2019 CDM CIO Summit AI Driven Development
2019 CDM CIO Summit AI Driven DevelopmentChandra Gundlapalli
 
How to build a data science team 20115.03.13v6
How to build a data science team 20115.03.13v6How to build a data science team 20115.03.13v6
How to build a data science team 20115.03.13v6Zhihao Lin
 
From Data to AI with the ML Canvas
From Data to AI with the ML CanvasFrom Data to AI with the ML Canvas
From Data to AI with the ML CanvasAlexandra Petruș
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceLivePerson
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneySri Ambati
 
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017StampedeCon
 
H2O World - Machine Learning at Comcast - Andrew Leamon & Chushi Ren
H2O World - Machine Learning at Comcast - Andrew Leamon & Chushi RenH2O World - Machine Learning at Comcast - Andrew Leamon & Chushi Ren
H2O World - Machine Learning at Comcast - Andrew Leamon & Chushi RenSri Ambati
 
Your AI Transformation
Your AI Transformation Your AI Transformation
Your AI Transformation Sri Ambati
 
GeeCon Prague 2018 - A Practical-ish Introduction to Data Science
GeeCon Prague 2018 - A Practical-ish Introduction to Data ScienceGeeCon Prague 2018 - A Practical-ish Introduction to Data Science
GeeCon Prague 2018 - A Practical-ish Introduction to Data ScienceMark West
 
Machine learning disrupting car insurance industry
Machine learning disrupting car insurance industryMachine learning disrupting car insurance industry
Machine learning disrupting car insurance industryRudradeb Mitra
 
Weiyan Zhao, Nationwide Insurance - A Decade of Data Science. The Nationwide ...
Weiyan Zhao, Nationwide Insurance - A Decade of Data Science. The Nationwide ...Weiyan Zhao, Nationwide Insurance - A Decade of Data Science. The Nationwide ...
Weiyan Zhao, Nationwide Insurance - A Decade of Data Science. The Nationwide ...Sri Ambati
 
Machine Learning for Auditors: What you need to know - ISACA North America CA...
Machine Learning for Auditors: What you need to know - ISACA North America CA...Machine Learning for Auditors: What you need to know - ISACA North America CA...
Machine Learning for Auditors: What you need to know - ISACA North America CA...Andrew Clark
 
Machine Learning and AI in Risk Management
Machine Learning and AI in Risk ManagementMachine Learning and AI in Risk Management
Machine Learning and AI in Risk ManagementQuantUniversity
 
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017StampedeCon
 

Tendances (20)

H2O World - Learning How Humans and Non-Humans Interact with Digital Ads
H2O World - Learning How Humans and Non-Humans Interact with Digital AdsH2O World - Learning How Humans and Non-Humans Interact with Digital Ads
H2O World - Learning How Humans and Non-Humans Interact with Digital Ads
 
The Machine Learning Audit. MIS ITAC 2017 Keynote
The Machine Learning Audit. MIS ITAC 2017 KeynoteThe Machine Learning Audit. MIS ITAC 2017 Keynote
The Machine Learning Audit. MIS ITAC 2017 Keynote
 
Intelligent Big Data analytics for the future.
Intelligent Big Data analytics for the future.Intelligent Big Data analytics for the future.
Intelligent Big Data analytics for the future.
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
 
20151016 Data Science For Project Managers
20151016 Data Science For Project Managers20151016 Data Science For Project Managers
20151016 Data Science For Project Managers
 
Towards Human-Centered Machine Learning
Towards Human-Centered Machine LearningTowards Human-Centered Machine Learning
Towards Human-Centered Machine Learning
 
2019 CDM CIO Summit AI Driven Development
2019 CDM CIO Summit AI Driven Development2019 CDM CIO Summit AI Driven Development
2019 CDM CIO Summit AI Driven Development
 
How to build a data science team 20115.03.13v6
How to build a data science team 20115.03.13v6How to build a data science team 20115.03.13v6
How to build a data science team 20115.03.13v6
 
From Data to AI with the ML Canvas
From Data to AI with the ML CanvasFrom Data to AI with the ML Canvas
From Data to AI with the ML Canvas
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation Journey
 
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
Automated AI The Next Frontier in Analytics - StampedeCon AI Summit 2017
 
H2O World - Machine Learning at Comcast - Andrew Leamon & Chushi Ren
H2O World - Machine Learning at Comcast - Andrew Leamon & Chushi RenH2O World - Machine Learning at Comcast - Andrew Leamon & Chushi Ren
H2O World - Machine Learning at Comcast - Andrew Leamon & Chushi Ren
 
Your AI Transformation
Your AI Transformation Your AI Transformation
Your AI Transformation
 
GeeCon Prague 2018 - A Practical-ish Introduction to Data Science
GeeCon Prague 2018 - A Practical-ish Introduction to Data ScienceGeeCon Prague 2018 - A Practical-ish Introduction to Data Science
GeeCon Prague 2018 - A Practical-ish Introduction to Data Science
 
Machine learning disrupting car insurance industry
Machine learning disrupting car insurance industryMachine learning disrupting car insurance industry
Machine learning disrupting car insurance industry
 
Weiyan Zhao, Nationwide Insurance - A Decade of Data Science. The Nationwide ...
Weiyan Zhao, Nationwide Insurance - A Decade of Data Science. The Nationwide ...Weiyan Zhao, Nationwide Insurance - A Decade of Data Science. The Nationwide ...
Weiyan Zhao, Nationwide Insurance - A Decade of Data Science. The Nationwide ...
 
Machine Learning for Auditors: What you need to know - ISACA North America CA...
Machine Learning for Auditors: What you need to know - ISACA North America CA...Machine Learning for Auditors: What you need to know - ISACA North America CA...
Machine Learning for Auditors: What you need to know - ISACA North America CA...
 
Machine Learning and AI in Risk Management
Machine Learning and AI in Risk ManagementMachine Learning and AI in Risk Management
Machine Learning and AI in Risk Management
 
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017AI in the Enterprise: Past,  Present &  Future - StampedeCon AI Summit 2017
AI in the Enterprise: Past, Present & Future - StampedeCon AI Summit 2017
 

Similaire à Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEARNING

IBM i & Data Science in the AI era.
IBM i & Data Science in the AI era.  IBM i & Data Science in the AI era.
IBM i & Data Science in the AI era. Benoit Marolleau
 
The future of FinTech product using pervasive Machine Learning automation - A...
The future of FinTech product using pervasive Machine Learning automation - A...The future of FinTech product using pervasive Machine Learning automation - A...
The future of FinTech product using pervasive Machine Learning automation - A...Shift Conference
 
EVAIN Artificial intelligence and semantic annotation: are you serious about it?
EVAIN Artificial intelligence and semantic annotation: are you serious about it?EVAIN Artificial intelligence and semantic annotation: are you serious about it?
EVAIN Artificial intelligence and semantic annotation: are you serious about it?FIAT/IFTA
 
Putting data science in your business a first utility feedback
Putting data science in your business a first utility feedbackPutting data science in your business a first utility feedback
Putting data science in your business a first utility feedbackPeculium Crypto
 
Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"Diego Oppenheimer
 
EDW 2015 cognitive computing panel session
EDW 2015 cognitive computing panel session EDW 2015 cognitive computing panel session
EDW 2015 cognitive computing panel session Steve Ardire
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine LearningYuriy Guts
 
Top Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama SoftwareTop Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama SoftwarePanorama Software
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the tradeFangda Wang
 
Machine Learning for Finance Master Class
Machine Learning for Finance Master Class Machine Learning for Finance Master Class
Machine Learning for Finance Master Class QuantUniversity
 
SDD2017 - 03 Abed Ajraou - putting data science in your business a first uti...
SDD2017 - 03 Abed Ajraou  - putting data science in your business a first uti...SDD2017 - 03 Abed Ajraou  - putting data science in your business a first uti...
SDD2017 - 03 Abed Ajraou - putting data science in your business a first uti...Dario Mangano
 
AI in Business: Opportunities & Challenges
AI in Business: Opportunities & ChallengesAI in Business: Opportunities & Challenges
AI in Business: Opportunities & ChallengesTathagat Varma
 
How to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOHow to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOProduct School
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTrivadis
 
Whats Next for Machine Learning
Whats Next for Machine LearningWhats Next for Machine Learning
Whats Next for Machine LearningOgilvy Consulting
 
Why Everything You Know About bigdata Is A Lie
Why Everything You Know About bigdata Is A LieWhy Everything You Know About bigdata Is A Lie
Why Everything You Know About bigdata Is A LieSunil Ranka
 
Agile data science
Agile data scienceAgile data science
Agile data scienceJoel Horwitz
 
How Machine Learning Will Transform Finance
How Machine Learning Will Transform FinanceHow Machine Learning Will Transform Finance
How Machine Learning Will Transform FinanceRich Clayton
 

Similaire à Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEARNING (20)

IBM i & Data Science in the AI era.
IBM i & Data Science in the AI era.  IBM i & Data Science in the AI era.
IBM i & Data Science in the AI era.
 
The future of FinTech product using pervasive Machine Learning automation - A...
The future of FinTech product using pervasive Machine Learning automation - A...The future of FinTech product using pervasive Machine Learning automation - A...
The future of FinTech product using pervasive Machine Learning automation - A...
 
EVAIN Artificial intelligence and semantic annotation: are you serious about it?
EVAIN Artificial intelligence and semantic annotation: are you serious about it?EVAIN Artificial intelligence and semantic annotation: are you serious about it?
EVAIN Artificial intelligence and semantic annotation: are you serious about it?
 
Putting data science in your business a first utility feedback
Putting data science in your business a first utility feedbackPutting data science in your business a first utility feedback
Putting data science in your business a first utility feedback
 
Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"
 
Ezml Stanford 2015
Ezml Stanford 2015Ezml Stanford 2015
Ezml Stanford 2015
 
EDW 2015 cognitive computing panel session
EDW 2015 cognitive computing panel session EDW 2015 cognitive computing panel session
EDW 2015 cognitive computing panel session
 
Data is not the new snake oil
Data is not the new snake oilData is not the new snake oil
Data is not the new snake oil
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine Learning
 
Top Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama SoftwareTop Business Intelligence Trends for 2016 by Panorama Software
Top Business Intelligence Trends for 2016 by Panorama Software
 
Data science tools of the trade
Data science tools of the tradeData science tools of the trade
Data science tools of the trade
 
Machine Learning for Finance Master Class
Machine Learning for Finance Master Class Machine Learning for Finance Master Class
Machine Learning for Finance Master Class
 
SDD2017 - 03 Abed Ajraou - putting data science in your business a first uti...
SDD2017 - 03 Abed Ajraou  - putting data science in your business a first uti...SDD2017 - 03 Abed Ajraou  - putting data science in your business a first uti...
SDD2017 - 03 Abed Ajraou - putting data science in your business a first uti...
 
AI in Business: Opportunities & Challenges
AI in Business: Opportunities & ChallengesAI in Business: Opportunities & Challenges
AI in Business: Opportunities & Challenges
 
How to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPOHow to Build an AI/ML Product and Sell it by SalesChoice CPO
How to Build an AI/ML Product and Sell it by SalesChoice CPO
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
 
Whats Next for Machine Learning
Whats Next for Machine LearningWhats Next for Machine Learning
Whats Next for Machine Learning
 
Why Everything You Know About bigdata Is A Lie
Why Everything You Know About bigdata Is A LieWhy Everything You Know About bigdata Is A Lie
Why Everything You Know About bigdata Is A Lie
 
Agile data science
Agile data scienceAgile data science
Agile data science
 
How Machine Learning Will Transform Finance
How Machine Learning Will Transform FinanceHow Machine Learning Will Transform Finance
How Machine Learning Will Transform Finance
 

Plus de Matt Stubbs

Blueprint Series: Banking In The Cloud – Ultra-high Reliability Architectures
Blueprint Series: Banking In The Cloud – Ultra-high Reliability ArchitecturesBlueprint Series: Banking In The Cloud – Ultra-high Reliability Architectures
Blueprint Series: Banking In The Cloud – Ultra-high Reliability ArchitecturesMatt Stubbs
 
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...
Speed Up Your Apache Cassandra™ Applications: A Practical Guide to Reactive P...Matt Stubbs
 
Blueprint Series: Expedia Partner Solutions, Data Platform
Blueprint Series: Expedia Partner Solutions, Data PlatformBlueprint Series: Expedia Partner Solutions, Data Platform
Blueprint Series: Expedia Partner Solutions, Data PlatformMatt Stubbs
 
Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...
Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...
Blueprint Series: Architecture Patterns for Implementing Serverless Microserv...Matt Stubbs
 
Big Data LDN 2018: DATA, WHAT PEOPLE THINK AND WHAT YOU CAN DO TO BUILD TRUST.
Big Data LDN 2018: DATA, WHAT PEOPLE THINK AND WHAT YOU CAN DO TO BUILD TRUST.Big Data LDN 2018: DATA, WHAT PEOPLE THINK AND WHAT YOU CAN DO TO BUILD TRUST.
Big Data LDN 2018: DATA, WHAT PEOPLE THINK AND WHAT YOU CAN DO TO BUILD TRUST.Matt Stubbs
 
Big Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCE
Big Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCEBig Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCE
Big Data LDN 2018: DATABASE FOR THE INSTANT EXPERIENCEMatt Stubbs
 
Big Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQL
Big Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQLBig Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQL
Big Data LDN 2018: BIG DATA TOO SLOW? SPRINKLE IN SOME NOSQLMatt Stubbs
 
Big Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTS
Big Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTSBig Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTS
Big Data LDN 2018: ENABLING DATA-DRIVEN DECISIONS WITH AUTOMATED INSIGHTSMatt Stubbs
 
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...
Big Data LDN 2018: DATA MANAGEMENT AUTOMATION AND THE INFORMATION SUPPLY CHAI...Matt Stubbs
 
Big Data LDN 2018: AI VS. GDPR
Big Data LDN 2018: AI VS. GDPRBig Data LDN 2018: AI VS. GDPR
Big Data LDN 2018: AI VS. GDPRMatt Stubbs
 
Big Data LDN 2018: REALISING THE PROMISE OF SELF-SERVICE ANALYTICS WITH DATA ...
Big Data LDN 2018: REALISING THE PROMISE OF SELF-SERVICE ANALYTICS WITH DATA ...Big Data LDN 2018: REALISING THE PROMISE OF SELF-SERVICE ANALYTICS WITH DATA ...
Big Data LDN 2018: REALISING THE PROMISE OF SELF-SERVICE ANALYTICS WITH DATA ...Matt Stubbs
 
Big Data LDN 2018: TURNING MULTIPLE DATA LAKES INTO A UNIFIED ANALYTIC DATA L...
Big Data LDN 2018: TURNING MULTIPLE DATA LAKES INTO A UNIFIED ANALYTIC DATA L...Big Data LDN 2018: TURNING MULTIPLE DATA LAKES INTO A UNIFIED ANALYTIC DATA L...
Big Data LDN 2018: TURNING MULTIPLE DATA LAKES INTO A UNIFIED ANALYTIC DATA L...Matt Stubbs
 
Big Data LDN 2018: MICROSOFT AZURE AND CLOUDERA – FLEXIBLE CLOUD, WHATEVER TH...
Big Data LDN 2018: MICROSOFT AZURE AND CLOUDERA – FLEXIBLE CLOUD, WHATEVER TH...Big Data LDN 2018: MICROSOFT AZURE AND CLOUDERA – FLEXIBLE CLOUD, WHATEVER TH...
Big Data LDN 2018: MICROSOFT AZURE AND CLOUDERA – FLEXIBLE CLOUD, WHATEVER TH...Matt Stubbs
 
Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...
Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...
Big Data LDN 2018: CONSISTENT SECURITY, GOVERNANCE AND FLEXIBILITY FOR ALL WO...Matt Stubbs
 
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICSBig Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICSMatt Stubbs
 
Big Data LDN 2018: EXPERIAN: MAXIMISE EVERY OPPORTUNITY IN THE BIG DATA UNIVERSE
Big Data LDN 2018: EXPERIAN: MAXIMISE EVERY OPPORTUNITY IN THE BIG DATA UNIVERSEBig Data LDN 2018: EXPERIAN: MAXIMISE EVERY OPPORTUNITY IN THE BIG DATA UNIVERSE
Big Data LDN 2018: EXPERIAN: MAXIMISE EVERY OPPORTUNITY IN THE BIG DATA UNIVERSEMatt Stubbs
 
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNINGBig Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNING
Big Data LDN 2018: A LOOK INSIDE APPLIED MACHINE LEARNINGMatt Stubbs
 
Big Data LDN 2018: DEUTSCHE BANK: THE PATH TO AUTOMATION IN A HIGHLY REGULATE...
Big Data LDN 2018: DEUTSCHE BANK: THE PATH TO AUTOMATION IN A HIGHLY REGULATE...Big Data LDN 2018: DEUTSCHE BANK: THE PATH TO AUTOMATION IN A HIGHLY REGULATE...
Big Data LDN 2018: DEUTSCHE BANK: THE PATH TO AUTOMATION IN A HIGHLY REGULATE...Matt Stubbs
 
Big Data LDN 2018: FROM PROLIFERATION TO PRODUCTIVITY: MACHINE LEARNING DATA ...
Big Data LDN 2018: FROM PROLIFERATION TO PRODUCTIVITY: MACHINE LEARNING DATA ...Big Data LDN 2018: FROM PROLIFERATION TO PRODUCTIVITY: MACHINE LEARNING DATA ...
Big Data LDN 2018: FROM PROLIFERATION TO PRODUCTIVITY: MACHINE LEARNING DATA ...Matt Stubbs
 
Big Data LDN 2018: DATA APIS DON’T DISCRIMINATE
Big Data LDN 2018: DATA APIS DON’T DISCRIMINATEBig Data LDN 2018: DATA APIS DON’T DISCRIMINATE
Big Data LDN 2018: DATA APIS DON’T DISCRIMINATEMatt Stubbs
 

Big Data LDN 2018: HOW AUTOMATION CAN ACCELERATE THE DELIVERY OF MACHINE LEARNING

  • 1. Big Data LDN 2018 London | November 14th 2018 André Balleyguier Chief Data Scientist EMEA at DataRobot How automation can accelerate the delivery of Machine Learning
  • 2. © DataRobot, Inc. All rights reserved. Who am I? Data Scientist London Previously... @andrebalg
  • 3. Talk Outline
      1. Why do so many Machine Learning projects fail?
      2. Automated ML: a new trend to address the bottlenecks?
      3. How can Automated ML help?
  • 4. Beyond the buzz... “Machine Learning is a core, transformative way by which we’re rethinking how we’re doing everything.” (Sundar Pichai, CEO Google, 2015)
  • 5. … the Reality!
      “80% of Data Science projects never go to production!”
      Some (a lot) of my customers
      Most people in the field?
      “We mainly build prototypes”
  • 6. So many tools to make it easier!...
      ● Large communities
      ● Tools more accessible than ever
      ● Open-source driven abstraction of ML
      Keras, R Caret, CatBoost, PaddlePaddle, LightGBM, FastText
      Released in the last year
  • 7. … So why is it so hard?
  • 8. The Machine Learning Life Cycle. In theory, a Machine Learning project is a simple iterative flow:
      Discovery & Problem definition: Business problem, Value statement, Formulation into ML, Define usage
      Data stuff: Source Identification, Munging, Wrangling, Ninja-ing...
      Model development: Pre-processing, Feature engineering, Parameter tuning, Diagnostics, Validation, Model selection
      Socialisation: Derive insights, Communicate results, Quantify ROI
      Operationalise: Deploy into production, Documentation, Monitoring, Maintenance
  • 9. 1) Data munging, our favorite worst nightmare
      Data cleansing first, then modelling
      No iterative loop?
      “Blind Data Prep”
      [Life cycle diagram: Discovery & Problem definition, Data stuff, Model development, Socialisation, Operationalise]
  • 10. 2) Deployment into “production”
      Initially on board, then... runs away…
      ➔ Cost of deployment is too high, low budget or low ROI
      “At Microsoft, 8 of the 16 data scientists interviewed work or manage the operationalization of predictive models” (Microsoft Research, 2015)
      [Life cycle diagram: Discovery & Problem definition, Data stuff, Model development, Socialisation, Operationalise]
  • 11. 3) The Data Science Hype
      “Machine Learning development is like the raisins in a raisin bread: 1. You need the bread first 2. It’s just a few tiny raisins but without it you would just have plain bread.” (Peter Norvig)
      Red flags to be careful about:
      1. Deceitful hype: “Let’s analyse all our images with Deep Semantic Context-Aware Neural Nets to refine our customer journey!!!”
      2. No clear and measurable business objective: “This dataset looks interesting, let’s analyse it!”
      “Most of the Economic value generated today utilizes supervised learning on structured data, but unstructured stories make for better PR.” (Andrew Ng, Oct 6th 2017)
  • 12. 4) Lack of business buy-in after prototype
      Initially on board (hype?), then... runs away…
      ➔ No buy-in, low understanding
      [Life cycle diagram: Discovery & Problem definition, Data stuff, Model development, Socialisation, Operationalise]
  • 13. 5) Wanted: Experienced Data Scientist
      ➔ Hard to find
      ➔ Hard to retain
      ➔ Highly paid
      Not scalable for multiple end-to-end projects
  • 14. To summarise... Common bottlenecks for scaling Machine Learning:
      1. Data cleansing is time-consuming and inefficient
      2. Cost of model deployment is high
      3. Deciphering the hype is not easy
      4. Lack of business buy-in after the prototype stage
      5. Skilled data scientists are necessary, but hard to find/retain
      How to address those issues?
  • 16. AutoML: A growing trend
      ● May 2017: announcement of Google AutoML at Google I/O
  • 17. What is AutoML? “AutoML is the automated process of algorithm selection, hyperparameter tuning, iterative modeling, and model assessment.”
      Model selection & tuning: Often based on search or genetic algorithms to efficiently select
      Workflow automation + Model Selection: Also automates model analysis, feature engineering, deployment...
      Task specific: Google AutoML, Deep Learning structures
      Auto-Weka
      Will be based more and more on metalearning: using ML to build ML!
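The definition on slide 17 (algorithm selection plus hyperparameter tuning, driven by a search procedure) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy one-dimensional dataset, the two hand-written "model families" and the helper names (`make_data`, `fit_threshold`, `fit_knn`, `auto_select`) are hypothetical and are not taken from the talk or from any AutoML product. Plain random search stands in for the "search or genetic algorithms" the slide mentions.

```python
import random

def make_data(n, rng):
    # Toy 1-D binary problem (assumed for illustration): label is 1 when x > 0.6.
    xs = [rng.random() for _ in range(n)]
    return [(x, int(x > 0.6)) for x in xs]

def fit_threshold(train, threshold):
    # Model family A: predict 1 above a fixed threshold (the hyperparameter).
    return lambda x: int(x > threshold)

def fit_knn(train, k):
    # Model family B: k-nearest-neighbour majority vote (k is the hyperparameter).
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return int(2 * votes > len(nearest))
    return predict

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def auto_select(train, valid, n_trials, rng):
    # Random search over (algorithm, hyperparameter) pairs, scored on held-out data.
    search_space = {
        "threshold": (fit_threshold, lambda: rng.uniform(0.0, 1.0)),
        "knn": (fit_knn, lambda: rng.choice([1, 3, 5, 7])),
    }
    best = None
    for _ in range(n_trials):
        name = rng.choice(sorted(search_space))
        fit, sample = search_space[name]
        param = sample()
        score = accuracy(fit(train, param), valid)
        if best is None or score > best[0]:
            best = (score, name, param)
    return best

rng = random.Random(0)
train, valid = make_data(200, rng), make_data(100, rng)
best_score, best_name, best_param = auto_select(train, valid, n_trials=20, rng=rng)
print(best_name, best_param, round(best_score, 3))
```

The point of the sketch is only the shape of the loop: sample a candidate, fit it, assess it on validation data, keep the best. A real system would replace each piece with something far more capable.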
  • 18. Where can you automate Machine Learning?
      Discovery & Problem definition: Business problem, Value statement, Formulation into ML, Define usage
      Data stuff: Source Identification, Munging, Wrangling, Ninja-ing...
      Model development: Pre-processing, Feature engineering, Parameter tuning, Diagnostics, Validation, Model selection
      Socialisation: Derive insights, Communicate results, Quantify ROI
      Operationalise: Deploy to production, Documentation, Monitoring, Maintenance
  • 19. The power of iterations
      Discovery & Problem definition: Business problem, Value statement, Formulation into ML, Define usage
      Data stuff: Source Identification, Munging, Wrangling, Ninja-ing...
      Model: Pre-processing, Feature engineering, Parameter tuning, Diagnostics, Validation, Model selection
      Socialisation: Derive insights, Communicate results, Quantify ROI
      Operationalise: Deploy to production, Documentation, Monitoring, Maintenance
      Spend less time on repetitive tasks:
      ➔ More iterations = better exploration of the problem, more useful models
      ➔ Faster iterations = more regular feedback from business
  • 20. How does automation help with data prep? “Blind Data Prep” / “Informed Data Prep”
      Automation can reduce the time needed to build models, diagnose them and derive insights.
      Proposed approach:
      1. Build a first model quickly with only a few key data columns
      2. Use the insights, diagnostics and learning to rectify the model and pre-process the data accordingly
      3. Iterate efficiently, learn from your mistakes rapidly
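The three-step approach on slide 20 can be sketched as a small loop: start from one key column, score a quick model, then let the validation score decide whether each further column is worth adding. This is a hedged illustration only. The synthetic data, the nearest-centroid model, the use of validation accuracy as the sole "diagnostic" and the names `nearest_centroid` and `iterate` are all assumptions introduced here, not the speaker's method.

```python
import random

def nearest_centroid(train, cols):
    # Quick baseline model: one centroid per class, using only the chosen columns.
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    def predict(x):
        def dist(label):
            return sum((x[c] - m) ** 2 for c, m in zip(cols, centroids[label]))
        return min(centroids, key=dist)
    return predict

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def iterate(train, valid, start_cols, candidates):
    # Step 1: first model on a few key columns.
    cols = list(start_cols)
    best = accuracy(nearest_centroid(train, cols), valid)
    history = [(tuple(cols), best)]
    # Steps 2-3: use the diagnostic (validation accuracy) to decide what to keep.
    for c in candidates:
        score = accuracy(nearest_centroid(train, cols + [c]), valid)
        if score > best:
            cols, best = cols + [c], score
        history.append((tuple(cols), best))
    return cols, best, history

def make_rows(n, rng):
    # Synthetic data (assumed): columns 0 and 1 carry signal, column 2 is noise.
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append(([y + rng.gauss(0, 1), y + rng.gauss(0, 1), rng.gauss(0, 1)], y))
    return rows

rng = random.Random(1)
train, valid = make_rows(300, rng), make_rows(200, rng)
final_cols, final_score, history = iterate(train, valid, start_cols=[0], candidates=[1, 2])
for cols, score in history:
    print(cols, round(score, 3))
```

Each pass is cheap, so a mistake (for example, a column that adds nothing) shows up in the history after one iteration rather than after a long up-front cleansing phase.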
  • 21. Model deployment and maintenance: Workflow automation + Microservices
      Automating model development allows:
      1. Model refresh automation
      2. More efficient monitoring of dataset shift (model drift)
      3. Creation of end-to-end workflows with ETL pipelines
      Generic abstraction of every model: each model can be controlled and deployed as a prediction microservice.
      ➔ REST APIs
      ➔ Generic applications (e.g. Spark)
      ➔ Portable code generation
      “Prediction Store”, “Model Factory”
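Slide 21 mentions monitoring dataset shift and automating model refresh without naming a method. As one possible illustration, the sketch below uses the population stability index (PSI), a common drift statistic that is swapped in here as an assumption; the slides do not say which measure is used. The function name `psi`, the bin count, the sample data and the 0.25 refresh cut-off are likewise illustrative choices.

```python
import math

def psi(expected, actual, n_bins=10):
    # Population stability index between a reference sample and a new sample.
    # Bins are equal-width over the range of the reference ("expected") sample.
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1  # clip out-of-range values
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Reference feature values seen at training time, and two batches seen in production.
training = [i / 100 for i in range(100)]
same_batch = list(training)
shifted_batch = [v + 0.3 for v in training]

REFRESH_THRESHOLD = 0.25  # illustrative cut-off, to be tuned per use case

stable_psi = psi(training, same_batch)
shifted_psi = psi(training, shifted_batch)
needs_refresh = shifted_psi > REFRESH_THRESHOLD
print(stable_psi, round(shifted_psi, 3), needs_refresh)
```

A check like this could run on every scoring batch, with `needs_refresh` triggering the automated retraining step the slide calls model refresh.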
  • 22. AutoML helping with the hype
      ➔ Fail fast to increase opportunities and focus on the projects with measurable impact
      ➔ Enable more business users to use ML to identify the right use cases
  • 23. Model socialisation
      Faster iteration cycle = More regular feedback from the stakeholders = More interventions from the business = Better stickiness and buy-in
      This approach fits perfectly into an “Agile Data Science” methodology.
  • 24. The Data Science experience problem
      Automation: 100s of models deployed
      Without automation: 10 models deployed
  • 25. Will AutoML take my job? NO!
      ● Machine Learning is task-oriented. It needs an individual to craft the task, understand the data, the domain...
      ● Automation reduces the time spent on “grunt work”
      ● Paradigm shift: closer to the business, closer to the actual value.
  • 26. Let’s embrace automation!