Based on Gartner's research, 85% of AI projects fail. In this talk, we show the most common mistakes made by managers, developers, and data scientists while building AI products. We go through ten case studies of products that failed and analyze the reasons for each failure. We also present how to avoid such mistakes and deliver a successful AI product by introducing a few lifecycle changes.
2. Dr. Karol Przystalski
Overview
2015 - obtained a Ph.D. in Computer Science @ Jagiellonian University
2010 until now - CTO @ Codete
2007 - 2009 - Software Engineer @ IBM
Contact
karol@codete.com
Recent research papers
Multispectral skin patterns analysis using fractal methods
K. Przystalski and M. J. Ogorzalek. Expert Systems with Applications, 2017
https://www.sciencedirect.com/science/article/pii/S0957417417304803
6. Buzzword-driven machine learning projects - some stats
40% of “AI startups” in Europe don’t actually use AI.
Source: The State of AI 2019: Divergence
7. Buzzword-driven machine learning projects - some stats
Startups labelled as being in AI attract 15% to 50% more funding
than other technology firms.
Source: The State of AI 2019: Divergence
8. Buzzword-driven machine learning projects - some stats
Based on recent Gartner research, 85% of AI projects fail.
Source: Gartner, 2019
9. Introduction to Machine Learning - use cases
Machine learning (or AI) has many process-automation use cases in fields such as:
● security,
● medical diagnosis,
● customer service,
● financial analytics, e.g. risk management, insurance prediction,
● blockchain,
● self-driving cars,
● test automation,
● and many more.
10. Introduction to Machine Learning - AI vs. ML
Diagram (nested): Artificial Intelligence > Machine Learning > Deep Learning
11. Introduction to Machine Learning - Security
Traditional Programming: Rules + Data -> Answers
Machine Learning: Data + Answers -> Rules
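The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the transaction amounts, the hand-picked threshold, and the function names `is_suspicious_by_rule` and `learn_threshold` are hypothetical and do not come from the talk. The first function is a hand-written rule; the second derives a rule (a simple threshold) from data and answers.

```python
from typing import List


def is_suspicious_by_rule(amount: float) -> bool:
    """Traditional programming: a person writes the rule by hand."""
    return amount >= 1000.0  # hypothetical, hand-picked threshold


def learn_threshold(amounts: List[float], answers: List[bool]) -> float:
    """Machine learning: derive the rule (a threshold) from data and answers."""
    best_threshold, best_correct = amounts[0], -1
    for candidate in sorted(amounts):
        # Count how many labelled examples the rule "amount >= candidate" gets right.
        correct = sum((a >= candidate) == y for a, y in zip(amounts, answers))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Hypothetical labelled data: transaction amounts and whether each was suspicious.
amounts = [120.0, 300.0, 450.0, 900.0, 1500.0, 2200.0]
answers = [False, False, False, True, True, True]

learned_threshold = learn_threshold(amounts, answers)
print(f"hand-written rule flags 950.0: {is_suspicious_by_rule(950.0)}")
print(f"learned rule: amount >= {learned_threshold}")
```

With this toy data the learned threshold differs from the hand-written one, because it comes from the labelled examples rather than from the programmer.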
12. Introduction to Machine Learning - a basic taxonomy
1. Is this A or B? Classification
2. Is this weird? Anomaly detection
3. How much / how many? Regression
4. How is this organized? Clustering
5. What should I do next? Reinforcement learning
13. Introduction to Machine Learning - AGI vs. ANI
Artificial Intelligence
Artificial General Intelligence
Artificial Narrow Intelligence
14. Buzzword-driven machine learning projects - AI vs. ML
If it's written in PowerPoint, it is definitely Artificial Intelligence. However, if it's written in Python/R/Scala, it is probably Machine Learning. ML is just one of the attempts to achieve AI, the best we currently have, but surely not good enough to reach it at any point.
Many forms of Government have been tried, and will be tried in this world of sin and woe. No one pretends that democracy is perfect or all-wise. Indeed it has been said that democracy is the worst form of Government except for all those other forms that have been tried from time to time.
33. Buzzword-driven machine learning projects - Buzzwords
Machine learning became a buzzword a few years ago. Like deep learning, blockchain, or
data science, it is often used by startups to signal an innovative approach.
There are many projects and challenges where machine learning shouldn't be the solution,
or at least shouldn't be the first choice.
36. Common mistakes and failures - Fail #1 Don’t follow the hype
Many companies apply machine learning where it shouldn’t be used. There are usually
many ways to solve a challenge. In most cases machine learning should be considered
the last option, because simpler solutions are available.
Avoid: The hype.
Lessons learned: Follow the fail-fast approach.
37. Common mistakes and failures - A story about a girl and a wolf
Actors:
● Sweet little girl
● Her grandmother
● Big bad wolf
Equipment:
● Little red cap
● Basket
Goal:
● Deliver a piece of cake and a bottle of wine
Image: Designed by vectorpocket / Freepik
38. Common mistakes and failures - A story about a girl and a wolf
How to distinguish between a grandma and a wolf?
39. Common mistakes and failures - Fail #2 Overfitting is evil!
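There is a well-known example of a machine learning system designed to classify
images of wolves and huskies.
Accuracy: ~90%
In the commonly cited version of this example, the model had learned to detect snow
in the image background rather than the animal itself, so the high accuracy was misleading.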
40. Common mistakes and failures - Fail #3 AI company without AI strategy
Using machine learning methods doesn’t make your company
an AI company.
There are many challenges that need to be addressed to become an AI
company. To point out just a few:
● data acquisition strategy,
● unified data storage,
● pervasive automation,
● setting up a new data roles structure.
41. Common mistakes and failures - Fail #4 Build a valuable solution, not a nice one
Always think about the business value of the solution you want to
develop. There are plenty of solutions and ideas that do not solve any
challenge and/or do not add any value.
42. Common mistakes and failures - Fail #5 Use proper process
Remember that ML/DS projects are research projects. This makes the process
a bit different from a typical software development process. Start with a
PoC instead of a production version. Perform due diligence before stepping
into an ML/DS project.
43. Common mistakes and failures - Fail #6 Use proper test data
One of the most common issues in training machine learning models is overfitting.
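A minimal sketch of why separate test data matters, assuming a deliberately bad model that simply memorizes its training samples. The data, the `Memorizer` class, and the split rule are all hypothetical and chosen only to make the effect visible; they are not taken from the talk. The model scores perfectly on the data it has seen, while the held-out samples reveal that it has not learned anything general.

```python
from collections import Counter
from typing import List, Tuple

Sample = Tuple[int, int]  # (feature, label)


def make_pseudo_noise_data(n: int) -> List[Sample]:
    """Hypothetical data whose labels a memorizing model cannot generalize from."""
    return [(i, (i * 37 % 11) % 2) for i in range(n)]


class Memorizer:
    """Stores every training sample; unseen inputs get the majority training label."""

    def fit(self, samples: List[Sample]) -> "Memorizer":
        self.table = dict(samples)
        self.default = Counter(label for _, label in samples).most_common(1)[0][0]
        return self

    def predict(self, feature: int) -> int:
        return self.table.get(feature, self.default)


def accuracy(model: Memorizer, samples: List[Sample]) -> float:
    return sum(model.predict(x) == y for x, y in samples) / len(samples)


data = make_pseudo_noise_data(100)
# Hold out every fifth sample; the model never sees these during fitting.
test = [s for s in data if s[0] % 5 == 0]
train = [s for s in data if s[0] % 5 != 0]

model = Memorizer().fit(train)
train_acc = accuracy(model, train)
test_acc = accuracy(model, test)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Evaluating only on the training data would hide the problem; the gap between the two numbers is the warning sign.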
44. Common mistakes and failures - Fail #7 Build the right team
Finding the right team members isn’t easy, especially for companies without a
data science team. It is even harder because many ML/DS projects are research
projects and should be treated as such, not as software development
projects. There are several solutions:
● find a data scientist with proven project implementations,
● find a company with experience in data science to help in the
recruitment process.
45. Common mistakes and failures - Fail #8 Don’t reinvent the wheel
There are plenty of machine learning methods, and many implementations
are available as open source. Don’t start with deep learning methods unless
you are sure they are a valuable solution, better than most shallow methods. Use
tools like AutoML to find the best possible method.
46. Common mistakes and failures - Fail #9 Proper hardware solution
Online solutions look cheap at first glance, but you need to be ready to
stop the research when it takes too long.
In most cases GPUs are not cheaper than TPUs, even though the GPU itself has
the lower price.
47. Common mistakes and failures - Fail #10 Focus on the security
Retraining is a good part of the machine learning process, but
without additional supporting tools it can end in failure.
48. Common mistakes and failures - Business case #1 A travel company
Goal. Build a bot to sum up information from emails and send it to a database
for further processing.
Proposed solution. An NLP method was proposed at first, but we quickly
figured out that regular expressions were a better solution.
Fail: The company had obtained financing from the government. The application
described an ML implementation, so the company was forced to use an ML
solution.
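To illustrate the kind of regular-expression approach mentioned in this case, here is a minimal sketch. The email layout, the field names, and the `extract_booking` helper are assumptions made up for this example; the actual emails and fields from the project are not described in the talk.

```python
import re
from typing import Dict, Optional

# Hypothetical email layout: the labels and value formats below are assumptions.
FIELD_PATTERNS = {
    "reference": re.compile(r"Booking reference:\s*(?P<value>[A-Z0-9]{6})"),
    "date": re.compile(r"Departure:\s*(?P<value>\d{4}-\d{2}-\d{2})"),
    "destination": re.compile(r"Destination:\s*(?P<value>[^\r\n]+)"),
}


def extract_booking(email_body: str) -> Dict[str, Optional[str]]:
    """Pull the known fields out of an email; missing fields become None."""
    record: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(email_body)
        record[name] = match.group("value").strip() if match else None
    return record


email = """Hello,
your trip is confirmed.
Booking reference: AB12CD
Departure: 2019-11-05
Destination: Krakow
"""

record = extract_booking(email)
print(record)
```

The resulting dictionary can be written to a database as-is. When the emails follow a stable template, a handful of patterns like these is easier to build, test, and debug than a trained NLP model.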
49. Common mistakes and failures - Business case #2 A bank
Goal. Build a security solution based on pattern recognition.
Solution. The idea was to build a PoC first. A good approach, but we were
asked for help in the 8th month of the PoC development. Our solution:
just stop.
Fail: A PoC shouldn’t take more than three weeks; if it takes longer, there is
something wrong. In this case there were a few reasons for the failure, but the
biggest one was the lack of experience of the management and the data
science team.
50. Common mistakes and failures - Business case #3 A chemical corporation
Goal. Automate the process of discovering new nutrients.
Proposed solution. Use machine learning to automate some of the formulas
used to find the nutrients and to reduce production costs.
Fail: The company's data set was not ready for machine learning.
We proposed a data preprocessing workshop to clean the data and extract
the features.
51. Common mistakes and failures - Business case #4 A bank again
Goal. Build a chatbot that will help/extend the customer service.
Proposed solution. The NLP method used by the customer allowed building
rule-based chatbots, but the plan was to build a retrieval-based chatbot.
We helped the client extend the current solution using
well-known tools for retrieval-based chatbots.
Fail: Lack of knowledge in the NLP area.
52. Common mistakes and failures - Business case #5 A car manufacturer
Goal. Build a machine learning solution that will do sales analysis based on
the car details, especially positions (locations).
Fail. The business part of the due diligence was done wrong: after 9
months of development we figured out that the location data cannot be
moved outside of China, while the solution was developed in Germany and
could not be moved to China either.
61. AI transformation - transformation steps
Based on Andrew Ng's recommendations, the AI transformation can be divided
into several steps:
● build several PoC projects,
● build your AI team,
● provide AI training,
● develop an AI strategy,
● build external and internal communication.
62. AI transformation - ML PoC process
It’s hard to fit data science projects into a Scrum model. There are
many problems that need to be solved; one of the most problematic is
dividing the tasks properly. Properly means:
● avoid setting tasks with fixed target values for quality metrics,
● use specific metrics, usually more than one, and avoid using accuracy,
● divide tasks into data acquisition, preprocessing, model strategy, and
quality metrics.
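The advice to use more than one metric and to avoid accuracy can be illustrated with a small sketch. The class balance and the "always predict negative" model are hypothetical, and `classification_metrics` is a helper written for this example, not a library function. On imbalanced data, a model that never detects a positive case still reaches high accuracy, while recall and F1 expose it.

```python
from typing import Dict, List


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced data set: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
# A useless model that always predicts the negative class.
always_negative = [0] * 100

metrics = classification_metrics(y_true, always_negative)
print(metrics)
```

Here accuracy alone suggests a good model, although it misses every positive case; this is why a task definition should name the metrics that matter for the problem.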
Perform due diligence before stepping into an ML/DS project.
This applies to PoC, MVP, and production solutions!
64. AI transformation - due diligence
We can divide due diligence into a technical and a business part. The technical team should
answer the following questions:
● Is it possible to solve this challenge with ML?
● How much and what kind of data is needed?
● When should it be delivered and do we have capacity to deliver it?
During the business part of the due diligence we should answer the following questions:
● Will the solution reduce the costs and/or increase revenues?
● Will we launch a new product or a new business/service?
65. Data Science team structure - roles
At Netflix there are many roles related to data:
● Business Analyst,
● Data Analyst,
● Quantitative Analyst,
● Algorithm Engineer,
● Analytics Engineer,
● Data Engineer,
● Data Scientist,
● Machine Learning Scientist,
● Research Scientist.