Machine Learning
Big picture
Francis Pieraut – Oct 2016
My startup bias about efficiency
Plan
1. My path in ML
2. AI big picture (expert systems to ML)
3. ML trends over time 1980-2008+
4. Type of ML (supervised vs unsupervised)
5. Relationship Data-mining vs ML
6. Training process
7. Regularization technique
8. ML research big picture
9. What is this Deep Learning Revolution?
10. ML in practice -> feature engineering
11. Importance of the cost function
12. Data importance -> NIPS 2009
13. The tagging nightmare
14. ML & Optimization
15. Adversarial Examples
Francis's Evolution in ML
• 1999 – Decision Tree expert (Samy Bengio)
• 2001-2003 – Research with Bengio (huge networks) -> flayers
• 2003 – Idilia -> Importance of a well-tagged dataset, good features & overfitting
• 2005-2006 – Dakis -> KISS (ML not required & importance of
comprehensive knowledge) – Expert System
• 2006-2009 – Data-Mining (Understand first & feature extraction)… MLboost
• 2010-2013 – QMining -> big-data mining
• 2003-2016 – Nuance -> Data Maturity & Data-driven design
Data Maturity model reminder
AI big picture
Type of ML
Parametric Non-Parametric
Reinforcement
ML trends over time 1980-2008+
http://fraka6.blogspot.com/2013/10/deep-learning-history-and-most.html
10 main ML algorithms
• Naïve Bayes Classifier Algorithm
• K Means Clustering Algorithm
• Support Vector Machine Algorithm
• Apriori Algorithm
• Linear Regression
• Logistic Regression
• Artificial Neural Networks (gradient)
• Random Forests
• Decision Trees (info theory)
• K Nearest Neighbors
Machine learning: dangerous hype
Traps and Pitfalls
Data-Mining vs Machine Learning
Training Process
Classification error over time
Training
Regularization technique
• Regularization is a technique used in an attempt to solve the overfitting [1] problem in statistical models.*
• Examples:
– Early stopping
– Decrease constant
– Dropout
– Mini-batch
– Better cost function (e.g. margin vs MSE)
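Early stopping, the first item in the list above, can be sketched in a few lines. This is a minimal illustration, not the author's code: the function name, the `patience` parameter, and the validation curve are all assumptions made up for the example.

```python
def early_stopping(val_errors, patience=3):
    """Return (best_epoch, stop_epoch) for per-epoch validation errors.

    Training stops once the validation error has failed to improve for
    `patience` consecutive epochs; the model from `best_epoch` is kept.
    """
    best_error = float("inf")
    best_epoch = -1
    stop_epoch = -1
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            # New best validation error: remember this epoch.
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training here.
            stop_epoch = epoch
            break
    else:
        # The loop ran out of epochs without triggering the stop rule.
        stop_epoch = len(val_errors) - 1
    return best_epoch, stop_epoch


# Hypothetical validation curve: it improves, then overfitting sets in.
curve = [0.40, 0.31, 0.25, 0.22, 0.23, 0.24, 0.26, 0.30]
best, stop = early_stopping(curve, patience=3)
print(best, stop)
```

The training error would typically keep falling past the best epoch; only the validation error reveals that the extra epochs are fitting noise.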
What is tough about ML
• More parameters = more examples are required
• Tagged data is hard to create compared to untagged data
• There is no magic -> Feature engineering
• Better features -> fewer examples -> less capacity problem
• Getting good example sampling (don’t introduce bias)
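The point about better features can be illustrated with a toy case. The sketch below is an assumption-laden example, not taken from the slides: the six points, their labels, and the helper names are invented. Class 1 surrounds class 0, so a threshold on a raw coordinate cannot separate them, while a threshold on one engineered feature (squared distance from the origin) can.

```python
# Hypothetical 2-D dataset: class 0 near the origin, class 1 around it.
points = [(0.1, 0.2), (-0.3, 0.1), (0.2, -0.2),
          (1.0, 0.9), (-1.1, 0.8), (0.9, -1.2)]
labels = [0, 0, 0, 1, 1, 1]


def radius_squared(p):
    """Engineered feature: squared distance from the origin."""
    return p[0] ** 2 + p[1] ** 2


def threshold_accuracy(values, labels, threshold):
    """Accuracy of the rule 'predict 1 when value > threshold'."""
    preds = [1 if v > threshold else 0 for v in values]
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)


# Same trivial model, two different inputs.
raw_acc = threshold_accuracy([p[0] for p in points], labels, 0.5)
eng_acc = threshold_accuracy([radius_squared(p) for p in points], labels, 0.5)
print(raw_acc, eng_acc)
```

With the engineered feature, a one-parameter model is enough, which is the sense in which better features reduce both the number of examples and the capacity needed.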
Example feature engineering
Example feature engineering
What is this Deeplearning Revolution?
• Deep architectures are more powerful than shallow architectures
• Before 2006 we couldn’t train deep architectures
• Revolution
– Convolutional NN
– Train generative models (auto-encoder) -> learn the data constraints… unsupervised learning… (better parameter initialization)
– STD Training
Example of deep learning in images
ML in practice
• Black box = recipe for a disaster
• 90% feature engineering
• ML = automatic tuning
• Garbage in = Garbage out
• Tagging is a pain… manual work
Importance of the cost function
• Neural network cost functions (back prop)
– MSE & log-softmax
– Example: Netflix & recommendation
• Optimization
– SVM = Maximize Margin
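To make the contrast between these cost functions concrete, here is a minimal sketch that scores the same output vector under MSE, log-softmax (negative log-likelihood), and a multiclass margin loss. The function names and the example scores are assumptions for illustration; the margin loss shown is one common multiclass hinge formulation, not necessarily the one the slide had in mind.

```python
import math


def mse(scores, target_index):
    """Mean squared error between the scores and a one-hot target."""
    return sum((s - (1.0 if i == target_index else 0.0)) ** 2
               for i, s in enumerate(scores)) / len(scores)


def log_softmax_nll(scores, target_index):
    """Negative log-likelihood of the target class under a softmax."""
    m = max(scores)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_norm - scores[target_index]


def multiclass_hinge(scores, target_index, margin=1.0):
    """Margin loss: zero once the target beats every other class by `margin`."""
    return sum(max(0.0, margin + s - scores[target_index])
               for i, s in enumerate(scores) if i != target_index)


# Hypothetical scores for three classes; class 0 is the correct one.
scores = [2.0, 1.0, 0.0]
print(mse(scores, 0), log_softmax_nll(scores, 0), multiclass_hinge(scores, 0))
```

For these scores the margin loss is already zero, the log-softmax loss is still positive, and MSE penalizes the correct class for scoring above 1. The same prediction is "done", "nearly right", or "wrong" depending on the cost function, which is why the choice matters.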
Data Importance NIPS 2009
• Google -> that is enough
– Parameter optimization; tweaking kernels (SVM)
– More parameters than # examples
– Simpler model + more data = what works
The tagging nightmare
• You still need tagged data
• Tagged data is hard to automate + error-prone
• Tagged data is error-prone (garbage in, garbage out)
– Idilia use case
– Nuance use case
The lie about ML
• Machine learning != Optimization
• Machine learning != Statistics
• Machine learning = Optimization problem
with constraints to generalize
(regularization)
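The last bullet can be made concrete with the smallest possible case: a one-weight linear model fitted with and without an L2 penalty. This is a sketch under stated assumptions; the function name, the data, and the choice of an L2 penalty as the regularizer are illustrative, not from the slides.

```python
def fit_ridge_1d(xs, ys, lam):
    """Minimize (1/n) * sum((w*x - y)^2) + lam * w^2 over a single weight w.

    Setting the derivative to zero gives the closed form
    w = sum(x*y) / (sum(x*x) + n*lam).
    """
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + n * lam)


# Hypothetical training data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_plain = fit_ridge_1d(xs, ys, lam=0.0)  # pure optimization of the training error
w_reg = fit_ridge_1d(xs, ys, lam=1.0)    # same objective plus a constraint on w
print(w_plain, w_reg)
```

With `lam=0.0` the problem is plain optimization and the weight fits the training data exactly; with `lam > 0` the weight is pulled toward zero, trading training error for a simpler model that is meant to generalize better.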
Adversarial example - Ian Goodfellow (now at OpenAI)
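One well-known way to construct adversarial examples is the fast gradient sign method (FGSM) described by Goodfellow and co-authors; the slide does not say which method it illustrated, so the sketch below is only an assumption about what such an example looks like. It uses a toy logistic model with made-up weights and input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(w, b, x, y, eps):
    """One fast-gradient-sign step on the input of a logistic model.

    For the cross-entropy loss, d(loss)/dx_i = (p - y) * w_i, so each
    input coordinate moves by eps in the direction that increases the loss.
    """
    p = predict(w, b, x)

    def sign(v):
        return (v > 0) - (v < 0)

    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


w, b = [2.0, -3.0, 1.0], 0.0  # hypothetical trained weights
x, y = [0.5, -0.2, 0.3], 1    # input correctly classified as class 1
x_adv = fgsm(w, b, x, y, eps=0.4)
print(predict(w, b, x), predict(w, b, x_adv))
```

In this three-dimensional toy the perturbation has to be fairly large to flip the prediction. With high-dimensional inputs such as images, many tiny per-pixel changes add up, so a much smaller `eps` can be enough.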
Conclusion
• ML is quite a mature field
• ML != Deep learning
– Deep learning = major breakthrough, hype phase, not mature
• NN = optimization problem with constraints
• SP operates more like expert systems
• An algorithm is only as good as its inputs -> feature engineering
QUESTIONS
francis@qmining.com