Machine Learning 101
Fred Verheul
Machine Learning
"Field of study that gives computers the ability to learn without being explicitly programmed" (Arthur Samuel, 1959)
What is Machine Learning?
Traditional Programming: Data + Program -> Computer -> Output
Machine Learning: Data + Output -> Computer -> Program
Prediction is hard…
Sweet spot for Machine Learning
• It’s impossible to write down the rules in code:
  • Too many rules
  • Too many factors influencing the rules
  • Too finely tuned
  • We just don’t know the rules (image recognition)
• Lots of labeled data (examples) available (e.g. historical data)
Basic Machine Learning ‘workflow’
Training Phase: Training data -> Feature Vectors + Labels -> Machine Learning Algorithm -> Predictive Model
Operational Phase: New data -> Feature Vectors -> Predictive Model -> Prediction
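The two phases can be sketched in a few lines of Python. This is only an illustration, not the method used in the talk: the nearest-centroid classifier, the function names `train` and `predict`, and the height/weight data are all made up for the example.

```python
from statistics import mean


def train(feature_vectors, labels):
    """Training phase: learn one centroid (mean vector) per label."""
    grouped = {}
    for vector, label in zip(feature_vectors, labels):
        grouped.setdefault(label, []).append(vector)
    # The "predictive model" here is simply a dict: label -> centroid.
    return {
        label: [mean(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(model, feature_vector):
    """Operational phase: return the label of the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(feature_vector, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Made-up training data: [height_cm, weight_kg] with hypothetical labels.
training_vectors = [[150, 50], [160, 55], [180, 85], [190, 95]]
training_labels = ["small", "small", "large", "large"]

model = train(training_vectors, training_labels)   # training phase
prediction = predict(model, [185, 90])              # operational phase
print(prediction)
```

The model is built once from labeled examples and then reused for every new feature vector, which is the split between the two phases shown on the slide.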
Training Phase in more detail
Raw data -> Data preparation -> Feature Vectors -> Training Data + Test data -> Model Building (by ML algorithm), aka ‘learning’ -> Model Evaluation -> Predictive Model
Data preparation: data cleansing, data transformation, normalization, feature extraction
Feedback loop: from Model Evaluation back to the earlier steps
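A rough sketch of these steps in plain Python follows. Everything in it is an assumption made for illustration: the data is invented, min-max scaling stands in for "normalization", and a 1-nearest-neighbour rule stands in for the ML algorithm.

```python
import random


def min_max_normalize(rows):
    """Data preparation: rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split(examples, test_fraction, seed):
    """Shuffle the examples and hold back a fraction as test data."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train_data, x):
    """'Model': return the label of the closest training example."""
    nearest = min(
        train_data,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)),
    )
    return nearest[1]


def accuracy(train_data, test_data):
    """Model evaluation: fraction of test examples predicted correctly."""
    hits = sum(predict_1nn(train_data, x) == y for x, y in test_data)
    return hits / len(test_data)


# Made-up raw data: two numeric features per row and a 0/1 label.
raw_features = [[1, 200], [2, 180], [3, 240], [2, 210],
                [8, 900], [9, 950], [10, 870], [9, 910]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

features = min_max_normalize(raw_features)
train_data, test_data = split(list(zip(features, labels)),
                              test_fraction=0.25, seed=0)
test_accuracy = accuracy(train_data, test_data)
print(len(train_data), len(test_data), test_accuracy)
```

The sketch follows the slide order and normalizes before splitting. In practice the scaling parameters are often computed on the training data only, so that no information from the test data leaks into the model.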
Examples of ML tasks
Supervised learning
• Regression: target is numeric
• Classification: target is categorical
Unsupervised learning
• Clustering
• Dimensionality reduction
Modeling: so many algorithms…
ML Algorithms: by Representation
Representation: collection of candidate models/programs, aka hypothesis space
• Decision trees
• Instance-based
• Neural networks
• Model ensembles
ML Algorithms: by Evaluation
Evaluation: Quality measure for a model
Regression
Example metric: Root Mean Squared Error
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}, where y_i is the true value and \hat{y}_i the predicted value
Binary classification: confusion matrix
Example: medical test for a disease

                 Predicted class
True class       Positive                   Negative
P                True positives (TP): 8     False negatives (FN): 2
N                False positives (FP): 19   True negatives (TN): 971

Accuracy: (8 + 971) / 1000 = 97.9%
Better evaluation metrics:
• Precision: TP / (TP + FP) = 8 / (8 + 19) ≈ 29.6%
• Recall: TP / (TP + FN) = 8 / (8 + 2) = 80%
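The metrics above are simple enough to check by hand or in code. The sketch below uses the counts that the slide's own formulas imply (TP = 8, FP = 19, FN = 2, TN = 971). The `rmse` helper and its three sample values are made up for illustration.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Counts implied by the medical-test example on the slide.
tp, fn, fp, tn = 8, 2, 19, 971

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)   # of all positive predictions, how many are right
recall = tp / (tp + fn)      # of all actual positives, how many are found

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")

# Made-up regression example: true values vs. predictions.
example_rmse = rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(f"rmse={example_rmse:.3f}")
```

Accuracy looks excellent here only because the disease is rare: most of the 979 correct answers are true negatives, while fewer than a third of the positive predictions are correct.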
ML Algorithms: by Optimization
Optimization: how the algorithm ‘learns’; depends on representation and evaluation
• Greedy Search (an example of combinatorial optimization)
• Gradient Descent (or in general: Convex Optimization)
• Linear Programming (or in general: Constrained/Nonlinear Optimization)
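As one concrete case, gradient descent can fit a straight line by repeatedly stepping against the gradient of the mean squared error. This is a minimal sketch under assumed settings: the function name `fit_line`, the learning rate, the step count and the data are all chosen for the example, not taken from the talk.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors with the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up data generated from y = 2x + 1, without noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

Here the representation is "all straight lines", the evaluation is mean squared error, and the optimization is gradient descent, matching the three components named in these slides.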
Training error vs test error
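The gap between the two errors can be shown with a model that memorizes its training data. The sketch below is an illustration only: the 1-nearest-neighbour rule and the one-dimensional data, including one deliberately noisy training label, are made up.

```python
def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    return min(train, key=lambda example: abs(example[0] - x))[1]


def error_rate(train, data):
    """Fraction of examples in `data` that the 1-NN rule gets wrong."""
    wrong = sum(predict_1nn(train, x) != y for x, y in data)
    return wrong / len(data)


# Made-up data. Assumed true rule: label is 1 when x >= 5.
# The training point (4, 1) carries a noisy (wrong) label.
train = [(1, 0), (2, 0), (3, 0), (4, 1), (6, 1), (7, 1), (8, 1)]
test = [(1.5, 0), (3.8, 0), (4.4, 0), (5.5, 1), (7.5, 1)]

training_error = error_rate(train, train)
test_error = error_rate(train, test)
print(training_error, test_error)
```

Each training point is its own nearest neighbour, so the training error is zero, yet the memorized noisy label causes mistakes on nearby test points. A low training error alone therefore says little about how well a model generalizes.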
Data Science for Business
• Focuses more on general principles than specific algorithms
• Not math-heavy, but does contain some math
• O’Reilly link: http://shop.oreilly.com/product/0636920028918.do
• Book website: http://data-science-for-biz.com/DSB/Home.html
What has NOT been covered (1)
• Deep learning / Neural Networks
  • Covered in other presentations at DKOM
  • Also recommended for further reading (deep dive):
    • http://neuralnetworksanddeeplearning.com/index.html
• Specifics of ML algorithms
  • All over the internet… e.g. at http://machinelearningmastery.com/
What has NOT been covered (2)
• Libraries (examples):
  • TensorFlow, Caffe, Theano, Keras
  • SciPy & scikit-learn
  • Spark MLlib (Scala/Java/Python)
• Programming languages:
What has NOT been covered (3)
• SAP products:
  • SAP HANA, SAP HANA Vora, SAP BO Predictive Analytics(!), HCP Predictive Services
  • New machine learning platform
• Hardware
  • Nvidia talk about GPUs
What has NOT been covered (4)
• Ethics and algorithmic transparency:
What has NOT been covered (5)
• The Data Science & Data Mining Process:
What has NOT been covered (6)
• How to integrate ML into your business application
  • I hope SAP is figuring that out as we speak ;-)
  • Have a look at SAP Predictive Analytics Integrator
    • https://help.sap.com/pai
Take-aways
• Goal of ML: generalize from training data (not optimization!!)
• No magic! Just some clever algorithms…
• Increasingly important non-technical aspects:
  • Ethics
  • Algorithmic transparency
Thank You
www.soapeople.com
info@soapeople.com
@SOAPEOPLE
Fred Verheul
Big Data Consultant
+31 6 3919 2986
fred.verheul@soapeople.com

Machine learning 101 dkom 2017

Editor's notes

  1. This diagram is attributed to Pedro Domingos who used it in his Coursera Machine Learning course in 2012.
  2. Source: http://timoelliott.com/blog/2007/11/thanksgiving_predictive_analyt.html
  3. Sources: Regression - http://gerardnico.com/wiki/data_mining/linear_regression Classification - ?? Clustering - https://en.wikipedia.org/wiki/Cluster_analysis Dimensionality reduction: http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization
  4. Source: http://machinelearningmastery.com/
  5. Sources: Decision Tree - https://en.wikipedia.org/wiki/Decision_tree_learning Instance-based - https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm Neural Networks - https://en.wikipedia.org/wiki/Artificial_neural_network Ensembles - https://www.analyticsvidhya.com/blog/2015/09/questions-ensemble-modeling/
  6. Sources: Greedy Search - https://en.wikipedia.org/wiki/Greedy_algorithm Gradient Descent - ?? Linear Programming - http://courses.wccnet.edu/~palay/math181/linearprogramming.htm
  7. Source: https://onlinecourses.science.psu.edu/stat857/node/160