SlideShare a Scribd company logo
1 of 8
Download to read offline
MK99 – Big Data 
1 
Big data & cross-platform analytics 
MOOC lectures Pr. Clement Levallois
MK99 – Big Data 
2 
A short note on machine learning for business
MK99 – Big Data 
3 
Machine Learning 
• Family of techniques to formulate predictions, based on data 
•Why is it called Machine learning? 
–Machine: it is about algorithms running on computers, not equations solved with pen and paper 
–Learning: the algorithms start with zero accuracy. Then, they get more accurate while being fed with data: the algorithm refines its parameters, it “learns”.
MK99 – Big Data 
4 
Typical set up 
1.We start with a training set 
Data already collected: we know the actual values to be found 
Ex: a list of consumers, their characteristics and their associated credit score 
2.The algorithms are trained on this set 
-> A series of algorithms run on the training set. Their parameters get adjusted so that the actual values get progressively predicted the most accurately possible. 
3.A test set (“fresh data”) is brought 
-> List of consumer characteristics. Their credit score is known but hidden. 
4.Running the trained algo on the test set 
-> Predict the credit score for each consumer in the test set, using the algorithms that were trained on phase 1 
5.A measure of accuracy 
- Given the correct values to be predicted in the test set, how accurate were the algorithms? 
-> Where the credit scores accurately predicted? 
Actual values
MK99 – Big Data 
5 
Vocabulary 
•Data scientists “train” their model and then test it 
•They are concerned by “out-of-sample” prediction 
–The fact that their model predicts accurately data points in the training set (the “sample”) is trivial 
–This is the accuracy on the test set that matters! 
–This is called an “out-of-sample” prediction
MK99 – Big Data 
6 
Why is machine learning (ML) so different from statistics? 
•ML does not focus on causality – just prediction! 
–Note: for this reason, ML cannot predict the effect of intervention - it has no causal model. 
•ML has a special concern for out-of-sample prediction 
–Will be especially careful about over-fitting 
•ML picks its algorithms from diff academic disciplines 
–Text, network relations, clustering, not just traditional statistics 
•Coming from comput. sciences, ML has affinities with big data 
–Procedures optimized for speed and scale 
But the best data scientists often started as statisticians / econometricians: 
See Hal Varian: Chief Economist at Google
MK99 – Big Data 
7 
•Kaggle is a website hosting ML competitions, anybody can join 
•Goal: make the best prediction on a dataset, with cash prizes 
•From predicting clicks on ads to epileptic seizures 
•Always the same setup: a training set, a test set, a scoring based on accuracy.
MK99 – Big Data 
8 
This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com) 
Contact Clement Levallois (levallois [at] em-lyon.com) for more information.

More Related Content

What's hot

Explainable AI in Industry (WWW 2020 Tutorial)
Explainable AI in Industry (WWW 2020 Tutorial)Explainable AI in Industry (WWW 2020 Tutorial)
Explainable AI in Industry (WWW 2020 Tutorial)
Krishnaram Kenthapadi
 
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Simplilearn
 

What's hot (20)

Who is a Data Scientist? | How to become a Data Scientist? | Data Science Cou...
Who is a Data Scientist? | How to become a Data Scientist? | Data Science Cou...Who is a Data Scientist? | How to become a Data Scientist? | Data Science Cou...
Who is a Data Scientist? | How to become a Data Scientist? | Data Science Cou...
 
Introduction to Data Science.pptx
Introduction to Data Science.pptxIntroduction to Data Science.pptx
Introduction to Data Science.pptx
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
 
10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems10 Lessons Learned from Building Machine Learning Systems
10 Lessons Learned from Building Machine Learning Systems
 
Explainability for NLP
Explainability for NLPExplainability for NLP
Explainability for NLP
 
Introduction to Machine learning
Introduction to Machine learningIntroduction to Machine learning
Introduction to Machine learning
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data Science
 
Explainable AI in Industry (WWW 2020 Tutorial)
Explainable AI in Industry (WWW 2020 Tutorial)Explainable AI in Industry (WWW 2020 Tutorial)
Explainable AI in Industry (WWW 2020 Tutorial)
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
The data quality challenge
The data quality challengeThe data quality challenge
The data quality challenge
 
Explainable AI
Explainable AIExplainable AI
Explainable AI
 
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
 
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
 
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
 
AI and Data Science.pdf
AI and Data Science.pdfAI and Data Science.pdf
AI and Data Science.pdf
 
Responsible AI
Responsible AIResponsible AI
Responsible AI
 
Data analytics
Data analyticsData analytics
Data analytics
 
Machine learning
Machine learningMachine learning
Machine learning
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
Analytics Maturity Model
Analytics Maturity ModelAnalytics Maturity Model
Analytics Maturity Model
 

Viewers also liked

Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Lior Rokach
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
butest
 

Viewers also liked (9)

Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Transparent Machine Learning for Information Extraction: State-of-the-Art and...
Transparent Machine Learning for Information Extraction: State-of-the-Art and...Transparent Machine Learning for Information Extraction: State-of-the-Art and...
Transparent Machine Learning for Information Extraction: State-of-the-Art and...
 
Machine learning
Machine learningMachine learning
Machine learning
 
L1. State of the Art in Machine Learning
L1. State of the Art in Machine LearningL1. State of the Art in Machine Learning
L1. State of the Art in Machine Learning
 
Machine learning ~ Forecasting
Machine learning ~ ForecastingMachine learning ~ Forecasting
Machine learning ~ Forecasting
 
An introduction to Machine Learning (and a little bit of Deep Learning)
An introduction to Machine Learning (and a little bit of Deep Learning)An introduction to Machine Learning (and a little bit of Deep Learning)
An introduction to Machine Learning (and a little bit of Deep Learning)
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
 
Transform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningTransform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine Learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 

Similar to An explanation of machine learning for business

Similar to An explanation of machine learning for business (20)

Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)Introduction to ML (Machine Learning)
Introduction to ML (Machine Learning)
 
ML_Module_1.pdf
ML_Module_1.pdfML_Module_1.pdf
ML_Module_1.pdf
 
Machine learning
Machine learningMachine learning
Machine learning
 
Pricing like a data scientist
Pricing like a data scientistPricing like a data scientist
Pricing like a data scientist
 
Machine Learning and Analytics in Splunk
Machine Learning and Analytics in SplunkMachine Learning and Analytics in Splunk
Machine Learning and Analytics in Splunk
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
Course 2 Machine Learning Data LifeCycle in Production - Week 1
Course 2   Machine Learning Data LifeCycle in Production - Week 1Course 2   Machine Learning Data LifeCycle in Production - Week 1
Course 2 Machine Learning Data LifeCycle in Production - Week 1
 
Choosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your needChoosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your need
 
Machine Learning_Unit 2_Full.ppt.pdf
Machine Learning_Unit 2_Full.ppt.pdfMachine Learning_Unit 2_Full.ppt.pdf
Machine Learning_Unit 2_Full.ppt.pdf
 
Machine Learning with Big Data using Apache Spark
Machine Learning with Big Data using Apache SparkMachine Learning with Big Data using Apache Spark
Machine Learning with Big Data using Apache Spark
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10
 
AI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptxAI-900 - Fundamental Principles of ML.pptx
AI-900 - Fundamental Principles of ML.pptx
 
Machine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout SessionMachine Learning and Analytics Breakout Session
Machine Learning and Analytics Breakout Session
 
Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles Baker
 
Machine learning and big data
Machine learning and big dataMachine learning and big data
Machine learning and big data
 
Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack
 
artificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdfartificggggggggggggggialintelligence.pdf
artificggggggggggggggialintelligence.pdf
 
ML PPT-1.pptx
ML PPT-1.pptxML PPT-1.pptx
ML PPT-1.pptx
 
Supervised learning techniques and applications
Supervised learning techniques and applicationsSupervised learning techniques and applications
Supervised learning techniques and applications
 
Unit 1-ML (1) (1).pptx
Unit 1-ML (1) (1).pptxUnit 1-ML (1) (1).pptx
Unit 1-ML (1) (1).pptx
 

More from Clement Levallois

More from Clement Levallois (13)

Part 2: covid-19 on Twitter, with a focus on 3 new seed accounts
Part 2: covid-19 on Twitter, with a focus on 3 new seed accountsPart 2: covid-19 on Twitter, with a focus on 3 new seed accounts
Part 2: covid-19 on Twitter, with a focus on 3 new seed accounts
 
Education et intelligence artificielle
Education et intelligence artificielleEducation et intelligence artificielle
Education et intelligence artificielle
 
3 familles d'intelligence artificielle et leurs applications business
3 familles d'intelligence artificielle et leurs applications business3 familles d'intelligence artificielle et leurs applications business
3 familles d'intelligence artificielle et leurs applications business
 
Présentation FrenchWeb: Qu'est-ce que la visualisation des données?
Présentation FrenchWeb: Qu'est-ce que la visualisation des données?Présentation FrenchWeb: Qu'est-ce que la visualisation des données?
Présentation FrenchWeb: Qu'est-ce que la visualisation des données?
 
Presentation of programming languages for beginners
Presentation of programming languages for beginnersPresentation of programming languages for beginners
Presentation of programming languages for beginners
 
Umigon: crowdsourcing in the classroom
Umigon: crowdsourcing in the classroomUmigon: crowdsourcing in the classroom
Umigon: crowdsourcing in the classroom
 
Data visualization: enjeux pour le business
Data visualization: enjeux pour le businessData visualization: enjeux pour le business
Data visualization: enjeux pour le business
 
Twitter for beginners
Twitter for beginnersTwitter for beginners
Twitter for beginners
 
Data and personalization
Data and personalizationData and personalization
Data and personalization
 
A Primer on Text Mining for Business
A Primer on Text Mining for BusinessA Primer on Text Mining for Business
A Primer on Text Mining for Business
 
The business stakes of data integration
The business stakes of data integrationThe business stakes of data integration
The business stakes of data integration
 
What is big data?
What is big data?What is big data?
What is big data?
 
What is "data"?
What is "data"?What is "data"?
What is "data"?
 

Recently uploaded

Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan CytotecJual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
ZurliaSoop
 
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al MizharAl Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
allensay1
 
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai KuwaitThe Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
daisycvs
 
Mifepristone Available in Muscat +918761049707^^ €€ Buy Abortion Pills in Oman
Mifepristone Available in Muscat +918761049707^^ €€ Buy Abortion Pills in OmanMifepristone Available in Muscat +918761049707^^ €€ Buy Abortion Pills in Oman
Mifepristone Available in Muscat +918761049707^^ €€ Buy Abortion Pills in Oman
instagramfab782445
 

Recently uploaded (20)

Marel Q1 2024 Investor Presentation from May 8, 2024
Marel Q1 2024 Investor Presentation from May 8, 2024Marel Q1 2024 Investor Presentation from May 8, 2024
Marel Q1 2024 Investor Presentation from May 8, 2024
 
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60% in 6 Months
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60%  in 6 MonthsSEO Case Study: How I Increased SEO Traffic & Ranking by 50-60%  in 6 Months
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60% in 6 Months
 
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAIGetting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
 
Escorts in Nungambakkam Phone 8250092165 Enjoy 24/7 Escort Service Enjoy Your...
Escorts in Nungambakkam Phone 8250092165 Enjoy 24/7 Escort Service Enjoy Your...Escorts in Nungambakkam Phone 8250092165 Enjoy 24/7 Escort Service Enjoy Your...
Escorts in Nungambakkam Phone 8250092165 Enjoy 24/7 Escort Service Enjoy Your...
 
Pre Engineered Building Manufacturers Hyderabad.pptx
Pre Engineered  Building Manufacturers Hyderabad.pptxPre Engineered  Building Manufacturers Hyderabad.pptx
Pre Engineered Building Manufacturers Hyderabad.pptx
 
Falcon's Invoice Discounting: Your Path to Prosperity
Falcon's Invoice Discounting: Your Path to ProsperityFalcon's Invoice Discounting: Your Path to Prosperity
Falcon's Invoice Discounting: Your Path to Prosperity
 
Falcon Invoice Discounting: Tailored Financial Wings
Falcon Invoice Discounting: Tailored Financial WingsFalcon Invoice Discounting: Tailored Financial Wings
Falcon Invoice Discounting: Tailored Financial Wings
 
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan CytotecJual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
 
Unveiling Falcon Invoice Discounting: Leading the Way as India's Premier Bill...
Unveiling Falcon Invoice Discounting: Leading the Way as India's Premier Bill...Unveiling Falcon Invoice Discounting: Leading the Way as India's Premier Bill...
Unveiling Falcon Invoice Discounting: Leading the Way as India's Premier Bill...
 
Call 7737669865 Vadodara Call Girls Service at your Door Step Available All Time
Call 7737669865 Vadodara Call Girls Service at your Door Step Available All TimeCall 7737669865 Vadodara Call Girls Service at your Door Step Available All Time
Call 7737669865 Vadodara Call Girls Service at your Door Step Available All Time
 
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al MizharAl Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
 
How to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League CityHow to Get Started in Social Media for Art League City
How to Get Started in Social Media for Art League City
 
Lucknow Housewife Escorts by Sexy Bhabhi Service 8250092165
Lucknow Housewife Escorts  by Sexy Bhabhi Service 8250092165Lucknow Housewife Escorts  by Sexy Bhabhi Service 8250092165
Lucknow Housewife Escorts by Sexy Bhabhi Service 8250092165
 
Over the Top (OTT) Market Size & Growth Outlook 2024-2030
Over the Top (OTT) Market Size & Growth Outlook 2024-2030Over the Top (OTT) Market Size & Growth Outlook 2024-2030
Over the Top (OTT) Market Size & Growth Outlook 2024-2030
 
Power point presentation on enterprise performance management
Power point presentation on enterprise performance managementPower point presentation on enterprise performance management
Power point presentation on enterprise performance management
 
Dr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdfDr. Admir Softic_ presentation_Green Club_ENG.pdf
Dr. Admir Softic_ presentation_Green Club_ENG.pdf
 
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai KuwaitThe Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
The Abortion pills for sale in Qatar@Doha [+27737758557] []Deira Dubai Kuwait
 
joint cost.pptx COST ACCOUNTING Sixteenth Edition ...
joint cost.pptx  COST ACCOUNTING  Sixteenth Edition                          ...joint cost.pptx  COST ACCOUNTING  Sixteenth Edition                          ...
joint cost.pptx COST ACCOUNTING Sixteenth Edition ...
 
Famous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st CenturyFamous Olympic Siblings from the 21st Century
Famous Olympic Siblings from the 21st Century
 
Mifepristone Available in Muscat +918761049707^^ €€ Buy Abortion Pills in Oman
Mifepristone Available in Muscat +918761049707^^ €€ Buy Abortion Pills in OmanMifepristone Available in Muscat +918761049707^^ €€ Buy Abortion Pills in Oman
Mifepristone Available in Muscat +918761049707^^ €€ Buy Abortion Pills in Oman
 

An explanation of machine learning for business

  • 1. MK99 – Big Data 1 Big data & cross-platform analytics MOOC lectures Pr. Clement Levallois
  • 2. MK99 – Big Data 2 A short note on machine learning for business
  • 3. MK99 – Big Data 3 Machine Learning • Family of techniques to formulate predictions, based on data •Why is it called Machine learning? –Machine: it is about algorithms running on computers, not equations solved with pen and paper –Learning: the algorithms start with zero accuracy. Then, they get more accurate while being fed with data: the algorithm refines its parameters, it “learns”.
  • 4. MK99 – Big Data 4 Typical set up 1.We start with a training set Data already collected: we know the actual values to be found Ex: a list of consumers, their characteristics and their associated credit score 2.The algorithms are trained on this set -> A series of algorithms run on the training set. Their parameters get adjusted so that the actual values get progressively predicted the most accurately possible. 3.A test set (“fresh data”) is brought -> List of consumer characteristics. Their credit score is known but hidden. 4.Running the trained algo on the test set -> Predict the credit score for each consumer in the test set, using the algorithms that were trained on phase 1 5.A measure of accuracy - Given the correct values to be predicted in the test set, how accurate were the algorithms? -> Where the credit scores accurately predicted? Actual values
  • 5. MK99 – Big Data 5 Vocabulary •Data scientists “train” their model and then test it •They are concerned by “out-of-sample” prediction –The fact that their model predicts accurately data points in the training set (the “sample”) is trivial –This is the accuracy on the test set that matters! –This is called an “out-of-sample” prediction
  • 6. MK99 – Big Data 6 Why is machine learning (ML) so different from statistics? •ML does not focus on causality – just prediction! –Note: for this reason, ML cannot predict the effect of intervention - it has no causal model. •ML has a special concern for out-of-sample prediction –Will be especially careful about over-fitting •ML picks its algorithms from diff academic disciplines –Text, network relations, clustering, not just traditional statistics •Coming from comput. sciences, ML has affinities with big data –Procedures optimized for speed and scale But the best data scientists often started as statisticians / econometricians: See Hal Varian: Chief Economist at Google
  • 7. MK99 – Big Data 7 •Kaggle is a website hosting ML competitions, anybody can join •Goal: make the best prediction on a dataset, with cash prizes •From predicting clicks on ads to epileptic seizures •Always the same setup: a training set, a test set, a scoring based on accuracy.
  • 8. MK99 – Big Data 8 This slide presentation is part of a course offered by EMLYON Business School (www.em-lyon.com) Contact Clement Levallois (levallois [at] em-lyon.com) for more information.