www.edureka.co/data-science
Top 5 Algorithms Used in Data Science
Slide 2 www.edureka.co/data-science
What are we going to learn today?
By the end of the session you will understand:
• What Data Science is
• What Data Scientists do
• Top 5 Data Science Algorithms
  • Decision Tree
  • Random Forest
  • Association Rule Mining
  • Linear Regression
  • K-Means Clustering
• Demo of the K-Means Clustering algorithm
Slide 3 www.edureka.co/data-science
Data Science
Slide 4 www.edureka.co/data-science
What is Data Science?
Data science is the practice of extracting meaningful and actionable knowledge from data.
Slide 5 www.edureka.co/data-science
Who are Data Scientists?
Data scientists are people who have a multitude of skills and who love working with data.
Slide 6 www.edureka.co/data-science
Data Science from 1000 feet
[Figure: Data Science and its components: Visualization, Data Engineering, Statistics, Advanced Computing, Domain Expertise]
Slide 7 www.edureka.co/data-science
Arsenal of a Data Scientist
Data Science
• Data Architecture (Tool: Hadoop)
• Machine Learning (Tools: Mahout, Weka, Spark MLlib)
• Analytics (Tools: R, Python)
Note that evaluating different machine learning algorithms is part of a data scientist's daily work, so it is important for a data scientist to have a good grasp of a variety of machine learning algorithms.
Slide 8 www.edureka.co/data-science
Machine Learning
Machine learning is a method of teaching computers to make and improve predictions based on data.
Machine learning is a huge field, with hundreds of different algorithms for solving a myriad of different problems.
Supervised Learning: the categories of the data are already known.
Unsupervised Learning: the learning process attempts to find appropriate categories for the data.
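The difference between the two settings can be shown on a toy data set. The sketch below is only an illustration, not part of the session material: the values, the labels "small" and "large", and the helpers `predict` and `split_at_largest_gap` are all made up for this example. The supervised half copies the label of the closest known example (1-nearest-neighbour); the unsupervised half receives no labels and has to discover two groups on its own.

```python
# Supervised: every training example comes with a known category (label).
# The values and labels are hypothetical.
labelled = [(1.0, "small"), (1.2, "small"), (5.0, "large"), (5.3, "large")]


def predict(x):
    """1-nearest-neighbour: copy the label of the closest training example."""
    _, nearest_label = min(labelled, key=lambda pair: abs(pair[0] - x))
    return nearest_label


# Unsupervised: only the values are given; the categories must be discovered.
unlabelled = sorted(value for value, _ in labelled)


def split_at_largest_gap(values):
    """Split sorted values into two groups at the widest gap between neighbours."""
    gaps = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return values[:cut], values[cut:]


groups = split_at_largest_gap(unlabelled)
print(predict(1.1), predict(4.8))
print(groups)
```

In the first case the categories are supplied with the data; in the second they emerge from the structure of the data itself.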
Slide 9 www.edureka.co/data-science
Decision Tree
Slide 10 www.edureka.co/data-science
Decision Tree Example
[Figure: training data table]
Slide 11 www.edureka.co/data-science
Decision Tree, Root: Student
Step 1
[Figure: the tree starts with Student as the root node]
Slide 12 www.edureka.co/data-science
Decision Tree, Root: Student
Step 2
[Figure: both branches of Student are split on Income; labelled branch: Medium]
Slide 13 www.edureka.co/data-science
Decision Tree, Root: Student
Step 3
[Figure: Student at the root, Income on both branches, two leaves labelled Yes; labelled branch: Medium]
Slide 14 www.edureka.co/data-science
Decision Tree, Root: Student
Step 4
[Figure: Age and CR nodes added below the Income nodes; two leaves labelled Yes; labelled branches: 31…40, Medium]
Slide 15 www.edureka.co/data-science
Decision Tree, Root: Student
Step 5
[Figure: the tree with Age and CR nodes below Income; leaves so far: one No and four Yes; labelled branches: 31…40, Medium]
Slide 16 www.edureka.co/data-science
Decision Tree, Root: Student
Step 6
[Figure: the completed tree, with Student at the root, Income below it, and Age and CR nodes leading to Yes/No leaves; labelled branches include 31…40, > 40, Fair and Medium]
Slide 17 www.edureka.co/data-science
Decision Tree, Root: Student
Classification rules:
1. student(no) ^ income(high) ^ age(<=30) => buys_computer(no)
2. student(no) ^ income(high) ^ age(31…40) => buys_computer(yes)
3. student(no) ^ income(medium) ^ CR(fair) ^ age(>40) => buys_computer(yes)
4. student(no) ^ income(medium) ^ CR(fair) ^ age(<=30) => buys_computer(no)
5. student(no) ^ income(medium) ^ CR(excellent) ^ age(>40) => buys_computer(no)
6. student(no) ^ income(medium) ^ CR(excellent) ^ age(31…40) => buys_computer(yes)
7. student(yes) ^ income(low) ^ CR(fair) => buys_computer(yes)
8. student(yes) ^ income(low) ^ CR(excellent) ^ age(31…40) => buys_computer(yes)
9. student(yes) ^ income(low) ^ CR(excellent) ^ age(>40) => buys_computer(no)
10. student(yes) ^ income(medium) => buys_computer(yes)
11. student(yes) ^ income(high) => buys_computer(yes)
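The eleven rules can be encoded directly as data and applied to a new customer. The sketch below is one possible encoding, not the presenter's code: the `RULES` table, the `classify` helper, the dictionary keys and the string values (such as "31..40" for the middle age band) are names chosen here for illustration. Combinations that no rule covers return `None`.

```python
# Each entry pairs the conditions of one rule with its outcome (buys_computer).
# Keys and value spellings are illustrative assumptions.
RULES = [
    ({"student": "no", "income": "high", "age": "<=30"}, "no"),                          # rule 1
    ({"student": "no", "income": "high", "age": "31..40"}, "yes"),                       # rule 2
    ({"student": "no", "income": "medium", "cr": "fair", "age": ">40"}, "yes"),          # rule 3
    ({"student": "no", "income": "medium", "cr": "fair", "age": "<=30"}, "no"),          # rule 4
    ({"student": "no", "income": "medium", "cr": "excellent", "age": ">40"}, "no"),      # rule 5
    ({"student": "no", "income": "medium", "cr": "excellent", "age": "31..40"}, "yes"),  # rule 6
    ({"student": "yes", "income": "low", "cr": "fair"}, "yes"),                          # rule 7
    ({"student": "yes", "income": "low", "cr": "excellent", "age": "31..40"}, "yes"),    # rule 8
    ({"student": "yes", "income": "low", "cr": "excellent", "age": ">40"}, "no"),        # rule 9
    ({"student": "yes", "income": "medium"}, "yes"),                                     # rule 10
    ({"student": "yes", "income": "high"}, "yes"),                                       # rule 11
]


def classify(record):
    """Return the outcome of the first rule whose conditions all match, else None."""
    for conditions, outcome in RULES:
        if all(record.get(key) == value for key, value in conditions.items()):
            return outcome
    return None


customer = {"student": "yes", "income": "high", "age": "<=30", "cr": "fair"}
print(classify(customer))  # matches rule 11
```

Each rule corresponds to one root-to-leaf path of the tree, so looking up a record in the table is equivalent to walking the tree from Student down to a leaf.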
Slide 18 www.edureka.co/data-science
Random Forest
Slide 19 www.edureka.co/data-science
Random Forest : Example
Suppose you're very indecisive about watching a movie, “Edge of Tomorrow”.
You can do one of the following:
1. Ask your best friend whether you will like the movie.
2. Ask a group of your friends.
Slide 20 www.edureka.co/data-science
Random Forest : Example
In order to answer, your best friend first needs to figure out what movies you like, so you give her a bunch of movies and tell her whether you liked each one or not (i.e., you give her a labelled training set).
Example: Do you like movies starring Emily Blunt?
[Figure: the best friend's decision tree. Its questions are "Is it based on a true incident?", "Does Emily Blunt star in it?" and "Is she the main lead?"; its leaves are "Yes, you will like the movie" and "No, you will not like the movie"]
Slide 21 www.edureka.co/data-science
Random Forest : Example
But your best friend might not always generalize your preferences very well (i.e., she overfits).
To get more accurate recommendations, you'd like to ask a bunch of your friends, e.g. Friend #1, Friend #2 and Friend #3, and have them vote on whether you will like a movie.
The majority of the votes decides the final outcome.
Slide 22 www.edureka.co/data-science
Random Forest : Example
[Figure: Friend 1, Friend 2 and Friend 3 each build their own decision tree from different observations about you (for example "You liked ‘Oblivion’", "You didn't like ‘Far and Away’", "You did not like ‘Top Gun’", "You loved ‘Godzilla’", "You like action movies", "You like Tom Cruise", "You hate Tom Cruise", "You like his pairing with Emily Blunt"), and each tree ends in "Yes, you will like the movie" or "No, you will not like the movie"]
Slide 23 www.edureka.co/data-science
What is Random Forest ?
Random Forest is an ensemble classifier made using many decision tree models.
What are ensemble models?
• Ensemble models combine the results from different models.
• The result from an ensemble model is usually better than the result from any one of the individual models.
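The combining step of an ensemble can be sketched as a simple majority vote, following the movie example. This is not a full random forest (it does not train trees on random samples of the data); it only illustrates how several individual predictions are merged into one. The three `friend_*` functions and the fields of the `movie` dictionary are hypothetical stand-ins for individual decision trees.

```python
from collections import Counter


# Hypothetical "friends": each one is a tiny hand-written classifier.
def friend_1(movie):
    return movie["genre"] == "action"          # "You like action movies"


def friend_2(movie):
    return "Tom Cruise" in movie["cast"]       # "You like Tom Cruise"


def friend_3(movie):
    return "Tom Cruise" not in movie["cast"]   # "You hate Tom Cruise"


def majority_vote(voters, movie):
    """Ask every voter and return the answer given most often."""
    votes = Counter(voter(movie) for voter in voters)
    return votes.most_common(1)[0][0]


movie = {"title": "Edge of Tomorrow", "genre": "action",
         "cast": ["Tom Cruise", "Emily Blunt"]}
prediction = majority_vote([friend_1, friend_2, friend_3], movie)
print("You will like the movie" if prediction else "You will not like the movie")
```

Using an odd number of voters avoids ties in a yes/no decision; a single friend who generalizes badly is outvoted by the other two.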
Slide 24 www.edureka.co/data-science
Association Rule Mining
Slide 25 www.edureka.co/data-science
Association Rule Mining
Slide 26 www.edureka.co/data-science
Association Rule Mining
• Association Rule Mining is a popular and well-researched method for discovering interesting relations between variables in large data sets.
• For example, a rule found in the sales data of a supermarket might indicate that a customer who buys onions and potatoes together is also likely to buy hamburger meat.
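How strongly such a rule holds is commonly measured by its support (the share of all transactions that contain every item in the rule) and its confidence (among transactions containing the "if" part, the share that also contain the "then" part). The slide does not name these measures, so treat the sketch below as an illustration under that assumption; the five transactions are invented.

```python
# Hypothetical supermarket baskets.
transactions = [
    {"onions", "potatoes", "hamburger meat"},
    {"onions", "potatoes", "hamburger meat", "beer"},
    {"onions", "potatoes"},
    {"milk", "bread"},
    {"potatoes", "hamburger meat"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)


antecedent = {"onions", "potatoes"}
consequent = {"hamburger meat"}
rule_support = support(antecedent | consequent)
rule_confidence = confidence(antecedent, consequent)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

Here two of the five baskets contain all three items, and two of the three baskets with onions and potatoes also contain hamburger meat.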
Slide 27 www.edureka.co/data-science
Linear Regression
Slide 28 www.edureka.co/data-science
Regression Analysis – Linear Regression
Regression analysis helps us understand how the value of the dependent variable changes when any one of the independent variables changes while the other independent variables are kept fixed.
Linear Regression is one of the most popular algorithms used for prediction and forecasting.
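For a single independent variable, the fitted line can be computed in closed form with ordinary least squares. The sketch below is a minimal illustration with made-up data points that lie exactly on y = 2x + 1; `fit_line` is a helper name chosen for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one independent variable: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
forecast = slope * 6 + intercept  # predict y for an unseen x = 6
print(slope, intercept, forecast)
```

The slope is the quantity the slide describes: the change in the dependent variable for a one-unit change in the independent variable.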
Slide 29 www.edureka.co/data-science
K-Means Clustering
Slide 30 www.edureka.co/data-science
K-Means Clustering
Clustering is the process by which objects are classified into a number of groups so that they are as dissimilar as possible from one group to another, but as similar as possible within each group.
The objects in group 1 should be as similar as possible, but there should be a large difference between objects in different groups.
The attributes of the objects determine which objects should be grouped together.
[Figure: the total population divided into Group 1, Group 2, Group 3 and Group 4]
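The slides do not spell out the steps of the algorithm, so the following is a sketch of the standard k-means iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points), not a reproduction of the hands-on demo. The sample points, the `kmeans` helper and the choice of the first k points as starting centroids are assumptions made to keep the example small and deterministic.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on tuples of numbers; returns final centroids and the last assignment."""
    centroids = list(points[:k])  # assumption: start from the first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            distances = [sum((a - b) ** 2 for a, b in zip(point, centroid))
                         for centroid in centroids]
            clusters[distances.index(min(distances))].append(point)
        # Update step: move each centroid to the mean of its assigned points.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
    return centroids, clusters


# Hypothetical 2-D objects forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

centroids, clusters = kmeans(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Practical implementations usually pick the starting centroids at random and stop once the assignments no longer change; the fixed start and fixed iteration count here only simplify the sketch.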
Slide 31 www.edureka.co/data-science
Hands-On
Demo K-Means Clustering
Slide 32 Course Url
Thank You …
Questions/Queries/Feedback
Recording and presentation will be made available to you within 24 hours

 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 

Top 5 algorithms used in Data Science

  • 2. Slide 2 www.edureka.co/data-science What are we going to learn today? At the end of the session you will be able to understand: what Data Science is; what Data Scientists do; the top 5 Data Science algorithms (Decision Tree, Random Forest, Association Rule Mining, Linear Regression, K-Means Clustering); and a demo of the K-Means Clustering algorithm.
  • 4. What is Data Science? Data science is the extraction of meaningful and actionable knowledge from data.
  • 5. Who are Data Scientists? Data scientists are people who have a multitude of skills and who love playing with data.
  • 6. Data Science from 1000 feet. [figure: Data Science at the centre, surrounded by Visualization, Data Engineering, Statistics, Advanced Computing, Domain Expertise]
  • 7. Arsenal of a Data Scientist. Data Architecture (tool: Hadoop); Machine Learning (tools: Mahout, Weka, Spark MLlib); Analytics (tools: R, Python). Note that evaluating different machine learning algorithms is part of a data scientist's daily work, so it is very important for a data scientist to have a good grip on various machine learning algorithms.
  • 8. Machine Learning. Machine learning is a method of teaching computers to make and improve predictions based on data. It is a huge field, with hundreds of different algorithms for solving myriad different problems. Supervised learning: the categories of the data are already known. Unsupervised learning: the learning process attempts to find appropriate categories for the data.
  • 10. Decision Tree Example. [figure: training data table]
  • 11. Decision Tree, Root: Student. Step 1. [figure: root node Student]
  • 12. Decision Tree, Root: Student. Step 2. [figure: Student node with two Income child nodes; branch label Medium]
  • 13. Decision Tree, Root: Student. Step 3. [figure: Student and two Income nodes; two leaves labeled Yes; branch label Medium]
  • 14. Decision Tree, Root: Student. Step 4. [figure: adds Age and two CR (credit rating) nodes; two leaves labeled Yes; branch labels 31…40, Medium]
  • 15. Decision Tree, Root: Student. Step 5. [figure: Student, Income, Age and CR nodes; leaves No, Yes, Yes, Yes, Yes; branch labels 31…40, Medium]
  • 16. Decision Tree, Root: Student. Step 6. [figure: completed tree with Student, Income, Age and CR nodes and Yes/No leaves; branch labels include 31…40, > 40, Fair, Medium]
  • 17. Decision Tree, Root: Student. Classification rules:
      1. student(no)^income(high)^age(<=30) => buys_computer(no)
      2. student(no)^income(high)^age(31…40) => buys_computer(yes)
      3. student(no)^income(medium)^CR(fair)^age(>40) => buys_computer(yes)
      4. student(no)^income(medium)^CR(fair)^age(<=30) => buys_computer(no)
      5. student(no)^income(medium)^CR(excellent)^age(>40) => buys_computer(no)
      6. student(no)^income(medium)^CR(excellent)^age(31…40) => buys_computer(yes)
      7. student(yes)^income(low)^CR(fair) => buys_computer(yes)
      8. student(yes)^income(low)^CR(excellent)^age(31…40) => buys_computer(yes)
      9. student(yes)^income(low)^CR(excellent)^age(>40) => buys_computer(no)
      10. student(yes)^income(medium) => buys_computer(yes)
      11. student(yes)^income(high) => buys_computer(yes)
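The eleven classification rules on slide 17 can be read as a lookup procedure: walk the rules in order and return the outcome of the first one whose conditions all match. The sketch below is one possible Python encoding, not the presenter's code. The attribute names (`cr`, `age`), the category label `"31-40"`, and the function name `buys_computer` are illustrative assumptions.

```python
# Each rule is (conditions, outcome). The attribute names and category labels
# ("cr", "31-40", ...) are illustrative encodings, not taken from the slides.
RULES = [
    ({"student": "no", "income": "high", "age": "<=30"}, "no"),
    ({"student": "no", "income": "high", "age": "31-40"}, "yes"),
    ({"student": "no", "income": "medium", "cr": "fair", "age": ">40"}, "yes"),
    ({"student": "no", "income": "medium", "cr": "fair", "age": "<=30"}, "no"),
    ({"student": "no", "income": "medium", "cr": "excellent", "age": ">40"}, "no"),
    ({"student": "no", "income": "medium", "cr": "excellent", "age": "31-40"}, "yes"),
    ({"student": "yes", "income": "low", "cr": "fair"}, "yes"),
    ({"student": "yes", "income": "low", "cr": "excellent", "age": "31-40"}, "yes"),
    ({"student": "yes", "income": "low", "cr": "excellent", "age": ">40"}, "no"),
    ({"student": "yes", "income": "medium"}, "yes"),
    ({"student": "yes", "income": "high"}, "yes"),
]


def buys_computer(record):
    """Return "yes"/"no" from the first matching rule, or None if no rule applies."""
    for conditions, outcome in RULES:
        if all(record.get(attr) == value for attr, value in conditions.items()):
            return outcome
    return None


customer = {"student": "yes", "income": "medium", "age": "<=30", "cr": "fair"}
print(buys_computer(customer))
```

Returning `None` for uncovered combinations (for example a non-student with low income) makes the gaps in the rule set visible instead of guessing a label.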
  • 19. Random Forest: Example. Suppose you are undecided about watching the movie “Edge of Tomorrow”. You can do one of the following: 1. Ask your best friend whether you will like the movie. 2. Ask a group of friends.
  • 20. Random Forest: Example. In order to answer, your best friend first needs to figure out what movies you like, so you give her a bunch of movies and tell her whether you liked each one or not (i.e., you give her a labelled training set). Example: Do you like movies starring Emily Blunt? [figure: the best friend's decision tree, with the questions "Is it based on a true incident?", "Does Emily Blunt star in it?" and "Is she the main lead?", and the leaves "Yes, you will like the movie" and "No, you will not like the movie"]
  • 21. Random Forest: Example. But your best friend might not always generalize your preferences very well (i.e., she overfits). To get more accurate recommendations, you ask a bunch of your friends, e.g. Friend #1, Friend #2, and Friend #3, and they vote on whether you will like a movie. The majority of the votes decides the final outcome.
  • 22. Random Forest: Example. [figure: decision trees for Friend 1, Friend 2 and Friend 3. Node labels: "You didn't like 'Far and Away'", "You liked 'Oblivion'", "You like action movies", "You like Tom Cruise", "You like his pairing with Emily Blunt", "You did not like 'Top Gun'", "You loved 'Godzilla'", "You hate Tom Cruise". Leaves: "Yes, you will like the movie" / "No, you will not like the movie"]
  • 23. What is Random Forest? Random Forest is an ensemble classifier made up of many decision tree models. What are ensemble models? Ensemble models combine the results from different models. The result from an ensemble model is usually better than the result from any one of the individual models.
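The voting step from the friends example can be sketched in a few lines of Python. This is only an illustration of majority voting: the three `friend_*` functions are hand-written, hypothetical stand-ins for trained decision trees, and the movie record is invented for the example. A real random forest would learn each tree from a bootstrap sample of the training data with a random subset of features.

```python
from collections import Counter


# Hypothetical stand-ins for three trained decision trees ("friends").
def friend_1(movie):
    return "yes" if movie["genre"] == "action" else "no"


def friend_2(movie):
    return "yes" if "Emily Blunt" in movie["cast"] else "no"


def friend_3(movie):
    return "no" if "Tom Cruise" in movie["cast"] else "yes"


def ensemble_predict(movie, voters):
    """Collect one vote per voter and return the most common answer."""
    votes = [vote(movie) for vote in voters]
    return Counter(votes).most_common(1)[0][0]


movie = {
    "title": "Edge of Tomorrow",
    "genre": "action",
    "cast": ["Tom Cruise", "Emily Blunt"],
}
print(ensemble_predict(movie, [friend_1, friend_2, friend_3]))
```

Using an odd number of voters for a two-class question avoids ties, which is why the sketch does not define a tie-breaking rule.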
  • 24. Association Rule Mining
  • 26. Association Rule Mining. Association rule mining is a popular and well-researched method for discovering interesting relations between variables in large data sets. For example, a rule found in a supermarket's sales data might indicate that a customer who buys onions and potatoes together is also likely to buy hamburger meat.
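One common way to quantify a rule such as {onions, potatoes} => {hamburger meat} is with support and confidence. The slides do not name these measures; they are the standard ones and are used here as an assumption. The transactions below are invented purely for illustration.

```python
# Hypothetical shopping baskets; not data from the presentation.
transactions = [
    {"onions", "potatoes", "hamburger meat"},
    {"onions", "potatoes", "hamburger meat", "milk"},
    {"onions", "potatoes"},
    {"milk", "bread"},
    {"potatoes", "bread"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    both = support(set(antecedent) | set(consequent), baskets)
    return both / support(antecedent, baskets)


rule_support = support({"onions", "potatoes", "hamburger meat"}, transactions)
rule_confidence = confidence({"onions", "potatoes"}, {"hamburger meat"}, transactions)
print(rule_support, rule_confidence)
```

In this toy data the rule holds in 2 of 5 baskets (support 0.4), and in 2 of the 3 baskets that contain both onions and potatoes (confidence about 0.67).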
  • 27. Linear Regression
  • 28. Regression Analysis: Linear Regression. Regression analysis helps us understand how the value of the dependent variable changes when any one of the independent variables changes while the other independent variables are kept fixed. Linear regression is one of the most popular algorithms for prediction and forecasting.
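As a minimal sketch of the idea, the code below fits a straight line y = intercept + slope * x by ordinary least squares. It assumes a single independent variable for brevity (the slide describes the general multi-variable case), and the function name `fit_line` and the data points are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one independent variable.

    Returns (slope, intercept) minimising the sum of squared residuals.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lies exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)

# Forecast the dependent variable for an unseen x.
prediction = intercept + slope * 5
print(slope, intercept, prediction)
```

The slope is the quantity the slide describes: the change in the dependent variable per unit change in the independent variable.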
  • 29. K-Means Clustering
  • 30. K-Means Clustering. Clustering is the process by which objects are classified into a number of groups so that they are as dissimilar as possible from one group to another, but as similar as possible within each group. For example, the objects in group 1 should be as similar as possible, while objects in different groups should differ as much as possible. The attributes of the objects determine which objects are grouped together. [figure: total population divided into Group 1, Group 2, Group 3, Group 4]
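The grouping described above can be sketched with the standard k-means iteration: assign each point to its nearest centroid, then move each centroid to the mean of its group, and repeat until nothing changes. This is a plain-Python illustration, not the demo from the session. The points and starting centroids are invented, and passing the initial centroids explicitly (rather than choosing them at random) is a simplification to keep the run deterministic.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Basic k-means. Returns (centroids, clusters) after convergence or max_iter."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins the group of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its group.
        new_centroids = [
            tuple(sum(coord) / len(group) for coord in zip(*group)) if group else centroids[i]
            for i, group in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D points forming two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

The number of groups is fixed up front by how many starting centroids are supplied, which corresponds to the k in k-means. `math.dist` requires Python 3.8 or later.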
  • 32. Thank You … Questions/Queries/Feedback. The recording and presentation will be made available to you within 24 hours.