SlideShare une entreprise Scribd logo
1  sur  25
A presentation on
Compiled by :
Tonmoy Bhagawati,
DC2013MTC0033
Mtech 2nd Semester,DBCET
Subject : Computational
Intelligence
An example application
 An emergency room in a hospital measures 17
variables (e.g., blood pressure, age, etc) of newly
admitted patients.
 A decision is needed: whether to put a new patient in
an intensive-care unit.
 Due to the high cost of ICU, those patients who may
survive less than a month are given higher priority.
 Problem: to predict high-risk patients and
discriminate them from low-risk patients.
Another application
 A credit card company receives thousands of applications
for new cards. Each application contains information about
an applicant,
 age
 Marital status
 annual salary
 outstanding debts
 credit rating
 etc.
 Problem: to decide whether an application should
approved, or to classify applications into two
categories, approved and not approved.
Machine learning and our focus
 Like human learning from past experiences.
 A computer does not have “experiences”.
 A computer system learns from data, which represent
some “past experiences” of an application domain.
 Our focus: learn a target function that can be used to
predict the values of a discrete class
attribute, e.g., approve or not-approved, and high-risk or
low risk.
 The task is commonly called: Supervised
learning, classification, or inductive learning.
 Data: A set of data records (also called examples, instances
or cases) described by
 k attributes: A1, A2, … Ak.
 a class: Each example is labelled with a pre-defined
class.
 Goal: To learn a classification model from the data that can
be used to predict the classes of new (future, or test)
cases/instances.
The data and the goal
An example: data (loan application)
Approved or not
An example: the learning task
 Learn a classification model from the data
 Use the model to classify future loan applications into
 Yes (approved) and
 No (not approved)
 What is the class for following case/instance?
Supervised vs. unsupervised Learning
 Supervised learning: classification is seen as supervised
learning from examples.
 Supervision: The data (observations, measurements, etc.)
are labeled with pre-defined classes. It is like that a
“teacher” gives the classes (supervision).
 Test data are classified into these classes too.
 Unsupervised learning (clustering)
 Class labels of the data are unknown
 Given a set of data, the task is to establish the existence of
classes or clusters in the data
Supervised learning process: two steps
Learning (training): Learn a model using the training data
Testing: Test the model using unseen test data to assess the
model accuracy
,
casestestofnumberTotal
tionsclassificacorrectofNumber
Accuracy
What do we mean by learning?
 Given
 a data set D,
 a task T, and
 a performance measure M,
a computer system is said to learn from D to perform the
task T if after learning the system’s performance on T
improves as measured by M.
 In other words, the learned model helps the system to
perform T better as compared to no learning.
An example
 Data: Loan application data
 Task: Predict whether a loan should be approved or not.
 Performance measure: accuracy.
No learning: classify all future applications (test data) to the
majority class (i.e., Yes):
Accuracy = 9/15 = 60%.
 We can do better than 60% with learning.
Fundamental assumption of learning
Assumption: The distribution of training examples is identical
to the distribution of test examples (including future
unseen examples).
 In practice, this assumption is often violated to certain
degree.
 Strong violations will clearly result in poor classification
accuracy.
 To achieve good accuracy on the test data, training
examples must be sufficiently representative of the test
data.
Decision trees
 Decision tree learning is one of the most widely used
techniques for classification.
 Its classification accuracy is competitive with other
methods, and
 it is very efficient.
 The classification model is a tree, called decision tree.
 C4.5 by Ross Quinlan is perhaps the best known system
The loan data (reproduced)
Approved or not
A decision tree from the loan data
Decision nodes and leaf nodes (classes)
Use the decision tree
No
Evaluating classification methods
 Predictive accuracy
 Efficiency
 time to construct the model
 time to use the model
 Robustness: handling noise and missing values
 Scalability: efficiency in disk-resident databases
 Interpretability:
 understandable and insight provided by the model
 Compactness of the model: size of the tree, or the number
of rules.
Evaluation methods
 Holdout set: The available data set D is divided into two
disjoint subsets,
 the training set Dtrain (for learning a model)
 the test set Dtest (for testing the model)
 Important: training set should not be used in testing and
the test set should not be used in learning.
 Unseen test set provides a unbiased estimate of accuracy.
 The test set is also called the holdout set. (the examples in
the original data set D are all labeled with classes.)
 This method is mainly used when the data set D is large.
Evaluation methods (cont…)
 n-fold cross-validation: The available data is partitioned
into n equal-size disjoint subsets.
 Use each subset as the test set and combine the rest n-1
subsets as the training set to learn a classifier.
 The procedure is run n times, which give n accuracies.
 The final estimated accuracy of learning is the average of the
n accuracies.
 10-fold and 5-fold cross-validations are commonly used.
 This method is used when the available data is not large.
CS583, Bing Liu, UIC20
Evaluation methods (cont…)
 Leave-one-out cross-validation: This method is used
when the data set is very small.
 It is a special case of cross-validation
 Each fold of the cross validation has only a single test
example and all the rest of the data is used in training.
 If the original data has m examples, this is m-fold cross-
validation
CS583, Bing Liu, UIC21
Evaluation methods (cont…)
 Validation set: the available data is divided into three
subsets,
 a training set,
 a validation set and
 a test set.
 A validation set is used frequently for estimating
parameters in learning algorithms.
 In such cases, the values that give the best accuracy on the
validation set are used as the final parameter values.
 Cross-validation can be used for parameter estimating as
well.
CS583, Bing Liu, UIC22
Classification measures
 Accuracy is only one measure (error = 1-accuracy).
 Accuracy is not suitable in some applications.
 In classification involving skewed or highly imbalanced
data, e.g., network intrusion and financial fraud detections,
we are interested only in the minority class.
 High accuracy does not mean any intrusion is detected.
 E.g., 1% intrusion. Achieve 99% accuracy by doing
nothing.
 The class of interest is commonly called the positive class,
and the rest negative classes.
Back propagation
 Backpropagation is a systematic method for training multiple-layer ANNs.
 The standard backpropagation algorithm is as follows :
1. Build a network with a chosen number of input, hidden and output
units.
2. Initialize all the weights to low random values
3. Choose a single training pair at random
4. Copy the input pattern to the input layer
5. Cycle the network so that the activations from the inputs generate the
activations in the hidden and output layers
6. Calculate the error derivative between the output activation and the
target output.
7. Backpropagate the summed products of the weights and errors in the
output layer in order to calculate the error in the hidden units.
8. Update the weights attached to each unit according to the error in that
unit, the output from the unit below it, and the learning parameters,
until the error is sufficiently low or the network settles.
CS583, Bing Liu, UIC24
Conclusion
 Applications of supervised learning are in almost any field
or domain.
 There are numerous classification techniques.
 There are still many other methods, e.g.,
 Bayesian networks
 Neural networks
 Genetic algorithms
 Fuzzy classification
This large number of methods also show the importance of
classification and its wide applicability.
 It remains to be an active research area.
References :
 Artificial Intelligence and Intelligent systems by N.P. Padhy
 Presentation on Supervised Learning by Bing Liu.
 Artificial Neural Networks for Beginners by Carlos
Gershenson
 Computational Intelligence : An Introduction by Andries P.
Engelbrecht

Contenu connexe

Tendances

Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Marina Santini
 
Feature Engineering in Machine Learning
Feature Engineering in Machine LearningFeature Engineering in Machine Learning
Feature Engineering in Machine LearningKnoldus Inc.
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOswald Campesato
 
Supervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | EdurekaSupervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | EdurekaEdureka!
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersFunctional Imperative
 
Confusion Matrix
Confusion MatrixConfusion Matrix
Confusion MatrixRajat Gupta
 
Machine Learning - Dataset Preparation
Machine Learning - Dataset PreparationMachine Learning - Dataset Preparation
Machine Learning - Dataset PreparationAndrew Ferlitsch
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning Gopal Sakarkar
 
Semi-Supervised Learning
Semi-Supervised LearningSemi-Supervised Learning
Semi-Supervised LearningLukas Tencer
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningEng Teong Cheah
 
Machine Learning With Logistic Regression
Machine Learning  With Logistic RegressionMachine Learning  With Logistic Regression
Machine Learning With Logistic RegressionKnoldus Inc.
 
K-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierK-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierNeha Kulkarni
 
Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Usama Fayyaz
 

Tendances (20)

Clustering
ClusteringClustering
Clustering
 
Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?Lecture 1: What is Machine Learning?
Lecture 1: What is Machine Learning?
 
Feature Engineering in Machine Learning
Feature Engineering in Machine LearningFeature Engineering in Machine Learning
Feature Engineering in Machine Learning
 
Machine learning
Machine learningMachine learning
Machine learning
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Supervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | EdurekaSupervised vs Unsupervised vs Reinforcement Learning | Edureka
Supervised vs Unsupervised vs Reinforcement Learning | Edureka
 
Introduction to Machine Learning Classifiers
Introduction to Machine Learning ClassifiersIntroduction to Machine Learning Classifiers
Introduction to Machine Learning Classifiers
 
Machine learning clustering
Machine learning clusteringMachine learning clustering
Machine learning clustering
 
Support Vector Machines ( SVM )
Support Vector Machines ( SVM ) Support Vector Machines ( SVM )
Support Vector Machines ( SVM )
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
Confusion Matrix
Confusion MatrixConfusion Matrix
Confusion Matrix
 
Deep Neural Networks (DNN)
Deep Neural Networks (DNN)Deep Neural Networks (DNN)
Deep Neural Networks (DNN)
 
Machine Learning - Dataset Preparation
Machine Learning - Dataset PreparationMachine Learning - Dataset Preparation
Machine Learning - Dataset Preparation
 
Data preprocessing using Machine Learning
Data  preprocessing using Machine Learning Data  preprocessing using Machine Learning
Data preprocessing using Machine Learning
 
Semi-Supervised Learning
Semi-Supervised LearningSemi-Supervised Learning
Semi-Supervised Learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Machine Learning With Logistic Regression
Machine Learning  With Logistic RegressionMachine Learning  With Logistic Regression
Machine Learning With Logistic Regression
 
K-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierK-Nearest Neighbor Classifier
K-Nearest Neighbor Classifier
 
K Nearest Neighbors
K Nearest NeighborsK Nearest Neighbors
K Nearest Neighbors
 
Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning
 

Similaire à Presentation on supervised learning

dataminingclassificationprediction123 .pptx
dataminingclassificationprediction123 .pptxdataminingclassificationprediction123 .pptx
dataminingclassificationprediction123 .pptxAsrithaKorupolu
 
Identifying and classifying unknown Network Disruption
Identifying and classifying unknown Network DisruptionIdentifying and classifying unknown Network Disruption
Identifying and classifying unknown Network Disruptionjagan477830
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsAM Publications
 
Supervised learning techniques and applications
Supervised learning techniques and applicationsSupervised learning techniques and applications
Supervised learning techniques and applicationsBenjaminlapid1
 
classification in data mining and data warehousing.pdf
classification in data mining and data warehousing.pdfclassification in data mining and data warehousing.pdf
classification in data mining and data warehousing.pdf321106410027
 
MACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOXMACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOXmlaij
 
Data mining chapter04and5-best
Data mining chapter04and5-bestData mining chapter04and5-best
Data mining chapter04and5-bestABDUmomo
 
Research trends in data warehousing and data mining
Research trends in data warehousing and data miningResearch trends in data warehousing and data mining
Research trends in data warehousing and data miningEr. Nawaraj Bhandari
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESVikash Kumar
 
Artificial Intelligence.pptx
Artificial Intelligence.pptxArtificial Intelligence.pptx
Artificial Intelligence.pptxKaviya452563
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningAkshay Kanchan
 
Information Retrieval 08
Information Retrieval 08 Information Retrieval 08
Information Retrieval 08 Jeet Das
 
Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)Jeet Das
 
Machine Learning Interview Questions and Answers
Machine Learning Interview Questions and AnswersMachine Learning Interview Questions and Answers
Machine Learning Interview Questions and AnswersSatyam Jaiswal
 
Statistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptxStatistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptxrajalakshmi5921
 
Machine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptxMachine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptxiaeronlineexm
 
Machine learning module 2
Machine learning module 2Machine learning module 2
Machine learning module 2Gokulks007
 

Similaire à Presentation on supervised learning (20)

dataminingclassificationprediction123 .pptx
dataminingclassificationprediction123 .pptxdataminingclassificationprediction123 .pptx
dataminingclassificationprediction123 .pptx
 
Identifying and classifying unknown Network Disruption
Identifying and classifying unknown Network DisruptionIdentifying and classifying unknown Network Disruption
Identifying and classifying unknown Network Disruption
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning Algorithms
 
Supervised learning techniques and applications
Supervised learning techniques and applicationsSupervised learning techniques and applications
Supervised learning techniques and applications
 
classification in data mining and data warehousing.pdf
classification in data mining and data warehousing.pdfclassification in data mining and data warehousing.pdf
classification in data mining and data warehousing.pdf
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
MACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOXMACHINE LEARNING TOOLBOX
MACHINE LEARNING TOOLBOX
 
Machine Learning - Deep Learning
Machine Learning - Deep LearningMachine Learning - Deep Learning
Machine Learning - Deep Learning
 
Data mining chapter04and5-best
Data mining chapter04and5-bestData mining chapter04and5-best
Data mining chapter04and5-best
 
Research trends in data warehousing and data mining
Research trends in data warehousing and data miningResearch trends in data warehousing and data mining
Research trends in data warehousing and data mining
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
 
Artificial Intelligence.pptx
Artificial Intelligence.pptxArtificial Intelligence.pptx
Artificial Intelligence.pptx
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
Artificial Neural Networks , Recurrent networks , Perceptron's
Artificial Neural Networks , Recurrent networks , Perceptron'sArtificial Neural Networks , Recurrent networks , Perceptron's
Artificial Neural Networks , Recurrent networks , Perceptron's
 
Information Retrieval 08
Information Retrieval 08 Information Retrieval 08
Information Retrieval 08
 
Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)Lecture 09(introduction to machine learning)
Lecture 09(introduction to machine learning)
 
Machine Learning Interview Questions and Answers
Machine Learning Interview Questions and AnswersMachine Learning Interview Questions and Answers
Machine Learning Interview Questions and Answers
 
Statistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptxStatistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptx
 
Machine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptxMachine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptx
 
Machine learning module 2
Machine learning module 2Machine learning module 2
Machine learning module 2
 

Dernier

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Dernier (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Presentation on supervised learning

  • 1. A presentation on Compiled by : Tonmoy Bhagawati, DC2013MTC0033 Mtech 2nd Semester,DBCET Subject : Computational Intelligence
  • 2. An example application  An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients.  A decision is needed: whether to put a new patient in an intensive-care unit.  Due to the high cost of ICU, those patients who may survive less than a month are given higher priority.  Problem: to predict high-risk patients and discriminate them from low-risk patients.
  • 3. Another application  A credit card company receives thousands of applications for new cards. Each application contains information about an applicant,  age  Marital status  annual salary  outstanding debts  credit rating  etc.  Problem: to decide whether an application should approved, or to classify applications into two categories, approved and not approved.
  • 4. Machine learning and our focus  Like human learning from past experiences.  A computer does not have “experiences”.  A computer system learns from data, which represent some “past experiences” of an application domain.  Our focus: learn a target function that can be used to predict the values of a discrete class attribute, e.g., approve or not-approved, and high-risk or low risk.  The task is commonly called: Supervised learning, classification, or inductive learning.
  • 5.  Data: A set of data records (also called examples, instances or cases) described by  k attributes: A1, A2, … Ak.  a class: Each example is labelled with a pre-defined class.  Goal: To learn a classification model from the data that can be used to predict the classes of new (future, or test) cases/instances. The data and the goal
  • 6. An example: data (loan application) Approved or not
  • 7. An example: the learning task  Learn a classification model from the data  Use the model to classify future loan applications into  Yes (approved) and  No (not approved)  What is the class for following case/instance?
  • 8. Supervised vs. unsupervised Learning  Supervised learning: classification is seen as supervised learning from examples.  Supervision: The data (observations, measurements, etc.) are labeled with pre-defined classes. It is like that a “teacher” gives the classes (supervision).  Test data are classified into these classes too.  Unsupervised learning (clustering)  Class labels of the data are unknown  Given a set of data, the task is to establish the existence of classes or clusters in the data
  • 9. Supervised learning process: two steps Learning (training): Learn a model using the training data Testing: Test the model using unseen test data to assess the model accuracy , casestestofnumberTotal tionsclassificacorrectofNumber Accuracy
  • 10. What do we mean by learning?  Given  a data set D,  a task T, and  a performance measure M, a computer system is said to learn from D to perform the task T if after learning the system’s performance on T improves as measured by M.  In other words, the learned model helps the system to perform T better as compared to no learning.
  • 11. An example  Data: Loan application data  Task: Predict whether a loan should be approved or not.  Performance measure: accuracy. No learning: classify all future applications (test data) to the majority class (i.e., Yes): Accuracy = 9/15 = 60%.  We can do better than 60% with learning.
  • 12. Fundamental assumption of learning Assumption: The distribution of training examples is identical to the distribution of test examples (including future unseen examples).  In practice, this assumption is often violated to certain degree.  Strong violations will clearly result in poor classification accuracy.  To achieve good accuracy on the test data, training examples must be sufficiently representative of the test data.
  • 13. Decision trees  Decision tree learning is one of the most widely used techniques for classification.  Its classification accuracy is competitive with other methods, and  it is very efficient.  The classification model is a tree, called decision tree.  C4.5 by Ross Quinlan is perhaps the best known system
  • 14. The loan data (reproduced) Approved or not
  • 15. A decision tree from the loan data Decision nodes and leaf nodes (classes)
  • 16. Use the decision tree No
  • 17. Evaluating classification methods  Predictive accuracy  Efficiency  time to construct the model  time to use the model  Robustness: handling noise and missing values  Scalability: efficiency in disk-resident databases  Interpretability:  understandable and insight provided by the model  Compactness of the model: size of the tree, or the number of rules.
  • 18. Evaluation methods  Holdout set: The available data set D is divided into two disjoint subsets,  the training set Dtrain (for learning a model)  the test set Dtest (for testing the model)  Important: training set should not be used in testing and the test set should not be used in learning.  Unseen test set provides a unbiased estimate of accuracy.  The test set is also called the holdout set. (the examples in the original data set D are all labeled with classes.)  This method is mainly used when the data set D is large.
  • 19. Evaluation methods (cont…)  n-fold cross-validation: The available data is partitioned into n equal-size disjoint subsets.  Use each subset as the test set and combine the rest n-1 subsets as the training set to learn a classifier.  The procedure is run n times, which give n accuracies.  The final estimated accuracy of learning is the average of the n accuracies.  10-fold and 5-fold cross-validations are commonly used.  This method is used when the available data is not large.
  • 20. CS583, Bing Liu, UIC20 Evaluation methods (cont…)  Leave-one-out cross-validation: This method is used when the data set is very small.  It is a special case of cross-validation  Each fold of the cross validation has only a single test example and all the rest of the data is used in training.  If the original data has m examples, this is m-fold cross- validation
  • 21. CS583, Bing Liu, UIC21 Evaluation methods (cont…)  Validation set: the available data is divided into three subsets,  a training set,  a validation set and  a test set.  A validation set is used frequently for estimating parameters in learning algorithms.  In such cases, the values that give the best accuracy on the validation set are used as the final parameter values.  Cross-validation can be used for parameter estimating as well.
  • 22. CS583, Bing Liu, UIC22 Classification measures  Accuracy is only one measure (error = 1-accuracy).  Accuracy is not suitable in some applications.  In classification involving skewed or highly imbalanced data, e.g., network intrusion and financial fraud detections, we are interested only in the minority class.  High accuracy does not mean any intrusion is detected.  E.g., 1% intrusion. Achieve 99% accuracy by doing nothing.  The class of interest is commonly called the positive class, and the rest negative classes.
  • 23. Back propagation  Backpropagation is a systematic method for training multiple-layer ANNs.  The standard backpropagation algorithm is as follows : 1. Build a network with a chosen number of input, hidden and output units. 2. Initialize all the weights to low random values 3. Choose a single training pair at random 4. Copy the input pattern to the input layer 5. Cycle the network so that the activations from the inputs generate the activations in the hidden and output layers 6. Calculate the error derivative between the output activation and the target output. 7. Backpropagate the summed products of the weights and errors in the output layer in order to calculate the error in the hidden units. 8. Update the weights attached to each unit according to the error in that unit, the output from the unit below it, and the learning parameters, until the error is sufficiently low or the network settles.
  • 24. CS583, Bing Liu, UIC24 Conclusion  Applications of supervised learning are in almost any field or domain.  There are numerous classification techniques.  There are still many other methods, e.g.,  Bayesian networks  Neural networks  Genetic algorithms  Fuzzy classification This large number of methods also show the importance of classification and its wide applicability.  It remains to be an active research area.
  • 25. References :  Artificial Intelligence and Intelligent systems by N.P. Padhy  Presentation on Supervised Learning by Bing Liu.  Artificial Neural Networks for Beginners by Carlos Gershenson  Computational Intelligence : An Introduction by Andries P. Engelbrecht