Introduction to Classification
Machine Learning and Data Mining (Unit 10)




Prof. Pier Luca Lanzi
References                                    2



  Jiawei Han and Micheline Kamber, "Data Mining: Concepts and
  Techniques", The Morgan Kaufmann Series in Data Management
  Systems, Second Edition.
  Tom M. Mitchell, "Machine Learning", McGraw Hill, 1997.
  Pang-Ning Tan, Michael Steinbach, Vipin Kumar,
  "Introduction to Data Mining", Addison Wesley.




What is an apple?                           3




Are these apples?                           4




Supervised vs. Unsupervised Learning           5



  Unsupervised learning (clustering)
    The class labels of the training data are unknown
    Given a set of measurements, observations, etc., the aim is to
    establish the existence of classes or clusters in the data
  Supervised learning (classification)
    Supervision: the training data (observations, measurements, etc.)
    are accompanied by labels indicating the class of the observations
    New data is classified based on the training set (see the sketch below)
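As a compact illustration of the two settings (the toy measurements, labels, and choice of algorithms below are my own, not part of the lecture): clustering receives only the measurements X, while classification also receives the class labels y.

```python
# Hedged sketch: unsupervised vs. supervised learning on a toy dataset
# (data, feature meaning, and choice of algorithms are assumptions).
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

X = [[170, 65], [180, 80], [160, 50], [175, 75]]      # measurements only
y = ["light", "heavy", "light", "heavy"]               # labels from a supervisor

clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)    # no labels used
classifier = KNeighborsClassifier(n_neighbors=1).fit(X, y)   # labels used

print(clusters)                          # cluster ids discovered in the data
print(classifier.predict([[172, 70]]))   # class assigned to new data
```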




The contact lenses data                                                   6


     Age         Spectacle prescription      Astigmatism   Tear production rate   Recommended lenses
    Young              Myope                      No            Reduced                None
    Young              Myope                      No             Normal                Soft
    Young              Myope                      Yes           Reduced                None
    Young              Myope                      Yes            Normal                Hard
    Young           Hypermetrope                  No            Reduced                None
    Young           Hypermetrope                  No             Normal                Soft
    Young           Hypermetrope                  Yes           Reduced                None
    Young           Hypermetrope                  Yes            Normal                Hard
Pre-presbyopic         Myope                      No            Reduced                None
Pre-presbyopic         Myope                      No             Normal                Soft
Pre-presbyopic         Myope                      Yes           Reduced                None
Pre-presbyopic         Myope                      Yes            Normal                Hard
Pre-presbyopic      Hypermetrope                  No            Reduced                None
Pre-presbyopic      Hypermetrope                  No             Normal                Soft
Pre-presbyopic      Hypermetrope                  Yes           Reduced                None
Pre-presbyopic      Hypermetrope                  Yes            Normal                None
  Presbyopic           Myope                      No            Reduced                None
  Presbyopic           Myope                      No             Normal                None
  Presbyopic           Myope                      Yes           Reduced                None
  Presbyopic           Myope                      Yes            Normal                Hard
  Presbyopic        Hypermetrope                  No            Reduced                None
  Presbyopic        Hypermetrope                  No             Normal                Soft
  Presbyopic        Hypermetrope                  Yes           Reduced                None
  Presbyopic        Hypermetrope                  Yes            Normal                None
A model extracted from                               7
the contact lenses data

  If tear production rate = reduced then recommendation = none
  If age = young and astigmatic = no
     and tear production rate = normal then recommendation = soft
  If age = pre-presbyopic and astigmatic = no
     and tear production rate = normal then recommendation = soft
  If age = presbyopic and spectacle prescription = myope
     and astigmatic = no then recommendation = none
  If spectacle prescription = hypermetrope and astigmatic = no
     and tear production rate = normal then recommendation = soft
  If spectacle prescription = myope and astigmatic = yes
     and tear production rate = normal then recommendation = hard
  If age = young and astigmatic = yes
     and tear production rate = normal then recommendation = hard
  If age = pre-presbyopic
     and spectacle prescription = hypermetrope
     and astigmatic = yes then recommendation = none
  If age = presbyopic and spectacle prescription = hypermetrope
     and astigmatic = yes then recommendation = none
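
The rule set above is directly executable. The following sketch encodes it as an ordered list of (conditions, recommendation) pairs plus a tiny interpreter; the rules are copied from the slide, while the dictionary encoding of a case is an illustrative choice.

```python
# Hedged sketch: the extracted rule set as an ordered rule list
# (rules from the slide; the attribute names/encoding are mine).
rules = [
    ({"tear": "reduced"}, "none"),
    ({"age": "young", "astigmatic": "no", "tear": "normal"}, "soft"),
    ({"age": "pre-presbyopic", "astigmatic": "no", "tear": "normal"}, "soft"),
    ({"age": "presbyopic", "prescription": "myope", "astigmatic": "no"}, "none"),
    ({"prescription": "hypermetrope", "astigmatic": "no", "tear": "normal"}, "soft"),
    ({"prescription": "myope", "astigmatic": "yes", "tear": "normal"}, "hard"),
    ({"age": "young", "astigmatic": "yes", "tear": "normal"}, "hard"),
    ({"age": "pre-presbyopic", "prescription": "hypermetrope", "astigmatic": "yes"}, "none"),
    ({"age": "presbyopic", "prescription": "hypermetrope", "astigmatic": "yes"}, "none"),
]

def recommend(case):
    for conditions, outcome in rules:          # first matching rule fires
        if all(case[attr] == value for attr, value in conditions.items()):
            return outcome
    return "none"                              # default when no rule fires

case = {"age": "young", "prescription": "myope", "astigmatic": "no", "tear": "normal"}
print(recommend(case))   # soft, as in the second row of the table
```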




Predicting CPU performance                                              8



      209 different computer configurations

       Cycle time   Main memory         Cache            Channels           Performance
          (ns)          (Kb)             (Kb)
         MYCT       MMIN    MMAX         CACH      CHMIN      CHMAX            PRP
  1       125       256      6000         256       16          128            198
  2       29        8000    32000          32        8          32             269
 …
208       480       512      8000          32        0              0           67
209       480       1000     4000          0         0              0           45




      A model to predict the performance
      PRP = -55.9 + 0.0489 MYCT + 0.0153 MMIN + 0.0056 MMAX
            + 0.6410 CACH - 0.2700 CHMIN + 1.480 CHMAX
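
Because the model is a plain linear formula, a prediction is just a weighted sum. The sketch below (coefficients copied from the slide, function name my own) evaluates it for the first configuration in the table.

```python
# Hedged sketch: evaluating the linear model from the slide on
# configuration 1 (MYCT=125, MMIN=256, MMAX=6000, CACH=256, CHMIN=16, CHMAX=128).
def predict_prp(myct, mmin, mmax, cach, chmin, chmax):
    return (-55.9 + 0.0489 * myct + 0.0153 * mmin + 0.0056 * mmax
            + 0.6410 * cach - 0.2700 * chmin + 1.480 * chmax)

print(predict_prp(125, 256, 6000, 256, 16, 128))   # predicted PRP for configuration 1
```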


Classification vs. Prediction                     9



  Classification
     predicts categorical class labels (discrete or nominal)
     classifies data (constructs a model) based on the training
     set and the values (class labels) in a classifying attribute
     and uses it in classifying new data

  Prediction
     models continuous-valued functions, i.e., predicts
     unknown or missing values

  Applications
    Credit approval
    Target marketing
    Medical diagnosis
    Fraud detection

What is classification?                         10



  It is a two-step process
  Model construction
      Given a set of data representing examples of
      a target concept, build a model to “explain” the concept
  Model usage
      The classification model is used for classifying
      future or unknown cases
      Estimate accuracy of the model




Classification: Model Construction             11




  Training Data  →  Classification Algorithm  →  Classifier (Model)

  NAME   RANK             YEARS   TENURED
  Mike   Assistant Prof   3       no
  Mary   Assistant Prof   7       yes
  Bill   Professor        2       yes
  Jim    Associate Prof   7       yes
  Dave   Assistant Prof   6       no
  Anne   Associate Prof   3       no

  IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
Classification: Model Usage                                  12




  Classifier  →  applied to Testing Data and to Unseen Data

  Testing Data:

  NAME      RANK             YEARS   TENURED
  Tom       Assistant Prof   2       no
  Merlisa   Associate Prof   7       no
  George    Professor        5       yes
  Joseph    Assistant Prof   7       yes

  Unseen Data: (Jeff, Professor, 4)  →  Tenured?
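The construction and usage steps above fit in a few lines of code. In the hedged sketch below the rule and the records come from the slides, while the function name and the accuracy computation are my own.

```python
# Hedged sketch: the learned model as a function, checked on the testing
# data and applied to an unseen case (rule and records from the slides).
def tenured(rank, years):
    # the slide's rule: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
    return "yes" if rank == "Professor" or years > 6 else "no"

testing = [("Tom", "Assistant Prof", 2, "no"),
           ("Merlisa", "Associate Prof", 7, "no"),
           ("George", "Professor", 5, "yes"),
           ("Joseph", "Assistant Prof", 7, "yes")]

correct = sum(1 for _, rank, yrs, label in testing if tenured(rank, yrs) == label)
print("accuracy on testing data:", correct / len(testing))   # 0.75: Merlisa is misclassified
print("Jeff, Professor, 4 ->", tenured("Professor", 4))       # yes
```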
Illustrating Classification Task                                           13




  Training set  →  Learn Model

  Tid   Attrib1   Attrib2   Attrib3   Class
  1     Yes       Large     125K      No
  2     No        Medium    100K      No
  3     No        Small     70K       No
  4     Yes       Medium    120K      No
  5     No        Large     95K       Yes
  6     No        Medium    60K       No
  7     Yes       Large     220K      No
  8     No        Small     85K       Yes
  9     No        Medium    75K       No
  10    No        Small     90K       Yes

  Test set  →  Apply Model

  Tid   Attrib1   Attrib2   Attrib3   Class
  11    No        Small     55K       ?
  12    Yes       Medium    80K       ?
  13    Yes       Large     110K      ?
  14    No        Small     95K       ?
  15    No        Large     67K       ?




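A possible rendition of the learn/apply workflow with off-the-shelf tools is sketched below; the records come from the tables above, while the use of pandas one-hot encoding and a decision tree learner is an illustrative assumption.

```python
# Hedged sketch of the learn/apply workflow on the toy table above
# (column names from the slide; pandas/scikit-learn usage is an assumption).
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

train = pd.DataFrame({
    "Attrib1": ["Yes", "No", "No", "Yes", "No", "No", "Yes", "No", "No", "No"],
    "Attrib2": ["Large", "Medium", "Small", "Medium", "Large",
                "Medium", "Large", "Small", "Medium", "Small"],
    "Attrib3": [125, 100, 70, 120, 95, 60, 220, 85, 75, 90],   # in K
    "Class":   ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"],
})
test = pd.DataFrame({
    "Attrib1": ["No", "Yes", "Yes", "No", "No"],
    "Attrib2": ["Small", "Medium", "Large", "Small", "Large"],
    "Attrib3": [55, 80, 110, 95, 67],
})

# One-hot encode the nominal attributes so the tree learner can use them
X_train = pd.get_dummies(train.drop(columns="Class"), dtype=int)
X_test = pd.get_dummies(test, dtype=int).reindex(columns=X_train.columns, fill_value=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, train["Class"])  # learn model
print(model.predict(X_test))                                                 # apply model
```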
Evaluating Classification Methods              14



  Accuracy
     classifier accuracy: predicting class label
     predictor accuracy: guessing value of predicted attributes
  Speed
     time to construct the model (training time)
     time to use the model (classification/prediction time)
  Robustness: handling noise and missing values
  Scalability: efficiency in disk-resident databases
  Interpretability
     understanding and insight provided by the model
  Other measures, e.g., goodness of rules, such as decision
  tree size or compactness of classification rules




Machine Learning perspective                   15
on classification

  Classification algorithms are methods of supervised learning

  In Supervised Learning
     The experience E consists of a set of examples of a target
     concept that have been prepared by a supervisor
     The task T consists of finding an hypothesis that
     accurately explains the target concept
     The performance P depends on how accurately the
     hypothesis h explains the examples in E




Machine Learning perspective                   16
on classification

  Let us define the problem domain as the set of instances X

  For instance, X contains different fruits

  We define a concept over X as a function c which maps
  elements of X into a range D

                                  c:X→ D

  The range D represents the type of concept that is going to
  be analyzed

  For instance, c: X → {apple, not_an_apple}
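
As a toy rendering of the notation (the fruit names are invented for the example), the concept c can literally be written as a function from instances in X to the range D.

```python
# Toy illustration of c : X -> D with D = {apple, not_an_apple}
# (the instance encoding is an assumption made for the example).
def c(x):
    return "apple" if x in {"granny smith", "fuji", "gala"} else "not_an_apple"

print(c("fuji"), c("banana"))   # apple not_an_apple
```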




Machine Learning perspective                         17
on classification

  Experience E is a set of <x,d> pairs, with x∈X and d∈D.
  The task T consists of finding an hypothesis h to explain E:

                              ∀x∈X h(x)=c(x)

  The set H of all the possible hypotheses h that can be used to
  explain c is called the hypothesis space

  The goodness of an hypothesis h can be evaluated as the
  percentage of examples that are correctly explained by h

                 P(h) = |{ x ∈ X : h(x) = c(x) }| / |X|
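
Once E is available as a list of <x, d> pairs, P(h) can be computed directly; the hypothesis and the two examples in this sketch are invented for illustration.

```python
# Hedged sketch of P(h): fraction of examples whose label is explained by h.
def goodness(h, examples):
    return sum(1 for x, d in examples if h(x) == d) / len(examples)

# an invented hypothesis for the apple concept and two <x, d> pairs
h = lambda x: "apple" if x["round"] and x["color"] in ("red", "green") else "not_an_apple"
E = [({"round": True, "color": "red"}, "apple"),
     ({"round": True, "color": "yellow"}, "not_an_apple")]
print(goodness(h, E))   # 1.0 on this toy E
```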




Examples                                         18



  Concept Learning
  when D={0,1}

  Supervised classification
  when D consists of a finite number of labels

  Prediction
  when D is a subset of R^n




Machine Learning perspective                  19
on classification

  Supervised learning algorithms, given the examples in E,
  search the hypotheses space H for the hypothesis h that best
  explains the examples in E
  Learning is viewed as a search in the hypotheses space




Searching for the hypothesis                    20



  The type of hypothesis required influences the search
  algorithm

  The more complex the representation
  the more complex the search algorithm

  Many algorithms assume that it is possible to define a partial
  ordering over the hypothesis space

  The hypothesis space can be searched using either a general-to-specific
  or a specific-to-general strategy




Exploring the Hypothesis Space                21



  General to Specific
    Start with the most general hypothesis and then go on
    through specialization steps

  Specific to General
    Start with the most specific hypothesis and then go on
    through generalization steps (see the sketch below)
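
A minimal specific-to-general sketch in the style of Find-S follows (an illustration under my own assumptions; the slides do not fix a particular algorithm): start from the first positive example as the most specific hypothesis and generalize each attribute to "?" whenever a later positive example disagrees with it.

```python
# Hedged sketch of a specific-to-general search (Find-S style).
# Hypotheses are attribute tuples; "?" means "any value is acceptable".
def specific_to_general(positive_examples):
    h = list(positive_examples[0])                            # most specific hypothesis
    for x in positive_examples[1:]:
        h = [hi if hi == xi else "?" for hi, xi in zip(h, x)]  # generalize mismatches
    return h

positives = [("red", "round", "medium"), ("green", "round", "medium")]
print(specific_to_general(positives))   # ['?', 'round', 'medium']
```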




Inductive Bias                                   22



  Set of assumptions that together with the training data
  deductively justify the classification assigned by the learner
  to future instances




Inductive Bias                                   23



  Set of assumptions that together with the training data
  deductively justify the classification assigned by the learner
  to future instances

  There can be a number of hypotheses consistent with
  training data

  Each learning algorithm has an inductive bias that imposes a
  preference on the space of all possible hypotheses




Types of Inductive Bias                          24



  Syntactic Bias, depends on the language used to represent
  hypotheses

  Semantic Bias, depends on the heuristics used to filter
  hypotheses

  Preference Bias, depends on the ability to rank and compare
  hypotheses

  Restriction Bias, depends on the ability to restrict the search
  space




Why look for h?                              25



  Inductive Learning Hypothesis: any hypothesis (h) found to
  approximate the target function (c) over a sufficiently large
  set of training examples will also approximate the target
  function (c) well over other unobserved examples.




Training and testing                          26



  Training: the hypothesis h is developed to explain the
  examples in Etrain
  Testing: the hypothesis h is evaluated (verified) with respect
  to the previously unseen examples in Etest




Generalization and Overfitting                    27



  The hypothesis h is developed based on a set of training
  examples Etrain
  The underlying hypothesis is that if h explains Etrain then it
  can also be used to explain other examples in Etest not
  previously used to develop h




Generalization and Overfitting                       28



  When h explains “well” both Etrain and Etest we say that h is general
  and that the method used to develop h has adequately generalized
  When h explains Etrain but not Etest we say that the method used to
  develop h has overfitted
  We have overfitting when the hypothesis h explains Etrain too
  accurately so that h is not general enough to be applied outside
  Etrain
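
The train/test distinction can be made concrete with any off-the-shelf learner; in the hedged sketch below (dataset and learner are illustrative choices, not from the lecture) the gap between the two scores is the symptom of limited generalization or overfitting.

```python
# Hedged sketch: comparing performance on Etrain and Etest
# (iris dataset and a decision tree are illustrative choices).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

h = DecisionTreeClassifier().fit(X_train, y_train)
print("accuracy on Etrain:", h.score(X_train, y_train))  # typically 1.0: Etrain explained perfectly
print("accuracy on Etest:", h.score(X_test, y_test))     # usually lower: the generalization gap
```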




What are the general issues                 29
for classification in Machine Learning?

  Type of training experience
     Direct or indirect?
     Supervised or not?
  Type of target function and performance
  Type of search algorithm
  Type of representation of the solution
  Type of Inductive bias




Summary                                         30



 Classification is a two-step process involving the building,
 the testing, and the usage of the classification model
 Major issues for Data Mining include:
    The type of input data
    The representation used for the model
    The generalization performance on unseen data
 In Machine Learning, classification is viewed as
 an instance of supervised learning
 The focus is on the search process aimed at finding the
 classifier (the hypothesis) that best explains the data
 Major issues for Machine Learning include:
    The type of input experience
    The search algorithm
    The inductive biases


