Genetic-based Synthetic Data Sets for the Analysis of Classifiers Behavior
8th International Conference on Hybrid Intelligent Systems



Núria Macià
Albert Orriols-Puig
Ester Bernadó-Mansilla
 {nmacia,aorriols,esterb}@salle.url.edu

Grup de Recerca en Sistemes Intel·ligents
    Enginyeria i Arquitectura La Salle
        Universitat Ramon Llull
Motivation

[Diagram: Real-world problem → Data set → Knowledge extraction (Learner) → Model + Prediction]

   Necessity of synthetic data sets
      To evaluate the performance of real learners under
      controlled scenarios
   How to generate synthetic data sets?
      Data complexity (Ho & Basu, 2002)
      Length of the class boundary (Macià et al., 2008)

Objective: a set of benchmark problems to analyze learner behavior
          Overview and Future Research                          Slide 2
Outline
1.   Data complexity
2.   Synthetic data sets
3.   Design of GA
4.   Experiments and results
5.   Conclusions and further work




1. Data complexity
 Length of the class boundary
    Build a minimum spanning tree (MST) connecting all
    the points regardless of class
    Count the number of edges joining
    opposite classes

 Two cases with many points on the boundary:
    Very interleaved or random data
    Linearly separable problem with narrow margins
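
The measure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance, uses Prim's algorithm for the MST, and the names mst_edges and boundary_length are hypothetical.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns n-1 edges (i, j)."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known distance from the tree to each point
    best_from = [-1] * n         # tree vertex that realises that distance
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest point that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u))
        # Relax the distances of the remaining points through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges

def boundary_length(edges, labels):
    """Number of MST edges whose endpoints carry different class labels."""
    return sum(1 for i, j in edges if labels[i] != labels[j])

# Four collinear points: the MST is the path 0-1-2-3.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
edges = mst_edges(points)
print(boundary_length(edges, [0, 0, 1, 1]))  # two compact clusters: 1
print(boundary_length(edges, [0, 1, 0, 1]))  # fully interleaved labels: 3
```

The MST depends only on the point positions, so it can be built once and reused for every candidate labeling.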




2. Synthetic data sets
 Generation procedure
   Set the number of instances n, the number of
   attributes m, and the length of the class boundary b.
   Generate n randomly distributed points and build
   the MST.
   Label the class of each instance.
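
A rough sketch of the first two steps follows, under assumptions the slides do not state: points are drawn uniformly from the unit hypercube, distances are Euclidean, and the MST is built with Kruskal's algorithm. The labeling step is only a random placeholder here, since choosing labels that reach b is the search problem itself. All function names and parameter values are hypothetical.

```python
import itertools
import math
import random

def random_points(n, m, rng):
    """n points drawn uniformly at random from the unit hypercube [0, 1]^m."""
    return [tuple(rng.random() for _ in range(m)) for _ in range(n)]

def kruskal_mst(points):
    """Kruskal's algorithm with union-find over all pairwise Euclidean distances."""
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = sorted(itertools.combinations(range(len(points)), 2),
                   key=lambda e: math.dist(points[e[0]], points[e[1]]))
    tree = []
    for i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:            # the edge joins two components, so it closes no cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree

rng = random.Random(0)           # fixed seed so the run is repeatable
n, m, b = 30, 2, 10              # instances, attributes, demanded boundary length
points = random_points(n, m, rng)
tree = kruskal_mst(points)

# Placeholder labeling: uniformly random classes, which rarely hit b exactly.
labels = [rng.randrange(2) for _ in range(n)]
achieved = sum(1 for i, j in tree if labels[i] != labels[j])
print(f"MST edges: {len(tree)}, demanded b = {b}, achieved b = {achieved}")
```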




2. Synthetic data sets
 Exhaustive search
   The number of labelings grows exponentially with the
   number of instances
 Heuristic search
   Demanded length of the class boundary is not
   always achieved
   No diverse solutions



                   Genetic algorithm
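
To make the cost of exhaustive search concrete, the sketch below enumerates every labeling of a tiny problem. It assumes binary classes and uses a simple path as a hypothetical stand-in for the MST; the sizes in the final loop are illustrative only.

```python
from itertools import product
from math import comb

def boundary_length(edges, labels):
    """Number of tree edges whose endpoints carry different class labels."""
    return sum(1 for i, j in edges if labels[i] != labels[j])

n, k, b = 6, 2, 2
edges = [(i, i + 1) for i in range(n - 1)]   # hypothetical stand-in for an MST (a path)

# Exhaustive search: test all k**n labelings against the demanded b.
hits = [lab for lab in product(range(k), repeat=n) if boundary_length(edges, lab) == b]
print(len(hits), "of", k ** n, "labelings reach b =", b)   # 20 of 64

# For two classes on a tree, a labeling is fixed by one root label plus the set of
# edges where the label flips, so the count is 2 * C(n - 1, b).
print(2 * comb(n - 1, b))                                  # 20

for size in (10, 20, 50, 100):
    print(size, "instances ->", k ** size, "binary labelings")
```

Enumeration is feasible at n = 6 but not at realistic data set sizes, which is the motivation for a guided search.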


3. Design of GA
 Knowledge representation
     k-ary string in which position i stores the class label
     of the i-th instance


 Data set i                               Individual i
 Att. 1   Att. 2   …   Att. N   Class
 0.4      0.5          0.4      0
 0.2      1.0          0.2      1
 0.5      0.3          0.4      1         011011
 0.6      0.5          0.4      0
 0.7      0.1          1.0      1
 0.5      0.3          0.9      1
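
The encoding in the table can be illustrated as follows. The helper names encode and decode are hypothetical, and writing each label as one character is an assumption that only holds for k up to 10 classes.

```python
def encode(class_column):
    """Concatenate the class labels, one symbol per instance, into an individual."""
    return "".join(str(c) for c in class_column)

def decode(individual):
    """Recover the list of class labels from an individual."""
    return [int(ch) for ch in individual]

# Rows of the example data set: (Att. 1, Att. 2, Att. N, Class).
data_set = [
    (0.4, 0.5, 0.4, 0),
    (0.2, 1.0, 0.2, 1),
    (0.5, 0.3, 0.4, 1),
    (0.6, 0.5, 0.4, 0),
    (0.7, 0.1, 1.0, 1),
    (0.5, 0.3, 0.9, 1),
]

individual = encode(row[-1] for row in data_set)
print(individual)           # 011011
print(decode(individual))   # [0, 1, 1, 0, 1, 1]
```

Only the class column is evolved; the attribute values stay fixed, so the MST never has to be rebuilt.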




3. Design of GA
 Genetic operators
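   s-wise tournament selection
   Two-point crossover
   Bit-wise mutation
 Fitness function
   $\mathrm{fitness}_i = b_{\mathrm{obj}} - b_i$, where $b_{\mathrm{obj}}$ is the demanded
   boundary length and $b_i$ is the boundary length of individual $i$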
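
One way these operators could fit together is sketched below. It is not the authors' implementation. The assumptions are: fitness is minimised as the absolute difference |b_obj - b_i|, replacement is generational, a path serves as a hypothetical stand-in for the MST, and every parameter value (pop_size, s, p_mut, generations) is invented for illustration.

```python
import random

def boundary_length(edges, labels):
    """Number of tree edges whose endpoints carry different class labels."""
    return sum(1 for i, j in edges if labels[i] != labels[j])

def fitness(ind, edges, b_obj):
    # Assumption: distance to the target, minimised (0 means b_obj is hit exactly).
    return abs(b_obj - boundary_length(edges, ind))

def tournament(pop, fits, s, rng):
    """s-wise tournament: draw s distinct individuals, return the fittest."""
    contenders = rng.sample(range(len(pop)), s)
    return pop[min(contenders, key=lambda i: fits[i])]

def two_point_crossover(a, b, rng):
    """Swap the segment between two random cut points."""
    lo, hi = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:lo] + b[lo:hi] + a[hi:], b[:lo] + a[lo:hi] + b[hi:]

def mutate(ind, k, p_mut, rng):
    """With probability p_mut per position, switch to a different label (needs k >= 2)."""
    return [(g + rng.randrange(1, k)) % k if rng.random() < p_mut else g for g in ind]

def run_ga(edges, n, k, b_obj, pop_size=40, s=3, p_mut=0.02, generations=200, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    best, best_fit = None, float("inf")
    for gen in range(generations + 1):
        fits = [fitness(ind, edges, b_obj) for ind in pop]
        i = min(range(pop_size), key=fits.__getitem__)
        if fits[i] < best_fit:
            best, best_fit = pop[i], fits[i]
        if best_fit == 0 or gen == generations:
            break
        children = []
        while len(children) < pop_size:
            c1, c2 = two_point_crossover(tournament(pop, fits, s, rng),
                                         tournament(pop, fits, s, rng), rng)
            children += [mutate(c1, k, p_mut, rng), mutate(c2, k, p_mut, rng)]
        pop = children[:pop_size]
    return best, best_fit

n, k, b_obj = 20, 2, 7
edges = [(i, i + 1) for i in range(n - 1)]   # hypothetical stand-in for an MST
best, best_fit = run_ga(edges, n, k, b_obj)
print("".join(map(str, best)), "distance to demanded b:", best_fit)
```

Running the sketch with different seeds gives different labelings, which is one simple way to look at the diversity of the solutions found.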




4. Experiments and results (I)
  Synthetic data set generation
    Number of different solutions < number of solutions
       The population converges to the same solution
       {0100,1011} are equivalent individuals
    Intermediate complexities are obtained in early
    generations
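
The equivalence of 0100 and 1011 presumably comes from swapping the class names, which leaves the partition of the instances and the boundary length unchanged. Under that assumption, distinct solutions can be counted by mapping each individual to a canonical form; the function canonical and the sample population are hypothetical.

```python
def canonical(individual):
    """Rename classes by order of first appearance, so label permutations coincide."""
    mapping = {}
    return "".join(mapping.setdefault(ch, str(len(mapping))) for ch in individual)

population = ["0100", "1011", "0110", "1001", "0100"]
distinct = {canonical(ind) for ind in population}

print(canonical("0100"), canonical("1011"))   # 0100 0100
print(len(population), "individuals,", len(distinct), "distinct solutions")
```

Comparing canonical forms rather than raw strings avoids counting a labeling and its class-swapped twin as two different data sets.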




4. Experiments and results (II)
  Analysis of classifiers behavior
    Three different paradigms: C4.5, Naïve Bayes, and
    SMO
    Similar accuracy rates with noticeable variability




5. Conclusions
 The GA allows us to generate data sets with
 the demanded length of the class boundary




6. Further work
 Efficiency and scalability
   Move from a simple GA to a competent GA
 Capacity to satisfy multiple criteria
   Multi-objective strategy
 Achieve the structure of real-world problems
   Provide a set of benchmark problems





HIS'2008: Genetic-based Synthetic Data Sets for the Analysis of Classifiers Behavior
