The NCSA/IlliGAL Gathering on LCS/GBML




SCI2S Research Group                                    University of Granada, Spain
  http://sci2s.ugr.es



     Scalability in GBML, Accuracy-Based Michigan
               Fuzzy LCS, and New Trends

                                Jorge Casillas
                Dept. Computer Science and Artificial Intelligence
                         University of Granada, SPAIN
                         http://decsai.ugr.es/~casillas

                            Urbana, IL, May 16, 2006
Outline



1. Our Research Group (SCI2S, University of Granada, Spain)

2. Genetic Learning and Scaling Up

3. Fuzzy-XCS: An Accuracy-Based Michigan-Style Genetic
   Fuzzy System

4. Advances Toward New Methods and Problems
SCI2S Research Group

    Research Group: “Soft Computing and Intelligent Information Systems”
    Website (members, research lines, publications, projects, software, etc.):
                               http://sci2s.ugr.es
    Main research lines:
       KDD and Data Mining with Evolutionary Algorithms
       Fuzzy Rule-Based Systems and Genetic Fuzzy Systems
       Genetic Algorithms and other Evolutionary Algorithms
       Bioinformatics (http://www.m4m.es)
       Intelligent Information Retrieval and Web-access Systems
       Image Registration by Metaheuristics
       Decision Making with Linguistic / Fuzzy Preferences Relations

2      Jorge Casillas        The NCSA/IlliGAL Gathering on LCS/GBML   Urbana, IL, May 16, 2006
Outline



1. Our Research Group (SCI2S, University of Granada, Spain)

2. Genetic Learning and Scaling Up

3. Fuzzy-XCS: An Accuracy-Based Michigan-Style Genetic
   Fuzzy System

4. Advances Toward New Methods and Problems
2.1. Motivation and Objectives

    How do learning algorithms behave when the data set size increases?
       Extracting rule-based models from large data sets with classical
       algorithms produces acceptable predictive models, but their
       interpretability is reduced


    C4.5 on the KDD Cup’99 data set (494,022 instances, 41 attributes, 23
    classes) produces models with ~99.95% accuracy but with at least 102
    rules and 10.52 antecedents per rule on average

                                               Accuracy         Size      Ant
                          C4.5 Min               99.96           252      13.34
                          C4.5                   99.95           143      11.78
                          C4.5 Max               99.95           102      10.52

    C4.5 is a fast learning algorithm
    Large data set → good model but low interpretability
2.1. Motivation and Objectives

    Evolutionary Instance Selection reduces the size of rule-based
    models (for example, decision trees [CHL03])
      [CHL03] J.R. Cano, F. Herrera, M. Lozano. Using Evolutionary
      Algorithms as Instance Selection for Data Reduction in KDD: An
      Experimental Study. IEEE Transactions on Evolutionary Computation
      7:6 (2003) 561-575


    The combination of Stratification and Evolutionary Instance
    Selection addresses the scaling problem that appears in the
    evaluation of large data sets
      J.R. Cano, F. Herrera, M. Lozano, Stratification for Scaling Up
      Evolutionary Prototype Selection. Pattern Recognition Letters 26
      (2005) 953-963



2.1. Motivation and Objectives

    • Proposal: combine Evolutionary Training Set Selection with a
      Stratified Strategy to extract rule sets with an adequate balance
      between interpretability and precision in large data sets

    • Objective: Balance between Prediction and Interpretability

    • Quality measures of the rule sets:

         • Test Accuracy                                                 Prediction
         • Number of Rules
                                                                         Interpretability
         • Number of Antecedents




2.2. Proposal

                       Training Set Selection Process

    Data Set (D) → Training Set (TR) + Test Set (TS)

    TR → Instance Selection Algorithm → Training Set Selected (TSS)

    TSS → Data Mining Algorithm (C4.5) → Decision Tree
2.2. Proposal
           Stratified Instance Selection for Training Set Selection

    The Data Set (D) is split into t strata D1, D2, …, Dt. A Prototype
    Selection Algorithm (PSA) is applied to each stratum, yielding the
    selected subsets DS1, DS2, …, DSt. Their union forms the Stratified
    Training Set Selection (STSSi), which, together with the Test Set
    (TSi), is fed to the Data Mining Algorithm (C4.5) to build a
    Decision Tree
2.2. Proposal

    Evolutionary Prototype Selection:
    • Representation: a binary chromosome over the TR instances

                           TR Set      1  0  1  0  0  1  0  0
                                       1  2  3  4  5  6  7  8

                           TSS Set (selected instances): 1, 3, 6

    • Fitness Function

      Fitness(TSS) = α · clas_per(TSS) + (1 − α) · perc_red(TSS)

                        perc_red(TSS) = 100 · (|TR| − |TSS|) / |TR|
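The fitness above can be sketched directly in Python (a minimal sketch; `clas_per` would in practice be the classification rate obtained by a classifier trained on the selected subset, which is outside this fragment):

```python
# Fitness of a candidate training-set selection, following the slide's
# formula. alpha = 0.5 is the value used in the experiments.

def perc_red(tr_size, tss_size):
    """Percentage of reduction: 100 * (|TR| - |TSS|) / |TR|."""
    return 100.0 * (tr_size - tss_size) / tr_size

def fitness(clas_per, tr_size, tss_size, alpha=0.5):
    """Weighted balance between classification rate and reduction."""
    return alpha * clas_per + (1.0 - alpha) * perc_red(tr_size, tss_size)
```

With alpha = 0.5 a selection is rewarded equally for predicting well and for discarding instances.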
2.3. Experiments

     Experimental Methodology:
               • Data Set:
                                        Instances       Attributes          Classes
                          KDD Cup’99      494,022               41               23

               • Prototype Selection Algorithms and Parameters:

                          Algorithm    Parameters
                          Cnn          —
                          Ib2          —
                          Ib3          Acceptance=0.9, Drop=0.7
                          EA-CHC       Population=50, Evaluations=10000, α=0.5
                          C4.5         Minimal Prune, Default Prune, Maximal Prune

               • Number of strata: t=100 (4,940 instances per stratum)

2.3. Experiments


                      % Red    Input data set    Accuracy    Size      Ant
                               size for C4.5
     C4.5 Min             —           444,620       99.96     252    13.34
     C4.5                 —           444,620       99.95     143    11.78
     C4.5 Max             —           444,620       99.95     102    10.52
     Cnn st 100       81.61            90,850       96.43      83    11.49
     Ib2 st 100       82.01            88,874       95.05      58    10.86
     Ib3 st 100       78.82           104,634       96.77      74    11.48
     EA-CHC st 100    99.28             3,557       98.41       9     3.56




2.4. Conclusions


 • Precision: Evolutionary Stratified Instance Selection achieves the
   best accuracy rates among the instance selection algorithms
 • Interpretability: Evolutionary Stratified Instance Selection
   produces the smallest rule sets, with the fewest rules and
   antecedents
 • Precision–Interpretability: Evolutionary Stratified Instance
   Selection offers the best balance between precision and
   interpretability

     J.R. Cano, F. Herrera, M. Lozano, Evolutionary Stratified Training Set
       Selection for Extracting Classification Rules with Trade-off Precision-
       Interpretability. Data and Knowledge Engineering, in press (2006)



2.4. Conclusions

      How to apply genetic learning to large data sets?

     Problems:
       Chromosome size
       Data set evaluation


     Solutions:
       Stratification?
       Partial Evaluation?
       Parallel Learning?
       …



Outline



1. Our Research Group (SCI2S, University of Granada, Spain)

2. Genetic Learning and Scaling Up

3. Fuzzy-XCS: An Accuracy-Based Michigan-Style Genetic
   Fuzzy System

4. Advances Toward New Methods and Problems
3.1. Objectives

     Fuzzy classifier system (Michigan style): an LCS composed of fuzzy
     rules

     There are few proposals of fuzzy LCSs. An almost exhaustive list is
     the following:
     1. [M. Valenzuela-Rendón, 1991]
     2. [M. Valenzuela-Rendón, 1998]
     3. [A. Parodi, P. Bonelli, 1993]
     4. [T. Furuhashi, K. Nakaoka, Y. Uchikawa, 1994]
     5. [K. Nakaoka, T. Furuhashi, Y. Uchikawa, 1994]
     6. [J.R. Velasco, 1998]
     7. [H. Ishibuchi, T. Nakashima, T. Murata, 1999]
     8. [A. Bonarini, 1996]
     9. [A. Bonarini, V. Trianni, 2001]
     10. [D. Gu, H. Hu, 2004]
     11. [M.C. Su, et al., 2005]
     12. [C.-F. Juang, 2005]

     Except [10], all of them are strength-based. In [10], the output is
     discrete and generality is not considered

3.1. Objectives

     Objective: an accuracy-based fuzzy classifier system
     This kind of system would have some important advantages:
        The use of fuzzy rules allows us to describe state–action
        relations in a very legible way and to deal with uncertainty
        The proposal returns a continuous output by using linguistic
        variables also in the consequent
        It tries to obtain optimal generalization to improve the
        compactness of the knowledge representation, which involves
        avoiding overgeneral rules
        It tries to obtain a complete covering map

     J. Casillas, B. Carse, L. Bull, Reinforcement learning by an accuracy-based
     fuzzy classifier system with real-valued output. Proc. I International
     Workshop on Genetic Fuzzy Systems, 2005, 44-50

3.2. Difficulties


     Developing an accuracy-based fuzzy classifier system presents the
     following difficulties:
        Since several rules fire in parallel, credit assignment is much
        more difficult

        The payoff a fuzzy rule receives depends on the input vector; an
        active fuzzy rule will receive different payoffs for different
        inputs

        Measuring the accuracy of a rule's predicted payoff is difficult,
        since a fuzzy rule will fire with many different other fuzzy rules
        at different time steps, giving very different payoffs




3.3. Competitive Fuzzy Inference

     These problems are partly due to the interpolative reasoning
     performed in fuzzy systems, where the final output results from
     aggregating the individual contributions of a set of rules

     However, an LCS does not treat the interaction among rules as a
     cooperative action; rather, each rule competes with the rest to be
     the best for a specific input vector

     To perform a “competitive” inference we only need to change the
     roles of the fuzzy operators
        ALSO operator → intersection (t-norm): minimum
        THEN operator → logical causality (S-implications)

            Kleene-Dienes: μ_R(x, y) = max{1 − μ_A(x), μ_B(y)}

            Łukasiewicz: μ_R(x, y) = min{1, 1 − μ_A(x) + μ_B(y)}
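The two swapped operators can be sketched on single membership degrees (a minimal sketch; the function names are illustrative):

```python
# Competitive inference swaps the roles of the fuzzy operators: THEN
# becomes an S-implication and ALSO an intersection (t-norm: minimum).

def kleene_dienes(mu_a, mu_b):
    """S-implication: max{1 - mu_A(x), mu_B(y)}."""
    return max(1.0 - mu_a, mu_b)

def lukasiewicz(mu_a, mu_b):
    """S-implication: min{1, 1 - mu_A(x) + mu_B(y)}."""
    return min(1.0, 1.0 - mu_a + mu_b)

def also(*degrees):
    """ALSO as intersection: minimum over the rules' contributions."""
    return min(degrees)
```

Note that with an S-implication a weakly matched rule (small mu_a) yields a large, uninformative output set, so strongly matched rules dominate the minimum.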

3.4. Fuzzy-XCS
     (Architecture diagram.) The detectors encode the environment state
     (e.g., inputs 3.15 and 1.8 over linguistic terms S/M/L). Each
     classifier in the population stores a prediction p, prediction error
     ε, fitness F, and experience exp. Matching forms the Match Set [MS];
     consistent, non-redundant Candidate Subsets [CS] are built from it,
     and an exploration/exploitation scheme (maximum weighted mean
     prediction on exploitation) selects the Action Set [AS]. Competitive
     fuzzy inference over [AS] produces the real-valued output (e.g., 8.6)
     through the effectors. The environment reward, with discount and
     delay, is distributed as credit to update ε, p, F, and exp in the
     Previous Action Set, and the EA (selection + crossover + mutation)
     is applied when triggered
3.4. Fuzzy-XCS
                             3.4.1. Generalization representation

     The disjunctive normal form (DNF) is considered:

         IF X1 is Ã1 and … and Xn is Ãn THEN Y1 is B1 and … and Ym is Bm

         Ãi = {Ai1 ∨ … ∨ Aili},  with a ∨ b = min{1, a + b}

     Binary coding for the antecedent (allele ‘1’ means that the
     corresponding label is used) and integer coding in the consequent
     (each gene contains the index of the label used in the corresponding
     output variable)

                 IF X1 is P and X2 is {M ∨ G} THEN Y1 is M and Y2 is G

                                 [100 | 011 | 23]

3.4. Fuzzy-XCS
                         3.4.2. Performance component

 1. Match set composition [M]: a matching threshold is used to reduce the
    number of fired rules
 2. Computation of candidate subsets [CS]:
      Equivalent to the prediction array computation in XCS. XCS partitions
      [M] into a number of mutually exclusive sets according to the actions
      In Fuzzy-XCS, several “linguistic actions” (consequents) could/should
      be considered together
      In our case, different groups of consistent and non-redundant fuzzy
      rules with the maximum number of rules in each group are formed
       We perform an exploration/exploitation scheme with probability 0.5.
       On each exploitation step, only sufficiently experienced fuzzy
       classifiers are considered; on each exploration step, the whole
       match set is considered
 3. Action set selection [AS]: It chooses the consistent and non-redundant
    classifier subset with the highest mean prediction
3.4. Fuzzy-XCS
                          3.4.3. Reinforcement component


     The prediction error (εj), prediction (pj), and fitness (Fj) values
     of each fuzzy classifier Cj are adjusted with the standard
     reinforcement learning techniques used in XCS: Widrow-Hoff and MAM
     (modified adaptive method)

     However, an important difference is considered in Fuzzy-XCS: credit
     must be distributed among the classifiers proportionally to the
     degree of contribution of each classifier to the obtained output

     Therefore, a weight is first computed for each classifier (according
     to the proposed fuzzy inference) and then the parameters are
     adjusted

     Fuzzy-XCS acts on the action set

3.4. Fuzzy-XCS
                    3.4.3. Reinforcement component (Credit Distribution)

     Let B′j be the scaled output fuzzy set generated by the fuzzy rule
     Rj:

         B′j = I(μ_{A_Rj}(x), Bj)

     Let R1 be the winner rule, i.e., the one with the highest matching
     degree μ_{A_Rj}(x)

     The process analyzes the area that the rival fuzzy rules bite into
     the area generated by the winner rule R1:

         w1 = ∫ min_{j=1..|AS|} μ_{B′j}(y) dy  /  ∫ μ_{B′1}(y) dy

     The remaining weight is distributed among the rest of the rules
     according to the area that each of them removes from the winner
     rule R1:

         wj = (1 − w1) · [∫ μ_{B′1}(y) dy − ∫ μ_{B′1}(y) ∧ μ_{B′j}(y) dy]
              / Σ_{i=2..|AS|} [∫ μ_{B′1}(y) dy − ∫ μ_{B′1}(y) ∧ μ_{B′i}(y) dy]
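On a discretized output domain the integrals reduce to sums, and the two weight formulas can be sketched as follows (the function name and the grid representation of the scaled sets are assumptions):

```python
# Credit-distribution weights. `scaled_sets` holds the scaled output
# fuzzy sets B'_j sampled on a common grid, winner rule first.

def credit_weights(scaled_sets):
    winner = scaled_sets[0]
    area_winner = sum(winner)
    # w1: area of the intersection of all B'_j relative to the winner's
    inter = [min(vals) for vals in zip(*scaled_sets)]
    w = [sum(inter) / area_winner]
    # Remaining weight shared by the area each rival removes from the
    # winner (winner area minus winner/rival overlap)
    overlaps = [sum(min(a, b) for a, b in zip(winner, rival))
                for rival in scaled_sets[1:]]
    bites = [area_winner - o for o in overlaps]
    total = sum(bites)
    for bite in bites:
        w.append((1.0 - w[0]) * bite / total if total > 0 else 0.0)
    return w
```

By construction the weights sum to 1 whenever at least one rival bites into the winner's area.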
3.4. Fuzzy-XCS
                    3.4.3. Reinforcement component (Adjustment)

     First the P (payoff) value is computed:

         P = r + γ · max_{i∈CS} Σ_j w_j^i · p_j^i

     Then, the following adjustment is performed for each fuzzy
     classifier belonging to the action set, using Widrow-Hoff:
     1. Adjust prediction error values (with MAM):

         εj ← εj + β · wj · |AS| · (|P − pj| − εj)

     2. Adjust prediction values (with MAM):

         pj ← pj + β · wj · |AS| · (P − pj)

     3. Adjust fitness values:

         Fj ← Fj + β · (k′j − Fj),   k′j = kj / Σ_{Ri∈AS} ki,
         kj = (εj/ε0)^(−ν) if εj > ε0, and 1 otherwise


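A minimal sketch of the three update steps (the MAM phase, which averages while a classifier's experience is below 1/β, is omitted; the dictionary layout is illustrative):

```python
# Widrow-Hoff updates weighted by the classifier's credit w and the
# action-set size |AS|, following the three steps above (MAM omitted).

def update_classifier(cl, P, w, as_size, beta=0.2, eps0=0.01, nu=5.0):
    """cl: dict with 'p' (prediction) and 'eps' (error). Returns the
    accuracy k used afterwards in the relative fitness update."""
    cl['eps'] += beta * w * as_size * (abs(P - cl['p']) - cl['eps'])
    cl['p']   += beta * w * as_size * (P - cl['p'])
    return (cl['eps'] / eps0) ** (-nu) if cl['eps'] > eps0 else 1.0

def update_fitness(classifiers, ks, beta=0.2):
    """Fj <- Fj + beta * (k'j - Fj), with k'j the relative accuracy."""
    total = sum(ks)
    for cl, k in zip(classifiers, ks):
        cl['F'] += beta * (k / total - cl['F'])
```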
3.4. Fuzzy-XCS
                               3.4.4. Discovery component


     Standard two-point crossover operator:
        It only acts on the antecedent
        Prediction, prediction error, and fitness of the offspring are
        initialized to the mean values of the parents

                     [10 001 1 | 23]                 [10 101 1 | 23]

                     [10 101 0 | 12]                 [10 001 0 | 12]


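The operator above can be sketched as follows (the tuple layout of a classifier is an illustrative assumption):

```python
# Two-point crossover acting only on the antecedent bits; offspring
# prediction, error, and fitness are set to the parents' mean values.
import random

def crossover(parent1, parent2):
    """Each parent: (antecedent_bits, consequent, p, eps, F)."""
    bits1, cons1, *stats1 = parent1
    bits2, cons2, *stats2 = parent2
    i, j = sorted(random.sample(range(len(bits1) + 1), 2))
    child_bits1 = bits1[:i] + bits2[i:j] + bits1[j:]
    child_bits2 = bits2[:i] + bits1[i:j] + bits2[j:]
    mean = tuple((a + b) / 2 for a, b in zip(stats1, stats2))
    return (child_bits1, cons1, *mean), (child_bits2, cons2, *mean)
```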
3.4. Fuzzy-XCS
                               3.4.4. Discovery component


     Mutation:
        If the gene to be mutated corresponds to an input variable:
            Expansion
                            [100 | 011 | 23] → [101 | 011 | 23]

            Contraction
                            [100 | 011 | 23] → [100 | 010 | 23]

            Shift            [100 | 011 | 23] → [010 | 011 | 23]

        If the gene to be mutated corresponds to an output variable:
            The index of the label is increased or decreased by 1

                            [100 | 011 | 23] → [100 | 011 | 22]
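The four mutations can be sketched on one variable's binary substring (a hedged sketch; the helper names and the "keep at least one label" guard are assumptions, not from the talk):

```python
# Fuzzy-XCS mutation operators on a DNF antecedent substring plus the
# +/-1 consequent mutation.
import random

def expand(bits):
    """Turn one '0' allele into '1' (the rule becomes more general)."""
    zeros = [i for i, b in enumerate(bits) if b == '0']
    if not zeros:
        return bits
    i = random.choice(zeros)
    return bits[:i] + '1' + bits[i+1:]

def contract(bits):
    """Turn one '1' allele into '0' (the rule becomes more specific)."""
    ones = [i for i, b in enumerate(bits) if b == '1']
    if len(ones) <= 1:                      # keep at least one label
        return bits
    i = random.choice(ones)
    return bits[:i] + '0' + bits[i+1:]

def shift(bits):
    """Move a selected label to an adjacent free position (100 -> 010)."""
    moves = [(i, j) for i, b in enumerate(bits) if b == '1'
             for j in (i - 1, i + 1)
             if 0 <= j < len(bits) and bits[j] == '0']
    if not moves:
        return bits
    i, j = random.choice(moves)
    chars = list(bits)
    chars[i], chars[j] = '0', '1'
    return ''.join(chars)

def mutate_consequent(index, n_labels):
    """Increase or decrease the output label index by 1, kept in range."""
    return min(max(index + random.choice((-1, 1)), 0), n_labels - 1)
```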
3.5. Experimental Results
                                        3.5.1. Laboratory Problem: Specification

     First experiment in a laboratory problem:
        2 inputs and 1 output
        5 linguistic terms for each variable (triangular-shape fuzzy sets)
        5 fuzzy rules of different generality degree
        576 examples uniformly distributed in the input space (24 x 24)
        The output value for each input is the result of the inference with the fixed
        fuzzy system
     (Table: definition of the five DNF fuzzy rules R1–R5, marking the
     linguistic terms {VS, S, M, L, VL} used for X1, X2, and Y, together
     with the coverage map of the X1 × X2 input space.)

3.5. Experimental Results
                                       3.5.1. Laboratory Problem: Specification

     The problem is actually a function approximation (with real inputs
     and outputs) for which we know the optimal solution (regarding the
     state/action map and generality)

     (Figure: surface of the target output Y over the X1 × X2 input
     space, with all variables in [0, 1].)
3.5. Experimental Results
                                                         3.5.1. Laboratory Problem: Results

     Competitive distribution and inference
        Implication: Łukasiewicz
        Aggregation: minimum

     (Figure: relative numerosity of the five optimal classifiers
     11000|11000||1, 00111|11000||2, 10000|00111||3, 01000|00111||4,
     00111|00111||5, and the average error, over 30,000 exploit trials.)
3.5. Experimental Results
                                                     3.5.1. Laboratory Problem: Results

     Cooperative distribution and inference
        Implication: minimum
        Aggregation: maximum

     (Figure: relative numerosity of the same five classifiers and the
     average error, over 30,000 exploit trials.)
3.5. Experimental Results
                         3.5.1. Laboratory Problem: Results




                                 Fuzzy-XCS             Fuzzy-XCS     Pittsburgh
                               (competitive)         (cooperative)       GFS
                         R1          0.9                    0.0         0.1
                         R2          1.0                    0.1         0.2
                         R3          1.0                    0.0         0.0
                         R4          1.0                    0.0         0.2
                         R5          1.0                    0.0         0.1
            suboptimal rules         0.8                    1.1         7.6
        non-suboptimal rules         1.1                    0.0         2.0

                        MSE      0.000302                  ―         0.001892
           analyzed examples      60,000                 60,000      4,212,864




3.5. Experimental Results
                         3.5.2. Real-World Mobile Robot Problem: Specification



     Second experiment, in a real-world problem
     It consists of on-line learning of the wall-following behavior for
     mobile robots (Nomad 200 model)

     Input variables (4): relative right-hand distance, right-left hand
     distance coefficient, orientation, and linear velocity
     Output variables (2): linear velocity and angular velocity
     The variables are computed using only the robot's sensors, which
     makes the setting more realistic

     Reward:

         r(RD, LV, θwall) = 1 − ( α1 · |RD − 1| / 3 + α2 · |LV − 1|
                                  + α3 · |θwall| / 45 )


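The reward can be sketched as follows (a hedged reconstruction of the slide's formula; the absolute-value deviations and the equal default weights are assumptions):

```python
# Wall-following reward: 1 for the ideal state (distance 1, velocity 1,
# parallel to the wall), decreasing with the weighted deviations.

def reward(rd, lv, theta_wall, a1=1/3, a2=1/3, a3=1/3):
    """rd: relative right-hand distance (ideal 1); lv: linear velocity
    (ideal 1); theta_wall: orientation to the wall in degrees (ideal 0)."""
    return 1.0 - (a1 * abs(rd - 1.0) / 3.0
                  + a2 * abs(lv - 1.0)
                  + a3 * abs(theta_wall) / 45.0)
```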
3.5. Experimental Results
                      3.5.2. Real-World Mobile Robot Problem: Results




3.5. Experimental Results
                      3.5.2. Real-World Mobile Robot Problem: Results




3.6. Conclusions


     A fuzzy classifier system for real-valued output that tries to
     generate a complete covering map with optimal generalization has
     been proposed

     It is the first algorithm with such characteristics (at least as
     far as we know)

     Current work involves investigating the behavior of the proposal in
     well-known continuous (state and action) reinforcement learning
     problems



Outline



1. Our Research Group (SCI2S, University of Granada, Spain)

2. Genetic Learning and Scaling Up

3. Fuzzy-XCS: An Accuracy-Based Michigan-Style Genetic
   Fuzzy System

4. Advances Toward New Methods and Problems
Some Ideas on Advances in LCS/GBML

                Fishing Ideas from Machine Learning…

     Subgroup Discovery
       It is a form of supervised inductive learning which is defined as
       follows: given a population of individuals and a specific property of
       individuals we are interested in, find population subgroups that are
       statistically “most interesting”, e.g., are as large as possible and have
       the most unusual distributional characteristics with respect to the
       property of interest


     References: http://sci2s.ugr.es/keel/specific.php?area=41



Some Ideas on Advances in LCS/GBML

                Fishing Ideas from Machine Learning…

     Learning from Unbalanced Data Sets
       In the classification field, we often encounter classes with very
       different proportions of patterns: some classes with a high
       percentage of patterns and others with a low one. These problems
       are known as “classification problems with unbalanced classes”,
       and they have recently been receiving a great deal of attention
       in machine learning


     References: http://sci2s.ugr.es/keel/specific.php?area=43
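A common data-level remedy for the situation described above is to resample the training set before learning. A minimal sketch of random oversampling of the minority classes, under the assumption of a plain list-based data set; names are illustrative:

```python
# Hypothetical sketch: duplicate randomly chosen minority-class patterns
# until every class reaches the size of the largest one.
import random

def oversample_minority(samples, labels, seed=0):
    """Return a rebalanced copy of (samples, labels)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

# Class "a" has 3 patterns, class "b" only 1:
X, y = oversample_minority([[0], [1], [2], [3]], ["a", "a", "a", "b"])
print(sorted(y))  # ['a', 'a', 'a', 'b', 'b', 'b']
```

Oversampling is only one option; undersampling the majority class or cost-sensitive fitness functions are common alternatives in genetic learning.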



Some Ideas on Advances in LCS/GBML

     Taking Advantages from Evolutionary Algorithms…
     GAs are very flexible at dealing with mixed coding schemes,
     combinatorial and continuous optimization, …

     New Model Representations
       With fuzzy rules: disjunctive normal form, weighted rules, linguistic
       modifiers, hierarchical models, …
                        J. Casillas, O. Cordón, F. Herrera, L. Magdalena (Eds.)
                          Interpretability issues in fuzzy modeling.
                          Springer, 2003. ISBN 3-540-02932-X

                              J. Casillas, O. Cordón, F. Herrera, L. Magdalena (Eds.)
                                Accuracy improvements in linguistic fuzzy modeling.
                                Springer, 2003. ISBN 3-540-02933-8
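As a concrete illustration of one representation listed above, a minimal sketch of a weighted linguistic fuzzy rule; the membership-function parameters, rule, and weight are invented for illustration:

```python
# Hypothetical sketch: a single weighted linguistic rule.  The rule
# weight scales the matching degree before inference/aggregation, which
# is one of the accuracy-improvement mechanisms listed above.

def triangular(a, b, c):
    """Triangular membership function with support [a, c] and peak b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

# IF temperature is Warm THEN power is Low, with rule weight 0.8:
warm = triangular(15.0, 25.0, 35.0)
rule_weight = 0.8

def firing_strength(x):
    return rule_weight * warm(x)

print(firing_strength(25.0))  # 0.8
print(firing_strength(20.0))  # 0.4
```

Disjunctive normal form rules generalize this by letting each antecedent take a set of labels (e.g., "temperature is Warm or Hot"), aggregated with a maximum before the rule weight is applied.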

Some Ideas on Advances in LCS/GBML

     Taking Advantages from Evolutionary Algorithms…

     Multiobjective EAs: one of the most promising issues and one
     of the main distinguishing marks in evolutionary computation

     Evolutionary Multiobjective Learning
       Supervised Learning: to use multiobjective EAs to obtain a set of
       solutions with different degrees of accuracy and interpretability,
       support and confidence, etc.
        Reinforcement Learning: to consider several rewards and integrate
        multiobjective selection/replacement strategies to deal with them

     References: http://sci2s.ugr.es/keel/specific.php?area=44
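The core operation behind such multiobjective selection schemes is Pareto dominance. A minimal sketch that filters candidate rule bases scored on (accuracy, interpretability), both to be maximized; the example scores are invented:

```python
# Hypothetical sketch: extracting the nondominated (Pareto) set, the
# basic building block a multiobjective EA uses to keep solutions with
# different accuracy/interpretability trade-offs alive.

def dominates(p, q):
    """p dominates q: at least as good in all objectives, better in one."""
    return all(a >= b for a, b in zip(p, q)) and \
           any(a > b for a, b in zip(p, q))

def nondominated(points):
    return [p for p in points if not any(dominates(q, p) for q in points)]

# (accuracy, 1/number-of-rules) for four candidate models:
front = nondominated([(0.95, 0.02), (0.90, 0.10), (0.85, 0.05), (0.99, 0.01)])
print(sorted(front))  # [(0.9, 0.1), (0.95, 0.02), (0.99, 0.01)]
```

The dominated model (0.85, 0.05) is discarded, while the three surviving trade-offs would be returned to the user as alternative models.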


Some Ideas on Advances in LCS/GBML

            Facing Up to Open (Awkward) Problems…

     Efficient learning with high dimensional data sets

     Solutions to: noisy data, sparse data, incomplete data, vague
     data, …

     Dealing with realistic robotics/control problems: variables
     captured from actual sensors, real-valued actions, …




KEEL Project

     We are working on a national research project to:
       Research most of the topics previously mentioned, among others
       Develop software for KDD by evolutionary algorithms (preprocessing,
       learning classifier systems, genetic fuzzy systems, evolutionary neural
       networks, statistical tests, experimental setup, …)


     KEEL: Knowledge Extraction by Evolutionary Learning
       5 Spanish research groups
       About 50 researchers


     Website: http://www.keel.es




Contenu connexe

Similaire à Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends

Data Science, Big Data and You
Data Science, Big Data and YouData Science, Big Data and You
Data Science, Big Data and YouJoel Saltz
 
Deep learning for medical imaging
Deep learning for medical imagingDeep learning for medical imaging
Deep learning for medical imaginggeetachauhan
 
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...Salford Systems
 
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...Natalio Krasnogor
 
Integrative Networks Centric Bioinformatics
Integrative Networks Centric BioinformaticsIntegrative Networks Centric Bioinformatics
Integrative Networks Centric BioinformaticsNatalio Krasnogor
 
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Databricks
 
Chemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionChemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionValery Tkachenko
 
[2017-05-29] DNASmartTagger
[2017-05-29] DNASmartTagger [2017-05-29] DNASmartTagger
[2017-05-29] DNASmartTagger Eli Kaminuma
 
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...Enrico Glaab
 
On cascading small decision trees
On cascading small decision treesOn cascading small decision trees
On cascading small decision treesJulià Minguillón
 
Computational approaches to the regulatory genomics of neurogenesis
Computational approaches to the regulatory genomics of neurogenesisComputational approaches to the regulatory genomics of neurogenesis
Computational approaches to the regulatory genomics of neurogenesisIan Simpson
 
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection System
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection SystemThe Use of K-NN and Bees Algorithm for Big Data Intrusion Detection System
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection SystemIOSRjournaljce
 
Decision Tree Based Algorithm for Intrusion Detection
Decision Tree Based Algorithm for Intrusion DetectionDecision Tree Based Algorithm for Intrusion Detection
Decision Tree Based Algorithm for Intrusion DetectionEswar Publications
 
Howard University: Center for Computational Biology and Bioinformatics
Howard University: Center for Computational Biology and BioinformaticsHoward University: Center for Computational Biology and Bioinformatics
Howard University: Center for Computational Biology and Bioinformaticskarl.barnes
 
Bioinformatics Data Pipelines built by CSIRO on AWS
Bioinformatics Data Pipelines built by CSIRO on AWSBioinformatics Data Pipelines built by CSIRO on AWS
Bioinformatics Data Pipelines built by CSIRO on AWSLynn Langit
 
KNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCER
KNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCERKNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCER
KNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCERcscpconf
 

Similaire à Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends (20)

Data Science, Big Data and You
Data Science, Big Data and YouData Science, Big Data and You
Data Science, Big Data and You
 
Deep learning for medical imaging
Deep learning for medical imagingDeep learning for medical imaging
Deep learning for medical imaging
 
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
A Hybrid Method of CART and Artificial Neural Network for Short Term Load For...
 
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
 
Conference slide
Conference slideConference slide
Conference slide
 
Integrative Networks Centric Bioinformatics
Integrative Networks Centric BioinformaticsIntegrative Networks Centric Bioinformatics
Integrative Networks Centric Bioinformatics
 
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics on ...
 
Chemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionChemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collection
 
[2017-05-29] DNASmartTagger
[2017-05-29] DNASmartTagger [2017-05-29] DNASmartTagger
[2017-05-29] DNASmartTagger
 
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
Integrative analysis of transcriptomics and proteomics data with ArrayMining ...
 
Prediction of pKa from chemical structure using free and open source tools
Prediction of pKa from chemical structure using free and open source toolsPrediction of pKa from chemical structure using free and open source tools
Prediction of pKa from chemical structure using free and open source tools
 
On cascading small decision trees
On cascading small decision treesOn cascading small decision trees
On cascading small decision trees
 
Computational approaches to the regulatory genomics of neurogenesis
Computational approaches to the regulatory genomics of neurogenesisComputational approaches to the regulatory genomics of neurogenesis
Computational approaches to the regulatory genomics of neurogenesis
 
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection System
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection SystemThe Use of K-NN and Bees Algorithm for Big Data Intrusion Detection System
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection System
 
Decision Tree Based Algorithm for Intrusion Detection
Decision Tree Based Algorithm for Intrusion DetectionDecision Tree Based Algorithm for Intrusion Detection
Decision Tree Based Algorithm for Intrusion Detection
 
Howard University: Center for Computational Biology and Bioinformatics
Howard University: Center for Computational Biology and BioinformaticsHoward University: Center for Computational Biology and Bioinformatics
Howard University: Center for Computational Biology and Bioinformatics
 
[IJET-V2I3P21] Authors: Amit Kumar Dewangan, Akhilesh Kumar Shrivas, Prem Kumar
[IJET-V2I3P21] Authors: Amit Kumar Dewangan, Akhilesh Kumar Shrivas, Prem Kumar[IJET-V2I3P21] Authors: Amit Kumar Dewangan, Akhilesh Kumar Shrivas, Prem Kumar
[IJET-V2I3P21] Authors: Amit Kumar Dewangan, Akhilesh Kumar Shrivas, Prem Kumar
 
Bioinformatics Data Pipelines built by CSIRO on AWS
Bioinformatics Data Pipelines built by CSIRO on AWSBioinformatics Data Pipelines built by CSIRO on AWS
Bioinformatics Data Pipelines built by CSIRO on AWS
 
Folker Meyer: Metagenomic Data Annotation
Folker Meyer: Metagenomic Data AnnotationFolker Meyer: Metagenomic Data Annotation
Folker Meyer: Metagenomic Data Annotation
 
KNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCER
KNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCERKNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCER
KNOWLEDGE BASED ANALYSIS OF VARIOUS STATISTICAL TOOLS IN DETECTING BREAST CANCER
 

Plus de Xavier Llorà

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewXavier Llorà
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with MeandreXavier Llorà
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0Xavier Llorà
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine LearningXavier Llorà
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...Xavier Llorà
 
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...Xavier Llorà
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...Xavier Llorà
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance ProblemsXavier Llorà
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...Xavier Llorà
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challengesXavier Llorà
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionXavier Llorà
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier SystemsXavier Llorà
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?Xavier Llorà
 
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableLinkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableXavier Llorà
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsXavier Llorà
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring LanguageXavier Llorà
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...Xavier Llorà
 
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Xavier Llorà
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Xavier Llorà
 

Plus de Xavier Llorà (20)

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with Meandre
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine Learning
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
 
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance Problems
 
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...A Retrospective Look at  A Retrospective Look at  Classifier System ResearchCl...
A Retrospective Look at A Retrospective Look at Classifier System ResearchCl...
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challenges
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly Detection
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier Systems
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?
 
NIGEL 2006 welcome
NIGEL 2006 welcomeNIGEL 2006 welcome
NIGEL 2006 welcome
 
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableLinkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring Language
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
 
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
 

Dernier

KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 

Dernier (20)

201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
20150722 - AGV
20150722 - AGV20150722 - AGV
20150722 - AGV
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 

Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends

  • 1. The NCSA/IlliGAL Gathering on LCS/GBML SCI2S Research Group University of Granada, Spain http://sci2s.ugr.es Scalability in GBML, Accuracy-Based Michigan Fuzzy LCS, and New Trends Jorge Casillas Dept. Computer Science and Artificial Intelligence University of Granada, SPAIN http://decsai.ugr.es/~casillas Urbana, IL, May 16, 2006
  • 2. Outline 1. Our Research Group (SCI2S, University of Granada, Spain) 2. Genetic Learning and Scaling Up 3. Fuzzy-XCS: An Accuracy-Based Michigan-Style Genetic Fuzzy System 4. Advances Toward New Methods and Problems
  • 3. SCI2S Research Group Research Group: “Soft Computing and Intelligent Information Systems” Website (members, research lines, publications, projects, software, etc.): http://sci2s.ugr.es Main research lines: KDD and Data Mining with Evolutionary Algorithms Fuzzy Rule-Based Systems and Genetic Fuzzy Systems Genetic Algorithms and other Evolutionary Algorithms Bioinformatics (http://www.m4m.es) Intelligent Information Retrieval and Web-access Systems Image Registration by Metaheuristics Decision Making with Linguistic / Fuzzy Preferences Relations 2 Jorge Casillas The NCSA/IlliGAL Gathering on LCS/GBML Urbana, IL, May 16, 2006
  • 4. Outline 1. Our Research Group (SCI2S, University of Granada, Spain) 2. Genetic Learning and Scaling Up 3. Fuzzy-XCS: An Accuracy-Based Michigan-Style Genetic Fuzzy System 4. Advances Toward New Methods and Problems
  • 5. 2.1. Motivation and Objectives How do learning algorithms behave when the data set size increases? The extraction of rule-based models from large data sets by means of classical algorithms produces acceptable predictive models, but the interpretability is reduced. C4.5 on the KDD Cup’99 data set (494,022 instances, 41 attributes, 23 classes) produces models with ~99.95% accuracy but with at least 102 rules and 10.52 antecedents per rule on average (Accuracy / Size / Ant: C4.5 Min 99.96 / 252 / 13.34; C4.5 99.95 / 143 / 11.78; C4.5 Max 99.95 / 102 / 10.52). C4.5 is a fast learning algorithm. Large data set → good model but low interpretability
  • 6. 2.1. Motivation and Objectives Evolutionary Instance Selection reduces the size of rule-based models (for example, decision trees [CHL03]). [CHL03] J.R. Cano, F. Herrera, M. Lozano. Using Evolutionary Algorithms as Instance Selection for Data Reduction in KDD: An Experimental Study. IEEE Transactions on Evolutionary Computation 7:6 (2003) 561-575. The combination of Stratification and Evolutionary Instance Selection addresses the scaling problem that appears in the evaluation of large data sets: J.R. Cano, F. Herrera, M. Lozano, Stratification for Scaling Up Evolutionary Prototype Selection. Pattern Recognition Letters 26 (2005) 953-963
  • 7. 2.1. Motivation and Objectives • Proposal: combine Evolutionary Training Set Selection with a Stratified Strategy to extract rule sets with an adequate balance between interpretability and precision from large data sets • Objective: balance between Prediction and Interpretability • Quality measures of the rule sets: Test Accuracy (prediction); Number of Rules and Number of Antecedents (interpretability)
  • 8. 2.2. Proposal Training Set Selection Process (diagram): Data Set (D) → Training Set (TR) + Test Set (TS); TR → Instance Selection Algorithm → Training Set Selected (TSS) → Data Mining Algorithm (C4.5) → Decision Tree
  • 9. 2.2. Proposal Stratified Instance Selection for Training Set Selection (diagram): Data Set (D) → strata D1, D2, D3, …, Dt; a Prototype Selection Algorithm (PSA) is applied to each stratum, producing DS1, DS2, DS3, …, DSt; these yield the Test Set (TSi) and the Stratified Training Set Selection (STSSi), which feed the Data Mining Algorithm (C4.5) to produce a Decision Tree
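The stratified pipeline above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `prototype_select` is a hypothetical placeholder standing in for any PSA (Cnn, Ib2, Ib3, or EA-CHC), and the random equal-size split is an assumption.

```python
import random

def stratify(data, t, seed=0):
    """Split the data set D into t disjoint strata of (roughly) equal size."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i::t] for i in range(t)]

def stratified_training_set(data, t, prototype_select):
    """Apply a prototype selection algorithm (PSA) to each stratum and
    join the selected subsets DS1..DSt into the stratified training set."""
    strata = stratify(data, t)
    selected = []
    for stratum in strata:
        selected.extend(prototype_select(stratum))
    return selected
```

Because each PSA run only sees one stratum (4,940 instances for t=100 on KDD Cup’99), the evaluation cost per run stays bounded regardless of the full data set size.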
  • 10. 2.2. Proposal Evolutionary Prototype Selection: • Representation: a binary chromosome over TR; e.g., for TR = {1, 2, 3, 4, 5, 6, 7, 8}, the chromosome 1 0 1 0 0 1 0 0 selects TSS = {1, 3, 6} • Fitness Function: Fitness(TSS) = α · clas_per(TSS) + (1 − α) · perc_red(TSS), where perc_red(TSS) = 100 · (|TR| − |TSS|) / |TR|
  • 11. 2.3. Experiments Experimental Methodology: • Data Set: KDD Cup’99 (494,022 instances, 41 attributes, 23 classes) • Prototype Selection Algorithms and Parameters: Cnn and Ib2 (no parameters); Ib3 (Acceptance=0.9, Drop=0.7); EA-CHC (Population=50, Evaluations=10000, α=0.5); C4.5 (Minimal Prune, Default Prune, Maximal Prune) • Number of strata: t=100 (4,940 instances per stratum)
  • 12. 2.3. Experiments Results (% Reduction / Input Data Set size for C4.5 / Accuracy / Size / Ant):
    C4.5 Min: — / 444,620 / 99.96 / 252 / 13.34
    C4.5: — / 444,620 / 99.95 / 143 / 11.78
    C4.5 Max: — / 444,620 / 99.95 / 102 / 10.52
    Cnn st 100: 81.61 / 90,850 / 96.43 / 83 / 11.49
    Ib2 st 100: 82.01 / 88,874 / 95.05 / 58 / 10.86
    Ib3 st 100: 78.82 / 104,634 / 96.77 / 74 / 11.48
    EA-CHC st 100: 99.28 / 3,557 / 98.41 / 9 / 3.56
  • 13. 2.4. Conclusions • Precision: Evolutionary Stratified Instance Selection shows the best accuracy rates among the instance selection algorithms • Interpretability: Evolutionary Stratified Instance Selection produces the smallest rule sets, with the minimal number of rules and antecedents • Precision-Interpretability: Evolutionary Stratified Instance Selection offers a good balance between precision and interpretability. J.R. Cano, F. Herrera, M. Lozano, Evolutionary Stratified Training Set Selection for Extracting Classification Rules with Trade-off Precision-Interpretability. Data and Knowledge Engineering, in press (2006)
  • 14. 2.4. Conclusions How to apply genetic learning to large data sets? Problems: chromosome size, data set evaluation. Solutions: Stratification? Partial Evaluation? Parallel Learning? …
  • 15. Outline 1. Our Research Group (SCI2S, University of Granada, Spain) 2. Genetic Learning and Scaling Up 3. Fuzzy-XCS: An Accuracy-Based Michigan-Style Genetic Fuzzy System 4. Advances Toward New Methods and Problems
  • 16. 3.1. Objectives Fuzzy classifier system (Michigan style): an LCS composed of fuzzy rules. There are not many proposals of fuzzy LCS; an almost exhaustive list is the following: 1. [M. Valenzuela-Rendón, 1991] 2. [M. Valenzuela-Rendón, 1998] 3. [A. Parodi, P. Bonelli, 1993] 4. [T. Furuhashi, K. Nakaoka, Y. Uchikawa, 1994] 5. [K. Nakaoka, T. Furuhashi, Y. Uchikawa, 1994] 6. [J.R. Velasco, 1998] 7. [H. Ishibuchi, T. Nakashima, T. Murata, 1999] 8. [A. Bonarini, 1996] 9. [A. Bonarini, V. Trianni, 2001] 10. [D. Gu, H. Hu, 2004] 11. [M.C. Su, et al., 2005] 12. [C.-F. Juang, 2005] Except [10], all of them are based on strength. In [10], the output is discrete and generality is not considered
  • 17. 3.1. Objectives Objective: an accuracy-based fuzzy classifier system. This kind of system would have some important advantages: The use of fuzzy rules allows us to describe state-action relations in a very legible way and to deal with uncertainty. The proposal returns continuous output by using linguistic variables also in the consequent. It tries to obtain optimal generalization to improve the compactness of the knowledge representation; this involves avoiding overgeneral rules. It tries to obtain a complete covering map. J. Casillas, B. Carse, L. Bull, Reinforcement learning by an accuracy-based fuzzy classifier system with real-valued output. Proc. I International Workshop on Genetic Fuzzy Systems, 2005, 44-50
  • 18. 3.2. Difficulties Developing an accuracy-based fuzzy classifier system presents the following difficulties: Since several rules fire in parallel, credit assignment is much more difficult. The payoff a fuzzy rule receives depends on the input vector: an active fuzzy rule will receive different payoffs for different inputs. Measuring the accuracy of a rule's predicted payoff is difficult, since a fuzzy rule will fire together with many different other fuzzy rules at different time-steps, giving very different payoffs
  • 19. 3.3. Competitive Fuzzy Inference These problems are partly due to the interpolative reasoning performed in fuzzy systems, where the final output results from aggregating the individual contributions of a set of rules. However, an LCS does not consider the interaction among rules as a cooperative action; rather, each rule competes with the rest to be the best for a specific input vector. To perform a “competitive” inference we only need to change the roles of the fuzzy operators: ALSO operator → Intersection (T-norm): Minimum; THEN operator → logical causality (S-implications): Kleene-Dienes: μ_R(x, y) = max{1 − μ_A(x), μ_B(y)}; Łukasiewicz: μ_R(x, y) = min{1, 1 − μ_A(x) + μ_B(y)}
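The two S-implications and the competitive ALSO operator above translate directly into code. A minimal sketch over pointwise membership degrees:

```python
def kleene_dienes(a, b):
    """Kleene-Dienes implication: mu_R(x, y) = max(1 - mu_A(x), mu_B(y))."""
    return max(1.0 - a, b)

def lukasiewicz(a, b):
    """Lukasiewicz implication: mu_R(x, y) = min(1, 1 - mu_A(x) + mu_B(y))."""
    return min(1.0, 1.0 - a + b)

def also_min(degrees):
    """Competitive ALSO operator: intersection via the minimum T-norm."""
    return min(degrees)
```

Note the inversion with respect to the usual cooperative inference: the implication grows as the antecedent degree shrinks, and the aggregation (minimum) keeps only what all rules agree on, so rules effectively compete for the output.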
  • 20. 3.4. Fuzzy-XCS [Architecture diagram: the environment is read through detectors; the population is matched against the input to form the Match Set [MS]; consistent and non-redundant Candidate Subsets [CS] are built from it; one subset is chosen as the Action Set [AS] under an exploration/exploitation scheme; competitive fuzzy inference produces the real-valued action sent to the effectors; the reward (with delay discount) is distributed among the classifiers in proportion to their contribution, updating prediction (p), error (ε), fitness (F), and experience (exp); an EA (selection + crossover + mutation) is periodically applied to the previous Action Set]
  • 21. 3.4. Fuzzy-XCS 3.4.1. Generalization representation The disjunctive normal form (DNF) is considered: IF X1 is Ã1 and … and Xn is Ãn THEN Y1 is B1 and … and Ym is Bm, with Ãi = {Ai1 ∨ … ∨ Aili} and the disjunction modeled by the bounded sum ∨ = min{1, a + b}. Binary coding for the antecedent (allele ‘1’ means that the corresponding label is used) and integer coding in the consequent (each gene contains the index of the label used in the corresponding output variable): IF X1 is P and X2 is {M ∨ G} THEN Y1 is M and Y2 is G → [100 | 011 | 23]
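Matching an input against a DNF antecedent under this coding can be sketched as follows. The function assumes the per-label membership degrees of the input are already computed (e.g. from triangular fuzzy sets); those names and the tuple layout are illustrative, not the authors' data structures.

```python
def bounded_sum(values):
    """Disjunction within one variable: min(1, a + b + ...)."""
    return min(1.0, sum(values))

def match_degree(antecedent, memberships):
    """Matching degree of a DNF rule.
    antecedent: one binary mask per input variable (allele 1 = label used),
    e.g. ((1, 0, 0), (0, 1, 1)) codes 'X1 is P and X2 is {M or G}'.
    memberships: per variable, the input's membership degree in each label.
    Labels within a variable are joined by the bounded sum; variables are
    conjoined with the minimum T-norm."""
    per_var = [
        bounded_sum(mu for bit, mu in zip(mask, mus) if bit)
        for mask, mus in zip(antecedent, memberships)
    ]
    return min(per_var)
```

Generality is visible directly in the coding: the more '1' alleles in a mask, the larger the region of the input space the rule covers.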
  • 22. 3.4. Fuzzy-XCS 3.4.2. Performance component 1. Match set composition [M]: a matching threshold is used to reduce the number of fired rules 2. Computation of candidate subsets [CS]: equivalent to the prediction array computation in XCS. XCS partitions [M] into a number of mutually exclusive sets according to the actions. In Fuzzy-XCS, several “linguistic actions” (consequents) could/should be considered together; in our case, different groups of consistent and non-redundant fuzzy rules with the maximum number of rules in each group are formed. We perform an exploration/exploitation scheme with probability 0.5. On each exploitation step, only those fuzzy classifiers sufficiently experienced are considered; on exploration steps, the whole match set is considered 3. Action set selection [AS]: it chooses the consistent and non-redundant classifier subset with the highest mean prediction
  • 23. 3.4. Fuzzy-XCS 3.4.3. Reinforcement component The prediction error (εj), prediction (pj), and fitness (Fj) values of each fuzzy classifier Cj are adjusted by the standard reinforcement learning techniques used in XCS: Widrow-Hoff and MAM (modified adaptive method). However, an important difference is considered in Fuzzy-XCS: the credit distribution among the classifiers must be made proportionally to the degree of contribution of each classifier to the obtained output. Therefore, a weight is first computed for each classifier (according to the proposed fuzzy inference) and then the parameters are adjusted. Fuzzy-XCS acts on the action set
  • 24. 3.4. Fuzzy-XCS 3.4.3. Reinforcement component (Credit Distribution) Let B′j = I(μ_{Aj}(x), Bj) be the scaled output fuzzy set generated by the fuzzy rule Rj, and let R1 be the winner rule, the one with the highest matching degree μ_{Aj}(x). The process involves analyzing the area that the rival fuzzy rules bite into the area generated by the winner rule R1: w1 = ∫ μ_{∩_{j=1}^{|AS|} B′j}(y) dy / ∫ μ_{B′1}(y) dy. The remaining weight is distributed among the rest of the rules according to the area that each of them removes from the winner rule R1: wj = (1 − w1) · [∫ μ_{B′1}(y) dy − ∫ μ_{B′1}(y) ∧ μ_{B′j}(y) dy] / Σ_{i=2}^{|AS|} [∫ μ_{B′1}(y) dy − ∫ μ_{B′1}(y) ∧ μ_{B′i}(y) dy]
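Numerically, these weights can be computed from discretized membership functions. A sketch, assuming each rule's scaled consequent B′_j is sampled on a common grid of the output domain (the grid discretization and the plain-sum integral are simplifying assumptions):

```python
def credit_weights(scaled_sets):
    """Contribution weights for the rules in [AS].
    scaled_sets[0] is the winner rule's B'_1 (highest matching degree);
    each set is a list of membership values on a shared output grid."""
    def area(mu):
        # crude integral: sum of samples on the uniform grid
        return sum(mu)
    b1 = scaled_sets[0]
    # w1: area of the intersection of all B'_j over the area of B'_1
    inter = [min(vals) for vals in zip(*scaled_sets)]
    w = [area(inter) / area(b1)]
    # what each rival "bites" out of the winner: area(B'_1) - area(B'_1 ^ B'_j)
    bites = [area(b1) - area([min(a, b) for a, b in zip(b1, bj)])
             for bj in scaled_sets[1:]]
    total = sum(bites)
    for bite in bites:
        w.append((1.0 - w[0]) * bite / total if total > 0 else 0.0)
    return w
```

By construction the weights sum to 1 whenever some rival overlaps the winner, so the payoff is split rather than duplicated across the action set.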
  • 25. 3.4. Fuzzy-XCS 3.4.3. Reinforcement component (Adjustment) Firstly the P (payoff) value is computed: P = r + γ · max_{i∈CS} Σ_j w_{ij} · p_{ij}. Then, the following adjustment is performed for each fuzzy classifier belonging to the action set using Widrow-Hoff: 1. Adjust prediction error values (with MAM): εj ← εj + β · wj · |AS| · (|P − pj| − εj) 2. Adjust prediction values (with MAM): pj ← pj + β · wj · |AS| · (P − pj) 3. Adjust fitness values: Fj ← Fj + β · (k′j − Fj), with k′j = kj / Σ_{Ri∈AS} ki and kj = (εj/ε0)^{−ν} if εj > ε0, kj = 1 otherwise
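The per-classifier adjustments above can be sketched directly. A minimal version, with MAM (the straight averaging used while experience is low) omitted for brevity; the dict-based classifier layout and default parameter values are illustrative assumptions:

```python
def update_classifier(cl, P, w, as_size, beta=0.2):
    """One Widrow-Hoff step for a classifier in the action set.
    cl: dict with 'p' (prediction) and 'eps' (prediction error);
    w: the classifier's contribution weight from the credit distribution."""
    cl['eps'] += beta * w * as_size * (abs(P - cl['p']) - cl['eps'])
    cl['p'] += beta * w * as_size * (P - cl['p'])
    return cl

def accuracy(eps, eps0=0.01, nu=5.0):
    """k_j = (eps/eps0)**(-nu) if eps > eps0, else 1."""
    return (eps / eps0) ** (-nu) if eps > eps0 else 1.0

def update_fitness(classifiers, beta=0.2, eps0=0.01, nu=5.0):
    """F_j <- F_j + beta * (k'_j - F_j), with k'_j the relative accuracy
    of the classifier within the action set."""
    ks = [accuracy(c['eps'], eps0, nu) for c in classifiers]
    total = sum(ks)
    for c, k in zip(classifiers, ks):
        c['F'] += beta * (k / total - c['F'])
```

As in XCS, fitness is based on the *relative accuracy* of the payoff prediction (not on the payoff itself), which is what makes the system accuracy-based rather than strength-based.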
  • 26. 3.4. Fuzzy-XCS 3.4.4. Discovery component Standard two-point crossover operator: It only acts on the antecedent. Prediction, prediction error, and fitness of the offspring are initialized to the mean values of the parents: parents [10 001 1 | 23] and [10 101 0 | 12] → offspring [10 101 1 | 23] and [10 001 0 | 12]
  • 27. 3.4. Fuzzy-XCS 3.4.4. Discovery component Mutation: If the gene to be mutated corresponds to an input variable: Expansion [100 | 011 | 23] → [101 | 011 | 23]; Contraction [100 | 011 | 23] → [100 | 010 | 23]; Shift [100 | 011 | 23] → [010 | 011 | 23]. If the gene to be mutated corresponds to an output variable: the index of the label is increased or decreased by 1: [100 | 011 | 23] → [100 | 011 | 22]
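The three antecedent mutations and the consequent shift can be sketched as follows. How positions and directions are chosen at random is an assumption for illustration; the slide only fixes the effect of each operator.

```python
import random

def mutate_antecedent(mask, rng):
    """Mutate one input variable's binary mask by expansion (add a label),
    contraction (drop a label, keeping at least one), or shift (move a label
    to an adjacent position)."""
    mask = list(mask)
    op = rng.choice(['expand', 'contract', 'shift'])
    ones = [i for i, b in enumerate(mask) if b]
    zeros = [i for i, b in enumerate(mask) if not b]
    if op == 'expand' and zeros:              # 100 -> 101
        mask[rng.choice(zeros)] = 1
    elif op == 'contract' and len(ones) > 1:  # 011 -> 010
        mask[rng.choice(ones)] = 0
    elif op == 'shift' and ones and zeros:    # 100 -> 010
        i = rng.choice(ones)
        mask[i] = 0
        mask[(i + rng.choice([-1, 1])) % len(mask)] = 1
    return mask

def mutate_consequent(label, n_labels, rng):
    """Increase or decrease the output label index by 1, staying in range."""
    return min(n_labels - 1, max(0, label + rng.choice([-1, 1])))
```

Expansion makes a rule more general and contraction more specific, so mutation alone can move the population along the generality axis that the accuracy-based fitness then evaluates.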
  • 28. 3.5. Experimental Results 3.5.1. Laboratory Problem: Specification First experiment in a laboratory problem: 2 inputs and 1 output; 5 linguistic terms for each variable (triangular-shape fuzzy sets); 5 fuzzy rules of different generality degrees; 576 examples uniformly distributed in the input space (24 × 24). The output value for each input is the result of the inference with the fixed fuzzy system. [Table: the five rules R1–R5, each defined by DNF antecedents over X1 and X2 (labels VS, S, M, L, VL) and a consequent for Y]
  • 29. 3.5. Experimental Results 3.5.1. Laboratory Problem: Specification The problem is actually a function approximation (with real inputs and outputs) where we know the optimal solution (regarding the state/action map and generality). [Figure: 3-D surface plot of the target output Y over the input space X1 × X2]
  • 30. 3.5. Experimental Results 3.5.1. Laboratory Problem: Results Competitive distribution and inference (implication: Łukasiewicz; aggregation: minimum). [Plots: relative numerosity of the five optimal rules 11000|11000||1, 00111|11000||2, 10000|00111||3, 01000|00111||4, 00111|00111||5, and average error, over 30,000 exploit trials]
  • 31. 3.5. Experimental Results 3.5.1. Laboratory Problem: Results Cooperative distribution and inference (implication: minimum; aggregation: maximum). [Plots: relative numerosity of the five optimal rules and average error over 30,000 exploit trials]
  • 32. 3.5. Experimental Results 3.5.1. Laboratory Problem: Results (Fuzzy-XCS competitive / Fuzzy-XCS cooperative / Pittsburgh GFS):
    R1: 0.9 / 0.0 / 0.1
    R2: 1.0 / 0.1 / 0.2
    R3: 1.0 / 0.0 / 0.0
    R4: 1.0 / 0.0 / 0.2
    R5: 1.0 / 0.0 / 0.1
    suboptimal rules: 0.8 / 1.1 / 7.6
    non-suboptimal rules: 1.1 / 0.0 / 2.0
    MSE: 0.000302 / ― / 0.001892
    analyzed examples: 60,000 / 60,000 / 4,212,864
  • 33. 3.5. Experimental Results 3.5.2. Real-World Mobile Robot Problem: Specification Second experiment in a real-world problem: on-line learning of the wall-following behavior for mobile robots (Nomad 200 model). Input variables (4): relative right-hand distance, right-left hand distance coefficient, orientation, and linear velocity. Output variables (2): linear velocity and angular velocity. Variables are computed using only the sensors of the robot, so it is more realistic. Reward: r(RD, LV, θ_wall) = 1 − (α1 · (RD − 1)/3 + α2 · (LV − 1) + α3 · θ_wall/45)
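The reward can be sketched directly from the formula. Two caveats: the use of absolute deviations (so the penalty grows in either direction away from the targets) and the equal default α weights are assumptions for illustration, not taken from the slide. RD is the relative right-hand distance, LV the linear velocity, and θ_wall the orientation to the wall in degrees.

```python
def reward(rd, lv, theta_wall, a1=1.0 / 3, a2=1.0 / 3, a3=1.0 / 3):
    """r(RD, LV, theta_wall) = 1 - (a1*|RD - 1|/3 + a2*|LV - 1| + a3*|theta_wall|/45).
    Maximal (r = 1) when the robot keeps distance RD = 1, full speed LV = 1,
    and stays parallel to the wall (theta_wall = 0)."""
    penalty = (a1 * abs(rd - 1.0) / 3.0
               + a2 * abs(lv - 1.0)
               + a3 * abs(theta_wall) / 45.0)
    return 1.0 - penalty
```

The three terms trade off the competing goals of wall-following: keep the reference distance, keep moving, and stay parallel to the wall.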
  • 34. 3.5. Experimental Results 3.5.2. Real-World Mobile Robot Problem: Results
  • 35. 3.5. Experimental Results 3.5.2. Real-World Mobile Robot Problem: Results
  • 36. 3.6. Conclusions A fuzzy classifier system for real-valued output that tries to generate the complete covering map with optimal generalization is proposed. It is the first algorithm with such characteristics (at least as far as we know). Current work involves investigating the behavior of the proposal in well-known continuous (state and action) reinforcement learning problems
  • 37. Outline 1. Our Research Group (SCI2S, University of Granada, Spain) 2. Genetic Learning and Scaling Up 3. Fuzzy-XCS: An Accuracy-Based Michigan-Style Genetic Fuzzy System 4. Advances Toward New Methods and Problems
  • 38. Some Ideas on Advances in LCS/GBML Fishing Ideas from Machine Learning… Subgroup Discovery It is a form of supervised inductive learning which is defined as follows: given a population of individuals and a specific property of individuals we are interested in, find population subgroups that are statistically “most interesting”, e.g., are as large as possible and have the most unusual distributional characteristics with respect to the property of interest References: http://sci2s.ugr.es/keel/specific.php?area=41
  • 39. Some Ideas on Advances in LCS/GBML Fishing Ideas from Machine Learning… Learning from Unbalanced Data Sets In the classification problem field, we often encounter classes with very different percentages of patterns: classes with a high pattern percentage and classes with a low pattern percentage. These problems are called “classification problems with unbalanced classes”, and recently they have been receiving a high level of attention in machine learning References: http://sci2s.ugr.es/keel/specific.php?area=43
  • 40. Some Ideas on Advances in LCS/GBML Taking Advantage of Evolutionary Algorithms… GAs are very flexible to deal with mixed coding schemes, combinatorial and continuous optimization, … New Model Representations With fuzzy rules: disjunctive normal form, weighted rules, linguistic modifiers, hierarchical models, … J. Casillas, O. Cordón, F. Herrera, L. Magdalena (Eds.) Interpretability issues in fuzzy modeling. Springer, 2003. ISBN 3-540-02932-X J. Casillas, O. Cordón, F. Herrera, L. Magdalena (Eds.) Accuracy improvements in linguistic fuzzy modeling. Springer, 2003. ISBN 3-540-02933-8
  • 41. Some Ideas on Advances in LCS/GBML Taking Advantage of Evolutionary Algorithms… Multiobjective EAs: one of the most promising issues and one of the main distinguishing marks in evolutionary computation Evolutionary Multiobjective Learning Supervised Learning: use multiobjective EAs to obtain a set of solutions with different degrees of accuracy and interpretability, support and confidence, etc. Reinforcement Learning: consider several rewards and integrate multiobjective selection/replacement strategies to deal with them References: http://sci2s.ugr.es/keel/specific.php?area=44
  • 42. Some Ideas on Advances in LCS/GBML Facing Up to Open (Awkward) Problems… Efficient learning with high dimensional data sets Solutions to: noisy data, sparse data, incomplete data, vague data, … Dealing with realistic robotics/control problems: variables captured from actual sensors, real-valued actions, …
  • 43. KEEL Project We are working on a national research project to: Research most of the topics previously mentioned, among others Develop software for KDD by evolutionary algorithms (preprocessing, learning classifier systems, genetic fuzzy systems, evolutionary neural networks, statistical tests, experimental setup, …) KEEL: Knowledge Extraction by Evolutionary Learning 5 Spanish research groups About 50 researchers Website: http://www.keel.es