A Retrospective Look at
  Classifier System Research



Lashon B. Booker
The MITRE Corporation




                        © 2006 The MITRE Corporation. All rights reserved.
Early Motivations for Learning Classifier
System (LCS) Research

  Design symbolic problem solvers that avoid brittleness in
    realistic (uncertain and continually varying) domains
    involving
    – On-line, adaptive control of behaviors
       Representations and procedures must adjust without unnecessarily
         disrupting existing capabilities
    – Discovering relevant categories in a complex and unlabeled
      stream of input
       Inputs must be incrementally grouped together into plausible classes

  This is especially difficult when behavior requires more
    knowledge representation and processing capability than
    is available with simple empirical associations between
    inputs and outputs


Requirements for Non-Brittle Rule-Based
Behavior

  Need to identify and take advantage of the exploitable
   regularities in the environment
  Generalizations must be selective, pragmatic and subject
   to exceptions
  Learning must be incremental and closely coupled with
   performance and with unfolding reality
  Rules must be treated as tentative hypotheses (not logical
   assertions) subject to testing and confirmation
    – Hypothesis “strength” is derived from experience-based
      predictions of performance
    – Strength is used to determine rule fitness and infer plausibility
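
One way to read the "strength" bullets is as a running estimate that each experience nudges up or down. The following is a minimal sketch under that assumption. The `Rule` class, the `update_strength` helper, and the learning rate `beta` are hypothetical names chosen for illustration, not something the slides specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Rule:
    """A rule treated as a tentative hypothesis rather than a logical assertion."""
    name: str
    strength: float = 0.0  # running prediction of the payoff this rule earns


def update_strength(rule: Rule, payoff: float, beta: float = 0.2) -> None:
    """Move strength a fraction beta toward the payoff just experienced.

    beta is a hypothetical learning rate, not a value from the slides.
    """
    rule.strength += beta * (payoff - rule.strength)


def most_plausible(rules: List[Rule]) -> Rule:
    """Pick the rule whose experience-based prediction is currently highest."""
    return max(rules, key=lambda r: r.strength)


a, b = Rule("a"), Rule("b")
for _ in range(50):
    update_strength(a, payoff=100.0)  # rule a keeps being confirmed
    update_strength(b, payoff=10.0)   # rule b keeps earning less
best = most_plausible([a, b])
```

Because strength is only ever an estimate, a rule that stops paying off loses its standing gradually instead of being retracted outright, which is the sense in which the hypothesis stays tentative.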



Observations about early research
 The Holland and Reitman collaboration placed a strong
   emphasis on cognition and characterized the problems of
   interest
 Viewed classifier systems as symbolic problem solvers that
   avoid brittle behavior (an alternative to expert systems)
   –   Treat rule set as a model and rules as parts in a context
   –   Evaluation of parts is context dependent (i.e., aspects are non-stationary)

 Learning emphasized policy search and value estimation
   –   Rules are policy elements along with performance estimators
   –   Adjust policy via natural selection among rule types
   –   The Pitt approach preserved this idea, using the GA for direct policy search

 Included provisions for motivation, affect and introspection

 These ideas provided the foundation for a comprehensive theory
   of induction (rule clusters, distributed representations,
   associations, spreading activation, etc.)

Influence of reinforcement learning

[Figure: agent-environment loop. The Environment sends state input and scalar feedback to the Learning Agent, which sends an action back to the Environment.]

  Reinforcement learning problems are faced by agents that must learn
    action sequences from trial-and-error
    – Framework provides attractive formalisms based on estimating value
      functions (with key contributions from Sutton and Barto)
    – Algorithms provide useful benchmarks for comparisons
  Emphasis on value functions has had a strong influence on LCS research
    – The primary niche is learning compact value function representations
      for off-policy temporal difference methods
    – But, the RL community has good alternatives
  It is not clear if we are learning the best generalizations, or giving
    sufficient emphasis to policy improvement

Solution strategies:
• Search the space of possible behaviors
• Estimate utility of taking actions in world states
Value-based generalizations aren’t often intuitive
[Figure: Grefenstette’s 9x32 abstract state space, with a "Start" label at the top. Values shown in the figure: 0, 50, 75, 125, 250, 500, 1000.]

       There are many obvious intuitive solution strategies:
        –   e.g., move left or right to the column with the highest reward, then go straight

       Classifier systems tend to learn piecemeal strategies rather than coherent ones
        –   Many narrowly-focused general rules are needed to get the overall solution
        –   Generalizations correspond to symmetries in the reward distribution,
            e.g., (Row = 111) (Column = #011#) => RIGHT,
            not the key attribute-based concepts.
        –   This distinction has been irrelevant in most classifier system test problems (e.g.,
            multiplexor and Woods problems)
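
The example rule above can be unpacked with a small sketch of ternary matching, where `#` stands for "don't care". The helper names `matches` and `covered_columns` are hypothetical. The code also assumes that a column index is encoded as a plain 5-bit binary string, which the slide does not state.

```python
def matches(condition: str, bits: str) -> bool:
    """True if every non-'#' position in condition equals the input bit."""
    return len(condition) == len(bits) and all(
        c == "#" or c == b for c, b in zip(condition, bits)
    )


def covered_columns(condition: str) -> list:
    """List the column indices whose binary encoding the condition matches.

    Assumes plain binary encoding of the column index (an assumption,
    not something stated on the slide).
    """
    width = len(condition)
    return [
        i for i in range(2 ** width)
        if matches(condition, format(i, f"0{width}b"))
    ]


# Condition strings and action copied from the example rule on the slide.
rule = {"row": "111", "column": "#011#", "action": "RIGHT"}

columns = covered_columns(rule["column"])
fires = matches(rule["row"], "111") and matches(rule["column"], "10110")
```

Under the assumed encoding, `#011#` covers columns 6, 7, 22 and 23: two separated pairs grouped by their bit patterns, not by a concept such as "the column with the highest reward".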
Off-policy Methods Learn Different
Behaviors

  Since Q-learning is an off-policy method (i.e., behavior policy may differ
    from estimation policy), it does not suffer negative consequences for
    exploration
  Sarsa (i.e., the bucket brigade) is an on-policy method, so its solution
    accounts for the consequences of exploration
  In real problems where on-line errors are costly, this distinction is
    important
  This also has architectural implications (e.g., how to approximate the
    value function)



 Bottom line: we need to identify and build on the strengths of the LCS
    approach. The key may be in specifying a set of organizing principles
    that go beyond implementation diagrams
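
The on-policy versus off-policy distinction on this slide comes down to which next action the update bootstraps from. The sketch below shows the two standard tabular targets side by side. The state names, reward values, and step sizes are hypothetical, picked only to make the difference visible.

```python
from collections import defaultdict


def q_learning_target(Q, reward, next_state, actions, gamma=0.9):
    """Off-policy target: bootstrap from the greedy action in next_state."""
    return reward + gamma * max(Q[(next_state, a)] for a in actions)


def sarsa_target(Q, reward, next_state, next_action, gamma=0.9):
    """On-policy target: bootstrap from the action the agent actually takes next."""
    return reward + gamma * Q[(next_state, next_action)]


def td_update(Q, state, action, target, alpha=0.5):
    """Move Q(state, action) a fraction alpha toward the target."""
    Q[(state, action)] += alpha * (target - Q[(state, action)])


# Hypothetical example: in state "s1" the greedy action is "safe",
# but the exploring behavior policy happens to pick "risky".
actions = ["safe", "risky"]
Q = defaultdict(float)
Q[("s1", "safe")] = 10.0
Q[("s1", "risky")] = -100.0

q_target = q_learning_target(Q, reward=0.0, next_state="s1", actions=actions)
s_target = sarsa_target(Q, reward=0.0, next_state="s1", next_action="risky")

# Apply the off-policy target to the state-action pair that led to "s1".
td_update(Q, "s0", "go", q_target)
```

Here the Q-learning target ignores the exploratory "risky" step, while the Sarsa target is pulled down by it. That is why an on-policy learner ends up valuing states according to what its exploring behavior actually does there.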
Soar Architecture of Intelligent Rule-based
Behavior

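[Figure: Soar processing levels connected to I/O, running from Reaction (low intelligence, faster) through Deliberation to Reflection (high intelligence, slower), with Learning shown alongside the levels.]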



  Derived by Newell and his students (~1980), also as a response
    to the expert system phenomenon
  Based on a theory of problem solving (i.e., problem spaces),
    along with a companion view of learning (i.e., chunking)
  The theory was operationalized as an architecture that has
    served that community well

What kind of architecture makes sense for
classifier systems?
[Figure: actor-critic loop. Labels: Actor; Critic; policy evaluation (value learning); policy improvement (greedification); V, Q and V*, Q*.]

  The key role of policy improvement suggests that an actor-critic
    structure may be a good start
  The idea is to intermix value iteration and policy improvement
    continually (state by state, action by action, sample by sample)
  Is there an organizing principle that extends this concept to cover
    many forms of induction at different scales? (including perception,
    reasoning, and action)
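
The "sample by sample" interleaving can be illustrated with a minimal tabular actor-critic step in the textbook style. This is a sketch only, not the architecture the slide is asking for. The function names, step sizes, and the two-action example are hypothetical. Actions are fed in a fixed alternation instead of being sampled from the policy, to keep the example deterministic.

```python
import math
from collections import defaultdict


def softmax_policy(prefs, state, actions):
    """Action probabilities from the actor's preferences in one state."""
    exps = [math.exp(prefs[(state, a)]) for a in actions]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(V, prefs, state, action, reward, next_state,
                      alpha_v=0.1, alpha_p=0.1, gamma=0.9):
    """One sample of interleaved policy evaluation and policy improvement."""
    # Critic: the TD error says how much better or worse things went
    # than the current value estimate predicted.
    td_error = reward + gamma * V[next_state] - V[state]
    V[state] += alpha_v * td_error                # policy evaluation
    prefs[(state, action)] += alpha_p * td_error  # policy improvement
    return td_error


actions = ["left", "right"]
V = defaultdict(float)
prefs = defaultdict(float)

# Hypothetical experience: in state "s", "right" pays 1 and "left" pays 0;
# both lead to a terminal state "end" whose value stays at 0.
for _ in range(100):
    actor_critic_step(V, prefs, "s", "right", 1.0, "end")
    actor_critic_step(V, prefs, "s", "left", 0.0, "end")

p_left, p_right = softmax_policy(prefs, "s", actions)
```

The same TD error drives both updates, so value learning and policy change happen on every sample and no separate evaluation phase is needed.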

DARPA/IPTO Focus on Cognitive Systems




       DARPA views a cognitive system as one that
        –   can reason, using substantial amounts of appropriately represented knowledge
        –   can learn from its experience so that it performs better tomorrow than it did today
        –   can explain itself and be told what to do
        –   can be aware of its own capabilities and reflect on its own behavior
        –   can respond robustly to surprise
       Learning is ubiquitous. Different forms operate at different times and places
       What niche is the LCS community best suited to fill?
Some Open Problems for Reinforcement
Learning (Sutton) - and Classifier Systems

    Incomplete state information
    Exploration
    Structured states and actions
    Incorporating prior knowledge
    Using teachers
    Theory of RL with function approximators
    Modular and hierarchical architectures
    Integration with other problem-solving and
     planning methods




                                          © 2006 The MITRE Corporation. All rights reserved.

Contenu connexe

Similaire à A Retrospective Look at A Retrospective Look at Classifier System ResearchClassifier System Research

Dsp black rock us flexible equity fund nfo presentation
Dsp black rock us flexible equity fund   nfo presentationDsp black rock us flexible equity fund   nfo presentation
Dsp black rock us flexible equity fund nfo presentationPrajna Capital
 
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...David Parsons
 
Artefatos para gestão de problemas
Artefatos para gestão de problemasArtefatos para gestão de problemas
Artefatos para gestão de problemasFernando Palma
 
Dsp black rock_us_flexible_equity_fund_nfo_presentation3
Dsp black rock_us_flexible_equity_fund_nfo_presentation3Dsp black rock_us_flexible_equity_fund_nfo_presentation3
Dsp black rock_us_flexible_equity_fund_nfo_presentation3sudhanshuarora1
 
Panel4 carolinarossini
Panel4 carolinarossiniPanel4 carolinarossini
Panel4 carolinarossiniREA Brasil
 
Ethics poll overview for website
Ethics poll overview for websiteEthics poll overview for website
Ethics poll overview for websitePriority Thinking
 
Agile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoopAgile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoopDataWorks Summit
 
Associating the Value of Automation with Project Funding
Associating the Value of Automation with Project FundingAssociating the Value of Automation with Project Funding
Associating the Value of Automation with Project FundingARC Advisory Group
 
Opportunities & Obligations
Opportunities & ObligationsOpportunities & Obligations
Opportunities & ObligationsMartin Rehm
 
Diskontinuierliche Innovation
Diskontinuierliche InnovationDiskontinuierliche Innovation
Diskontinuierliche Innovationintegro
 
Machine Learning for Speech
Machine Learning for Speech Machine Learning for Speech
Machine Learning for Speech butest
 
Relevant Remitter Brochure
Relevant Remitter BrochureRelevant Remitter Brochure
Relevant Remitter Brochureebstlr
 
Strategic management practices in construction
Strategic management practices in constructionStrategic management practices in construction
Strategic management practices in constructionSapri Pamulu, Ph.D
 
Post Modern Investment Management
Post Modern Investment ManagementPost Modern Investment Management
Post Modern Investment Managementamadei77
 
Post Modern Investment Management - an Overview
Post Modern Investment Management - an OverviewPost Modern Investment Management - an Overview
Post Modern Investment Management - an Overviewamadei77
 

Similaire à A Retrospective Look at A Retrospective Look at Classifier System ResearchClassifier System Research (19)

Dsp black rock us flexible equity fund nfo presentation
Dsp black rock us flexible equity fund   nfo presentationDsp black rock us flexible equity fund   nfo presentation
Dsp black rock us flexible equity fund nfo presentation
 
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
 
Artefatos para gestão de problemas
Artefatos para gestão de problemasArtefatos para gestão de problemas
Artefatos para gestão de problemas
 
Dsp black rock_us_flexible_equity_fund_nfo_presentation3
Dsp black rock_us_flexible_equity_fund_nfo_presentation3Dsp black rock_us_flexible_equity_fund_nfo_presentation3
Dsp black rock_us_flexible_equity_fund_nfo_presentation3
 
Determine Exceptions to Validation
Determine Exceptions to ValidationDetermine Exceptions to Validation
Determine Exceptions to Validation
 
Fsna tool
Fsna toolFsna tool
Fsna tool
 
Panel4 carolinarossini
Panel4 carolinarossiniPanel4 carolinarossini
Panel4 carolinarossini
 
Panel 4 carolina rossini
Panel 4  carolina rossiniPanel 4  carolina rossini
Panel 4 carolina rossini
 
Ethics poll overview for website
Ethics poll overview for websiteEthics poll overview for website
Ethics poll overview for website
 
Agile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoopAgile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoop
 
Associating the Value of Automation with Project Funding
Associating the Value of Automation with Project FundingAssociating the Value of Automation with Project Funding
Associating the Value of Automation with Project Funding
 
Opportunities & Obligations
Opportunities & ObligationsOpportunities & Obligations
Opportunities & Obligations
 
Diskontinuierliche Innovation
Diskontinuierliche InnovationDiskontinuierliche Innovation
Diskontinuierliche Innovation
 
Machine Learning for Speech
Machine Learning for Speech Machine Learning for Speech
Machine Learning for Speech
 
Relevant Remitter Brochure
Relevant Remitter BrochureRelevant Remitter Brochure
Relevant Remitter Brochure
 
Strategic management practices in construction
Strategic management practices in constructionStrategic management practices in construction
Strategic management practices in construction
 
Robert palmer
Robert palmerRobert palmer
Robert palmer
 
Post Modern Investment Management
Post Modern Investment ManagementPost Modern Investment Management
Post Modern Investment Management
 
Post Modern Investment Management - an Overview
Post Modern Investment Management - an OverviewPost Modern Investment Management - an Overview
Post Modern Investment Management - an Overview
 

Plus de Xavier Llorà

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewXavier Llorà
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with MeandreXavier Llorà
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0Xavier Llorà
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine LearningXavier Llorà
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...Xavier Llorà
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsXavier Llorà
 
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...Xavier Llorà
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...Xavier Llorà
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance ProblemsXavier Llorà
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challengesXavier Llorà
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionXavier Llorà
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier SystemsXavier Llorà
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?Xavier Llorà
 
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableLinkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableXavier Llorà
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsXavier Llorà
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring LanguageXavier Llorà
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...Xavier Llorà
 
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Xavier Llorà
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Xavier Llorà
 

Plus de Xavier Llorà (20)

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with Meandre
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine Learning
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
 
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance Problems
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challenges
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly Detection
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier Systems
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?
 
NIGEL 2006 welcome
NIGEL 2006 welcomeNIGEL 2006 welcome
NIGEL 2006 welcome
 
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableLinkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring Language
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
 
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
 

A Retrospective Look at Classifier System Research

  • 1. A Retrospective Look at Classifier System Research Lashon B. Booker The MITRE Corporation © 2006 The MITRE Corporation. All rights reserved.
  • 2. Early Motivations for Learning Classifier System (LCS) Research
      Design symbolic problem solvers that avoid brittleness in realistic (uncertain and continually varying) domains involving
        – On-line, adaptive control of behaviors
            Representations and procedures must adjust without unnecessarily disrupting existing capabilities
        – Discovering relevant categories in a complex and unlabeled stream of input
            Inputs must be incrementally grouped together into plausible classes
      This is especially difficult when behavior requires more knowledge representation and processing capability than is available with simple empirical associations between inputs and outputs
  • 3. Requirements for Non-Brittle Rule-Based Behavior
      Need to identify and take advantage of the exploitable regularities in the environment
      Generalizations must be selective, pragmatic and subject to exceptions
      Learning must be incremental and closely coupled with performance and with unfolding reality
      Rules must be treated as tentative hypotheses (not logical assertions) subject to testing and confirmation
        – Hypothesis “strength” is derived from experience-based predictions of performance
        – Strength is used to determine rule fitness and infer plausibility
  • 4. Observations about early research
      The Holland and Reitman collaboration placed a strong emphasis on cognition and characterized the problems of interest
      Viewed classifier systems as symbolic problem solvers that avoid brittle behavior (an alternative to expert systems)
        – Treat rule set as a model and rules as parts in a context
        – Evaluation of parts is context dependent (i.e., aspects are non-stationary)
      Learning emphasized policy search and value estimation
        – Rules are policy elements along with performance estimators
        – Adjust policy via natural selection among rule types
        – The Pitt approach preserved this idea, using the GA for direct policy search
      Included provisions for motivation, affect and introspection
      These ideas provided the foundation for a comprehensive theory of induction (rule clusters, distributed representations, associations, spreading activation, etc.)
  • 5. Influence of reinforcement learning
      Reinforcement learning problems are faced by agents that must learn action sequences from trial-and-error
        – Framework provides attractive formalisms based on estimating value functions (with key contributions from Sutton and Barto)
        – Algorithms provide useful benchmarks for comparisons
      Emphasis on value functions has had a strong influence on LCS research
        – The primary niche is learning compact scalar value function representations for off-policy temporal difference methods
        – But, the RL community has good alternatives
      It is not clear if we are learning the best generalizations, or giving sufficient emphasis to policy improvement
      Solution strategies:
        • Search the space of possible behaviors
        • Estimate utility of taking actions in world states
      [Figure: learning agent interacting with an environment; labels: Environment, State, input, Learning Agent, Action, feedback]
  • 6. Value-based generalizations aren’t often intuitive
      [Figure: Grefenstette’s 9x32 abstract state space, with a Start row at the top and cell rewards of 0, 50, 75, 125, 250, 500 and 1000]
      There are many obvious intuitive solution strategies:
        – E.g. Move left or right to column with highest reward, then go straight
      Classifier systems tend to learn piecemeal strategies rather than coherent ones
        – Many narrowly-focused general rules are needed to get the overall solution
        – Generalizations correspond to symmetries in the reward distribution (e.g., (Row = 111) (Column = #011#) → RIGHT), not the key attribute-based concepts.
        – This distinction has been irrelevant in most classifier system test problems (e.g., multiplexor and Woods problems)
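
The rule quoted on slide 6 can be unpacked with a short sketch. It assumes the usual classifier-system reading of a ternary condition, where `#` matches either bit, and it assumes (this is not stated on the slide) that rows and columns are encoded as plain binary strings. The helper names `matches`, `covered_inputs` and `rule_fires` are hypothetical.

```python
def matches(condition, bits):
    """Return True if every non-'#' position in the condition equals the input bit."""
    if len(condition) != len(bits):
        raise ValueError("condition and input must have the same length")
    return all(c == "#" or c == b for c, b in zip(condition, bits))


def covered_inputs(condition):
    """Enumerate every bit string that a ternary condition matches."""
    results = [""]
    for c in condition:
        options = "01" if c == "#" else c
        results = [prefix + o for prefix in results for o in options]
    return results


# Illustrative rule modeled on the slide's example (Row = 111, Column = #011#, action RIGHT).
rule = {"row": "111", "column": "#011#", "action": "RIGHT"}


def rule_fires(rule, row_bits, column_bits):
    """A rule fires only when both of its conditions match the current state."""
    return matches(rule["row"], row_bits) and matches(rule["column"], column_bits)


columns = covered_inputs(rule["column"])
print(columns)                       # the column encodings this rule covers
print([int(c, 2) for c in columns])  # the same columns read as binary integers
```

Under the assumed binary encoding, the single condition `#011#` covers columns 6, 7, 22 and 23: two separate pairs rather than one contiguous region. That is one way to picture the slide's remark that generalizations follow symmetries in the reward distribution rather than intuitive attribute-based concepts.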
  • 7. Off-policy Methods Learn Different Behaviors
      Since Q-learning is an off-policy method (i.e., behavior policy may differ from estimation policy), it does not suffer negative consequences for exploration
      Sarsa (i.e., the bucket brigade) is an on-policy method, so its solution accounts for the consequences of exploration
      In real problems where on-line errors are costly, this distinction is important
      This also has architectural implications (e.g., how to approximate the value function)
      Bottom line: we need to identify and build on the strengths of the LCS approach. The key may be in specifying a set of organizing principles that go beyond implementation diagrams
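
The off-policy versus on-policy distinction on slide 7 comes down to which next-state value the update bootstraps from. The sketch below uses the textbook tabular forms of the two targets; the state names, action names and numeric values are made up for illustration and do not come from the slides.

```python
def q_learning_target(q, reward, next_state, actions, gamma):
    """Off-policy target: bootstrap from the best action available in next_state."""
    return reward + gamma * max(q[(next_state, a)] for a in actions)


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the action the agent actually takes next."""
    return reward + gamma * q[(next_state, next_action)]


def td_update(q, state, action, target, alpha):
    """Move the stored estimate a fraction alpha of the way toward the target."""
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical values: from state "s1" the safe action is worth 10,
# while an exploratory "risky" action is worth -100.
actions = ["safe", "risky"]
base = {("s0", "go"): 0.0, ("s1", "safe"): 10.0, ("s1", "risky"): -100.0}

q_off = dict(base)
off_target = q_learning_target(q_off, reward=0.0, next_state="s1",
                               actions=actions, gamma=0.5)
td_update(q_off, "s0", "go", off_target, alpha=0.5)

q_on = dict(base)
# Assume the exploring behavior policy happened to pick "risky" in s1.
on_target = sarsa_target(q_on, reward=0.0, next_state="s1",
                         next_action="risky", gamma=0.5)
td_update(q_on, "s0", "go", on_target, alpha=0.5)

print(q_off[("s0", "go")], q_on[("s0", "go")])
```

With these numbers the Q-learning estimate for `("s0", "go")` rises while the Sarsa estimate falls, because only Sarsa charges the exploratory step to the preceding action. This is the sense in which the on-policy solution accounts for the consequences of exploration.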
  • 8. Soar Architecture of Intelligent Rule-based Behavior
      [Figure: processing levels labeled I/O, Reaction, Deliberation, Learning and Reflection, on a scale from Low Intelligence / Faster to High Intelligence / Slower]
      Derived by Newell and his students (~1980), also as a response to the expert system phenomenon
      Based on a theory of problem solving (i.e., problem spaces), along with a companion view of learning (i.e., chunking)
      The theory was operationalized as an architecture that has served that community well
  • 9. What kind of architecture makes sense for classifier systems?
      The key role of policy improvement suggests that an actor-critic structure may be a good start
      The idea is to intermix value iteration and policy improvement continually (state by state, action by action, sample by sample)
      Is there an organizing principle that extends this concept to cover many forms of induction at different scales? (including perception, reasoning, and action)
      [Figure: actor-critic loop; labels: Critic (policy evaluation, value learning), Actor (policy improvement, greedification), V, Q, V*, Q*]
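
As a rough illustration of the "sample by sample" interleaving mentioned on slide 9, here is a minimal tabular actor-critic step. It is a generic textbook-style sketch, not an architecture proposed in the talk. The softmax action preferences, the step sizes and the function name `actor_critic_step` are all assumptions made for the example.

```python
import math


def softmax(prefs):
    """Turn action preferences into probabilities (shifted by the max for stability)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(v, prefs, state, action, reward, next_state,
                      alpha_critic=0.1, alpha_actor=0.1, gamma=0.9,
                      terminal=False):
    """Apply one sample's update to the critic (v) and the actor (prefs)."""
    bootstrap = 0.0 if terminal else v[next_state]
    td_error = reward + gamma * bootstrap - v[state]

    # Critic: value learning from the TD error.
    v[state] += alpha_critic * td_error

    # Actor: policy improvement, shifting probability toward actions
    # that did better than the critic expected.
    probs = softmax(prefs[state])
    for a in range(len(prefs[state])):
        indicator = 1.0 if a == action else 0.0
        prefs[state][a] += alpha_actor * td_error * (indicator - probs[a])
    return td_error


# Hypothetical two-state, two-action example: action 1 in state 0 earns reward 1.
v = {0: 0.0, 1: 0.0}
prefs = {0: [0.0, 0.0], 1: [0.0, 0.0]}
delta = actor_critic_step(v, prefs, state=0, action=1, reward=1.0,
                          next_state=1, terminal=True)
print(delta, v[0], softmax(prefs[0]))
```

After this single sample the critic's estimate for state 0 has moved toward the observed return and the actor already favors action 1 slightly, so evaluation and improvement advance together rather than in separate phases.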
  • 10. DARPA/IPTO Focus on Cognitive Systems
      DARPA views a cognitive system as one that
        – can reason, using substantial amounts of appropriately represented knowledge
        – can learn from its experience so that it performs better tomorrow than it did today
        – can explain itself and be told what to do
        – can be aware of its own capabilities and reflect on its own behavior
        – can respond robustly to surprise
      Learning is ubiquitous. Different forms operate at different times and places
      What niche is the LCS community best suited to fill?
  • 11. Some Open Problems for Reinforcement Learning (Sutton) - and Classifier Systems
      Incomplete state information
      Exploration
      Structured states and actions
      Incorporating prior knowledge
      Using teachers
      Theory of RL with function approximators
      Modular and hierarchical architectures
      Integration with other problem-solving and planning methods