A Retrospective Look at
  Classifier System Research



Lashon B. Booker
The MITRE Corporation




                        © 2006 The MITRE Corporation. All rights reserved.
Early Motivations for Learning Classifier
System (LCS) Research

  Design symbolic problem solvers that avoid brittleness in
    realistic (uncertain and continually varying) domains
    involving
    – On-line, adaptive control of behaviors
       Representations and procedures must adjust without unnecessarily
         disrupting existing capabilities
    – Discovering relevant categories in a complex and unlabeled
      stream of input
       Inputs must be incrementally grouped together into plausible classes

  This is especially difficult when behavior requires more
    knowledge representation and processing capability than
    is available with simple empirical associations between
    inputs and outputs


Requirements for Non-Brittle Rule-Based
Behavior

  Need to identify and take advantage of the exploitable
   regularities in the environment
  Generalizations must be selective, pragmatic and subject
   to exceptions
  Learning must be incremental and closely coupled with
   performance and with unfolding reality
  Rules must be treated as tentative hypotheses (not logical
   assertions) subject to testing and confirmation
    – Hypothesis “strength” is derived from experience-based
      predictions of performance
    – Strength is used to determine rule fitness and infer plausibility



Observations about early research
 The Holland and Reitman collaboration placed a strong
   emphasis on cognition and characterized the problems of
   interest
 Viewed classifier systems as symbolic problem solvers that
   avoid brittle behavior (an alternative to expert systems)
   –   Treat rule set as a model and rules as parts in a context
   –   Evaluation of parts is context dependent (i.e., aspects are non-stationary)

 Learning emphasized policy search and value estimation
   –   Rules are policy elements along with performance estimators
   –   Adjust policy via natural selection among rule types
   –   The Pitt approach preserved this idea, using the GA for direct policy search

 Included provisions for motivation, affect and introspection

 These ideas provided the foundation for a comprehensive theory
   of induction (rule clusters, distributed representations,
   associations, spreading activation, etc.)

Influence of reinforcement learning

[Figure: agent-environment loop. The Environment sends state input and scalar feedback to the Learning Agent, which returns an Action.]

  Reinforcement learning problems are faced by agents that must learn
    action sequences from trial-and-error
    – Framework provides attractive formalisms based on estimating value
      functions (with key contributions from Sutton and Barto)
    – Algorithms provide useful benchmarks for comparisons
  Emphasis on value functions has had a strong influence on LCS research
    – The primary niche is learning compact value function representations
      for off-policy temporal difference methods
    – But, the RL community has good alternatives
  It is not clear if we are learning the best generalizations, or giving
    sufficient emphasis to policy improvement

Solution strategies:
• Search the space of possible behaviors
• Estimate utility of taking actions in world states
Value-based generalizations aren’t often intuitive
[Figure: Grefenstette’s 9x32 abstract state space, with a Start state and cell rewards of 0, 50, 75, 125, 250, 500 and 1000.]

       There are many obvious intuitive solution strategies:
        –   E.g., move left or right to the column with the highest reward, then go straight

       Classifier systems tend to learn piecemeal strategies rather than coherent ones
        –   Many narrowly-focused general rules are needed to get the overall solution
        –   Generalizations correspond to symmetries in the reward distribution,
                 e.g., (Row = 111) (Column = #011#) → RIGHT,
            not the key attribute-based concepts
        –   This distinction has been irrelevant in most classifier system test problems (e.g.,
            multiplexor and Woods problems)
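
To see why a condition such as `#011#` reads as a symmetry rather than a concept, it helps to list the columns it covers. The sketch below is illustrative only: it assumes columns are encoded as 5-bit binary numbers (most significant bit first) and that `#` is the usual "don't care" symbol. The helper names `matches` and `covered_columns` are hypothetical and do not come from the slides.

```python
def matches(condition: str, bits: str) -> bool:
    """Return True if a ternary condition matches a bit string ('#' = don't care)."""
    return len(condition) == len(bits) and all(
        c == "#" or c == b for c, b in zip(condition, bits)
    )


def covered_columns(column_condition: str) -> list[int]:
    """List every column index whose binary encoding satisfies the condition.

    Assumption: a column index is encoded as a fixed-width binary string,
    most significant bit first.
    """
    width = len(column_condition)
    return [
        col
        for col in range(2 ** width)
        if matches(column_condition, format(col, f"0{width}b"))
    ]


print(covered_columns("#011#"))  # [6, 7, 22, 23]
```

Under this encoding the rule groups two pairs of columns that sit 16 apart. Such a grouping can share a reward pattern without corresponding to an attribute like "left of the best column".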
Off-policy Methods Learn Different
Behaviors

  Since Q-learning is an off-policy method (i.e., behavior policy may
    differ from estimation policy), it does not suffer negative
    consequences for exploration
  Sarsa (i.e., the bucket brigade) is an on-policy method, so its
    solution accounts for the consequences of exploration
  In real problems where on-line errors are costly, this distinction
    is important
  This also has architectural implications (e.g., how to approximate
    the value function)
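
The difference between the two methods shows up in the target each one bootstraps from. The following is a minimal sketch using the standard tabular forms of the two updates, not code from the talk. The dictionary-based Q table, the state and action names, and the numbers are all made up for illustration.

```python
def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: bootstrap from the best next action,
    whatever the behavior policy actually does next."""
    return reward + gamma * max(q[next_state].values())


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the action actually taken next,
    so exploratory mistakes are reflected in the estimate."""
    return reward + gamma * q[next_state][next_action]


def td_update(q, state, action, target, alpha):
    """Move Q(state, action) a fraction alpha toward the target."""
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical two-state table: in s1, "left" is a costly exploratory move.
q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": -10.0, "right": 5.0},
}

off_policy = q_learning_target(q, reward=1.0, next_state="s1", gamma=0.9)
on_policy = sarsa_target(q, reward=1.0, next_state="s1", next_action="left", gamma=0.9)
print(off_policy, on_policy)
```

When the agent explores by taking `left` in `s1`, the Sarsa target is pulled down by that choice, while the Q-learning target ignores it. This is the sense in which the on-policy solution accounts for the consequences of exploration.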



 Bottom line: we need to identify and build on the strengths of the LCS
    approach. The key may be in specifying a set of organizing principles
    that go beyond implementation diagrams
Soar Architecture of Intelligent Rule-based
Behavior

[Figure: Soar levels of processing. I/O connects to Reaction, Deliberation and Reflection, ordered from low intelligence and faster to high intelligence and slower, with Learning alongside.]



  Derived by Newell and his students (~1980), also as a response
    to the expert system phenomenon
  Based on a theory of problem solving (i.e., problem spaces),
    along with a companion view of learning (i.e., chunking)
  The theory was operationalized as an architecture that has
    served that community well

What kind of architecture makes sense for
classifier systems?

[Figure: actor-critic loop between the policy π and the value functions V, Q. The Critic performs policy evaluation (value learning); the Actor performs policy improvement (greedification); the loop tends toward π*, V*, Q*.]

  The key role of policy improvement suggests that an actor-critic
    structure may be a good start
  The idea is to intermix value iteration and policy improvement
    continually (state by state, action by action, sample by sample)
  Is there an organizing principle that extends this concept to cover
    many forms of induction at different scales? (including perception,
    reasoning, and action)
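
One way to picture "sample by sample" intermixing is a one-step actor-critic update, sketched below. This is a generic textbook-style form, not an architecture proposed in the talk. It uses a temporal-difference critic rather than full value iteration, a softmax actor over action preferences, and a made-up single-state task in which action 1 pays 1 and action 0 pays 0. All names and step sizes are assumptions.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(values, prefs, state, action, reward, next_state, done,
                      alpha_v=0.1, alpha_p=0.1, gamma=0.9):
    """Apply one sample of evaluation (critic) and improvement (actor)."""
    target = reward if done else reward + gamma * values[next_state]
    td_error = target - values[state]
    values[state] += alpha_v * td_error              # critic: value learning
    probs = softmax(prefs[state])
    for a in range(len(probs)):                      # actor: shift preferences
        grad = (1.0 if a == action else 0.0) - probs[a]
        prefs[state][a] += alpha_p * td_error * grad
    return td_error


def run_demo(steps=2000, seed=0):
    """Train on the hypothetical one-state task; return P(action 1)."""
    rng = random.Random(seed)
    values = [0.0]
    prefs = [[0.0, 0.0]]
    for _ in range(steps):
        probs = softmax(prefs[0])
        action = 0 if rng.random() < probs[0] else 1
        reward = float(action)  # assumption: action 1 pays 1, action 0 pays 0
        actor_critic_step(values, prefs, 0, action, reward, next_state=0, done=True)
    return softmax(prefs[0])[1]


print(run_demo())
```

Each call touches the critic and the actor with the same sample, so neither evaluation nor improvement runs to completion before the other starts.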

DARPA/IPTO Focus on Cognitive Systems




       DARPA views a cognitive system as one that
        –   can reason, using substantial amounts of appropriately represented knowledge
        –   can learn from its experience so that it performs better tomorrow than it did today
        –   can explain itself and be told what to do
        –   can be aware of its own capabilities and reflect on its own behavior
        –   can respond robustly to surprise
       Learning is ubiquitous. Different forms operate at different times and places
       What niche is the LCS community best suited to fill?
Some Open Problems for Reinforcement
Learning (Sutton) - and Classifier Systems

    Incomplete state information
    Exploration
    Structured states and actions
    Incorporating prior knowledge
    Using teachers
    Theory of RL with function approximators
    Modular and hierarchical architectures
    Integration with other problem-solving and
     planning methods




                                          © 2006 The MITRE Corporation. All rights reserved.

Contenu connexe

Similaire à A Retrospective Look at A Retrospective Look at Classifier System ResearchClassifier System Research

Dsp black rock us flexible equity fund nfo presentation
Dsp black rock us flexible equity fund   nfo presentationDsp black rock us flexible equity fund   nfo presentation
Dsp black rock us flexible equity fund nfo presentationPrajna Capital
 
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...David Parsons
 
Artefatos para gestão de problemas
Artefatos para gestão de problemasArtefatos para gestão de problemas
Artefatos para gestão de problemasFernando Palma
 
Dsp black rock_us_flexible_equity_fund_nfo_presentation3
Dsp black rock_us_flexible_equity_fund_nfo_presentation3Dsp black rock_us_flexible_equity_fund_nfo_presentation3
Dsp black rock_us_flexible_equity_fund_nfo_presentation3sudhanshuarora1
 
Panel4 carolinarossini
Panel4 carolinarossiniPanel4 carolinarossini
Panel4 carolinarossiniREA Brasil
 
Ethics poll overview for website
Ethics poll overview for websiteEthics poll overview for website
Ethics poll overview for websitePriority Thinking
 
Agile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoopAgile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoopDataWorks Summit
 
Associating the Value of Automation with Project Funding
Associating the Value of Automation with Project FundingAssociating the Value of Automation with Project Funding
Associating the Value of Automation with Project FundingARC Advisory Group
 
Opportunities & Obligations
Opportunities & ObligationsOpportunities & Obligations
Opportunities & ObligationsMartin Rehm
 
Diskontinuierliche Innovation
Diskontinuierliche InnovationDiskontinuierliche Innovation
Diskontinuierliche Innovationintegro
 
Machine Learning for Speech
Machine Learning for Speech Machine Learning for Speech
Machine Learning for Speech butest
 
Relevant Remitter Brochure
Relevant Remitter BrochureRelevant Remitter Brochure
Relevant Remitter Brochureebstlr
 
Strategic management practices in construction
Strategic management practices in constructionStrategic management practices in construction
Strategic management practices in constructionSapri Pamulu, Ph.D
 
Post Modern Investment Management - an Overview
Post Modern Investment Management - an OverviewPost Modern Investment Management - an Overview
Post Modern Investment Management - an Overviewamadei77
 
Post Modern Investment Management
Post Modern Investment ManagementPost Modern Investment Management
Post Modern Investment Managementamadei77
 

Similaire à A Retrospective Look at A Retrospective Look at Classifier System ResearchClassifier System Research (19)

Dsp black rock us flexible equity fund nfo presentation
Dsp black rock us flexible equity fund   nfo presentationDsp black rock us flexible equity fund   nfo presentation
Dsp black rock us flexible equity fund nfo presentation
 
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
The Impact of Methods and Techniques on Outcomes from Agile Software Developm...
 
Artefatos para gestão de problemas
Artefatos para gestão de problemasArtefatos para gestão de problemas
Artefatos para gestão de problemas
 
Dsp black rock_us_flexible_equity_fund_nfo_presentation3
Dsp black rock_us_flexible_equity_fund_nfo_presentation3Dsp black rock_us_flexible_equity_fund_nfo_presentation3
Dsp black rock_us_flexible_equity_fund_nfo_presentation3
 
Determine Exceptions to Validation
Determine Exceptions to ValidationDetermine Exceptions to Validation
Determine Exceptions to Validation
 
Fsna tool
Fsna toolFsna tool
Fsna tool
 
Panel4 carolinarossini
Panel4 carolinarossiniPanel4 carolinarossini
Panel4 carolinarossini
 
Panel 4 carolina rossini
Panel 4  carolina rossiniPanel 4  carolina rossini
Panel 4 carolina rossini
 
Ethics poll overview for website
Ethics poll overview for websiteEthics poll overview for website
Ethics poll overview for website
 
Agile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoopAgile deployment predictive analytics on hadoop
Agile deployment predictive analytics on hadoop
 
Associating the Value of Automation with Project Funding
Associating the Value of Automation with Project FundingAssociating the Value of Automation with Project Funding
Associating the Value of Automation with Project Funding
 
Opportunities & Obligations
Opportunities & ObligationsOpportunities & Obligations
Opportunities & Obligations
 
Diskontinuierliche Innovation
Diskontinuierliche InnovationDiskontinuierliche Innovation
Diskontinuierliche Innovation
 
Machine Learning for Speech
Machine Learning for Speech Machine Learning for Speech
Machine Learning for Speech
 
Relevant Remitter Brochure
Relevant Remitter BrochureRelevant Remitter Brochure
Relevant Remitter Brochure
 
Strategic management practices in construction
Strategic management practices in constructionStrategic management practices in construction
Strategic management practices in construction
 
Robert palmer
Robert palmerRobert palmer
Robert palmer
 
Post Modern Investment Management - an Overview
Post Modern Investment Management - an OverviewPost Modern Investment Management - an Overview
Post Modern Investment Management - an Overview
 
Post Modern Investment Management
Post Modern Investment ManagementPost Modern Investment Management
Post Modern Investment Management
 

Plus de Xavier Llorà

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewXavier Llorà
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with MeandreXavier Llorà
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0Xavier Llorà
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine LearningXavier Llorà
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...Xavier Llorà
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsXavier Llorà
 
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...Xavier Llorà
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...Xavier Llorà
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance ProblemsXavier Llorà
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challengesXavier Llorà
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionXavier Llorà
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier SystemsXavier Llorà
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?Xavier Llorà
 
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableLinkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableXavier Llorà
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsXavier Llorà
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring LanguageXavier Llorà
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...Xavier Llorà
 
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Xavier Llorà
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Xavier Llorà
 

Plus de Xavier Llorà (20)

Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
 
Soaring the Clouds with Meandre
Soaring the Clouds with MeandreSoaring the Clouds with Meandre
Soaring the Clouds with Meandre
 
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
From Galapagos to Twitter: Darwin, Natural Selection, and Web 2.0
 
Large Scale Data Mining using Genetics-Based Machine Learning
Large Scale Data Mining using   Genetics-Based Machine LearningLarge Scale Data Mining using   Genetics-Based Machine Learning
Large Scale Data Mining using Genetics-Based Machine Learning
 
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...Data-Intensive Computing for  Competent Genetic Algorithms:  A Pilot Study us...
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us...
 
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new TrendsScalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
Scalabiltity in GBML, Accuracy-based Michigan Fuzzy LCS, and new Trends
 
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...Pittsburgh Learning Classifier Systems for Protein  Structure Prediction: Sca...
Pittsburgh Learning Classifier Systems for Protein Structure Prediction: Sca...
 
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...Towards a Theoretical  Towards a Theoretical  Framework for LCS  Framework fo...
Towards a Theoretical Towards a Theoretical Framework for LCS Framework fo...
 
Learning Classifier Systems for Class Imbalance Problems
Learning Classifier Systems  for Class Imbalance  ProblemsLearning Classifier Systems  for Class Imbalance  Problems
Learning Classifier Systems for Class Imbalance Problems
 
XCS: Current capabilities and future challenges
XCS: Current capabilities and future  challengesXCS: Current capabilities and future  challenges
XCS: Current capabilities and future challenges
 
Negative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly DetectionNegative Selection for Algorithm for Anomaly Detection
Negative Selection for Algorithm for Anomaly Detection
 
Searle, Intentionality, and the Future of Classifier Systems
Searle, Intentionality, and the  Future of Classifier SystemsSearle, Intentionality, and the  Future of Classifier Systems
Searle, Intentionality, and the Future of Classifier Systems
 
Computed Prediction: So far, so good. What now?
Computed Prediction:  So far, so good. What now?Computed Prediction:  So far, so good. What now?
Computed Prediction: So far, so good. What now?
 
NIGEL 2006 welcome
NIGEL 2006 welcomeNIGEL 2006 welcome
NIGEL 2006 welcome
 
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems TractableLinkage Learning for Pittsburgh LCS: Making Problems Tractable
Linkage Learning for Pittsburgh LCS: Making Problems Tractable
 
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the CloudsMeandre: Semantic-Driven Data-Intensive Flows in the Clouds
Meandre: Semantic-Driven Data-Intensive Flows in the Clouds
 
ZigZag: The Meandring Language
ZigZag: The Meandring LanguageZigZag: The Meandring Language
ZigZag: The Meandring Language
 
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
HUMIES 2007 Bronze Winner: Towards Better than Human Capability in Diagnosing...
 
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
Do not Match, Inherit: Fitness Surrogates for Genetics-Based Machine Learning...
 
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infr...
 

A Retrospective Look at Classifier System Research

  • 1. A Retrospective Look at Classifier System Research Lashon B. Booker The MITRE Corporation © 2006 The MITRE Corporation. All rights reserved.
  • 2. Early Motivations for Learning Classifier System (LCS) Research
      • Design symbolic problem solvers that avoid brittleness in realistic (uncertain and continually varying) domains involving
          – On-line, adaptive control of behaviors: representations and procedures must adjust without unnecessarily disrupting existing capabilities
          – Discovering relevant categories in a complex and unlabeled stream of input: inputs must be incrementally grouped together into plausible classes
      • This is especially difficult when behavior requires more knowledge representation and processing capability than is available with simple empirical associations between inputs and outputs
  • 3. Requirements for Non-Brittle Rule-Based Behavior
      • Need to identify and take advantage of the exploitable regularities in the environment
      • Generalizations must be selective, pragmatic and subject to exceptions
      • Learning must be incremental and closely coupled with performance and with unfolding reality
      • Rules must be treated as tentative hypotheses (not logical assertions) subject to testing and confirmation
          – Hypothesis “strength” is derived from experience-based predictions of performance
          – Strength is used to determine rule fitness and infer plausibility
  • 4. Observations about early research
      • The Holland and Reitman collaboration placed a strong emphasis on cognition and characterized the problems of interest
      • Viewed classifier systems as symbolic problem solvers that avoid brittle behavior (an alternative to expert systems)
          – Treat rule set as a model and rules as parts in a context
          – Evaluation of parts is context dependent (i.e., aspects are non-stationary)
      • Learning emphasized policy search and value estimation
          – Rules are policy elements along with performance estimators
          – Adjust policy via natural selection among rule types
          – The Pitt approach preserved this idea, using the GA for direct policy search
      • Included provisions for motivation, affect and introspection
      • These ideas provided the foundation for a comprehensive theory of induction (rule clusters, distributed representations, associations, spreading activation, etc.)
  • 5. Influence of reinforcement learning
      • Reinforcement learning problems are faced by agents that must learn action sequences from trial-and-error
          – Framework provides attractive formalisms based on estimating value functions (with key contributions from Sutton and Barto)
          – Algorithms provide useful benchmarks for comparisons
      • Emphasis on value functions has had a strong influence on LCS research
          – The primary niche is learning compact scalar value function representations for off-policy temporal difference methods
          – But, the RL community has good alternatives
      • It is not clear if we are learning the best generalizations, or giving sufficient emphasis to policy improvement
      • Solution strategies:
          – Search the space of possible behaviors
          – Estimate utility of taking actions in world states
      • [Figure: agent-environment loop. The Environment sends state input and feedback to the Learning Agent, which returns an Action.]
  • 6. Value-based generalizations aren’t often intuitive
      • [Figure: Grefenstette’s 9x32 abstract state space, with a Start position and cell rewards ranging from 0 to 1000]
      • There are many obvious intuitive solution strategies
          – E.g., move left or right to the column with the highest reward, then go straight
      • Classifier systems tend to learn piecemeal strategies rather than coherent ones
          – Many narrowly-focused general rules are needed to get the overall solution
          – Generalizations correspond to symmetries in the reward distribution, e.g., (Row = 111) (Column = #011#) → RIGHT, not the key attribute-based concepts
          – This distinction has been irrelevant in most classifier system test problems (e.g., multiplexor and Woods problems)
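To make the rule on slide 6 concrete, here is a minimal Python sketch of ternary condition matching, where `#` is a "don't care" symbol. The helper names (`matches`, `covered_values`, `rule_fires`) and the plain binary encoding of column indices are assumptions made for illustration; the slides do not specify the encoding used in Grefenstette's problem.

```python
def matches(condition: str, bits: str) -> bool:
    """True when every non-'#' position of the condition equals the input bit."""
    if len(condition) != len(bits):
        raise ValueError("condition and input must have the same length")
    return all(c == "#" or c == b for c, b in zip(condition, bits))


def covered_values(condition: str) -> list[int]:
    """Integers whose plain binary encoding (an assumed encoding) is matched."""
    width = len(condition)
    return [
        value
        for value in range(2 ** width)
        if matches(condition, format(value, f"0{width}b"))
    ]


# The rule quoted on the slide: (Row = 111) (Column = #011#) -> RIGHT
RULE = {"row": "111", "column": "#011#", "action": "RIGHT"}


def rule_fires(rule: dict, row_bits: str, column_bits: str) -> bool:
    """A rule fires only when both of its conditions match the current state."""
    return matches(rule["row"], row_bits) and matches(rule["column"], column_bits)


# Under the assumed encoding, this single condition covers two separate
# pairs of columns rather than one contiguous region.
print(covered_values(RULE["column"]))
```

Under that assumed encoding, the one condition groups columns that are far apart in the grid, which is one way to read the slide's point that such generalizations follow regularities in reward rather than intuitive spatial concepts.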
  • 7. Off-policy Methods Learn Different Behaviors
      • Since Q-learning is an off-policy method (i.e., behavior policy may differ from estimation policy), it does not suffer negative consequences for exploration
      • Sarsa (i.e., the bucket brigade) is an on-policy method, so its solution accounts for the consequences of exploration
      • In real problems where on-line errors are costly, this distinction is important
      • This also has architectural implications (e.g., how to approximate the value function)
      • Bottom line: we need to identify and build on the strengths of the LCS approach. The key may be in specifying a set of organizing principles that go beyond implementation diagrams
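The distinction on slide 7 comes down to which next-state value each method bootstrapps from. The sketch below shows the standard tabular update targets side by side; the state names, action names, and numeric values are hypothetical and chosen only to make the difference visible.

```python
from collections import defaultdict

ACTIONS = ("LEFT", "RIGHT")


def q_learning_target(q, reward, next_state, gamma):
    """Off-policy target: assumes the greedy action is taken next."""
    return reward + gamma * max(q[(next_state, a)] for a in ACTIONS)


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: uses the action the agent actually takes next."""
    return reward + gamma * q[(next_state, next_action)]


def td_update(q, state, action, target, alpha):
    """Move the current estimate a fraction alpha toward the target."""
    q[(state, action)] += alpha * (target - q[(state, action)])


def fresh_table():
    """Hypothetical values: in s1, LEFT is a costly mistake, RIGHT is good."""
    q = defaultdict(float)
    q[("s1", "LEFT")] = -10.0
    q[("s1", "RIGHT")] = 5.0
    return q


# Same transition (s0, RIGHT, reward 0, s1), followed by an exploratory LEFT.
q_off = fresh_table()
td_update(q_off, "s0", "RIGHT",
          q_learning_target(q_off, 0.0, "s1", gamma=1.0), alpha=0.5)

q_on = fresh_table()
td_update(q_on, "s0", "RIGHT",
          sarsa_target(q_on, 0.0, "s1", "LEFT", gamma=1.0), alpha=0.5)

print(q_off[("s0", "RIGHT")], q_on[("s0", "RIGHT")])
```

In this toy case the Q-learning estimate ignores the exploratory mistake, while the Sarsa estimate is pulled down by it, which is the sense in which the on-policy solution accounts for the consequences of exploration.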
  • 8. Soar Architecture of Intelligent Rule-based Behavior
      • [Figure: levels from I/O through Reaction, Deliberation and Learning to Reflection, running from low intelligence (faster) to high intelligence (slower)]
      • Derived by Newell and his students (~1980), also as a response to the expert system phenomenon
      • Based on a theory of problem solving (i.e., problem spaces), along with a companion view of learning (i.e., chunking)
      • The theory was operationalized as an architecture that has served that community well
  • 9. What kind of architecture makes sense for classifier systems?
      • The key role of policy improvement suggests that an actor-critic structure may be a good start
      • The idea is to intermix value iteration and policy improvement continually (state by state, action by action, sample by sample)
      • Is there an organizing principle that extends this concept to cover many forms of induction at different scales? (including perception, reasoning, and action)
      • [Figure: loop between a Critic (policy evaluation, value learning) and an Actor (policy improvement, greedification), with value labels V, Q and V*, Q*]
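As a rough illustration of the actor-critic idea on slide 9, the sketch below interleaves a critic update (value learning) and an actor update (policy improvement) on every sample. It is a simple tabular scheme, not an LCS architecture from the talk. The two-state chain, the reward of 1.0 for action 1, and all parameter values are hypothetical.

```python
import math
import random

N_STATES = 2      # hypothetical chain: state 0 -> state 1 -> terminal
ACTIONS = (0, 1)  # assumption: action 1 pays reward 1.0, action 0 pays 0.0


def softmax(prefs):
    """Turn action preferences into probabilities (shifted for stability)."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=2000, alpha=0.1, beta=0.1, gamma=1.0, seed=0):
    rng = random.Random(seed)
    # Critic: one value per state; the last entry is the terminal state (stays 0).
    value = [0.0] * (N_STATES + 1)
    # Actor: one preference per (state, action) pair.
    prefs = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        for state in range(N_STATES):
            probs = softmax(prefs[state])
            action = rng.choices(ACTIONS, weights=probs)[0]
            reward = float(action)
            next_state = state + 1
            # One TD error drives both updates, sample by sample.
            td_error = reward + gamma * value[next_state] - value[state]
            value[state] += alpha * td_error         # critic: policy evaluation
            prefs[state][action] += beta * td_error  # actor: policy improvement
    return value, prefs


value, prefs = train()
for state in range(N_STATES):
    print(state, value[state], softmax(prefs[state]))
```

The design point is that neither evaluation nor improvement runs to completion before the other starts; each observed transition nudges both the value estimate and the policy.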
  • 10. DARPA/IPTO Focus on Cognitive Systems
      • DARPA views a cognitive system as one that
          – can reason, using substantial amounts of appropriately represented knowledge
          – can learn from its experience so that it performs better tomorrow than it did today
          – can explain itself and be told what to do
          – can be aware of its own capabilities and reflect on its own behavior
          – can respond robustly to surprise
      • Learning is ubiquitous. Different forms operate at different times and places
      • What niche is the LCS community best suited to fill?
  • 11. Some Open Problems for Reinforcement Learning (Sutton) - and Classifier Systems
      • Incomplete state information
      • Exploration
      • Structured states and actions
      • Incorporating prior knowledge
      • Using teachers
      • Theory of RL with function approximators
      • Modular and hierarchical architectures
      • Integration with other problem-solving and planning methods