Emotion-Driven
Reinforcement Learning
Bob Marinier & John Laird
University of Michigan, Computer Science and Engineering
CogSci’08




Introduction
• Interested in the functional benefits of emotion
  for a cognitive agent
 ▫ Appraisal theories of emotion
 ▫ PEACTIDM theory of cognitive control
• Use emotion as a reward signal to a
  reinforcement learning agent
 ▫ Demonstrates a functional benefit of emotion
 ▫ Provides a theory of the origin of intrinsic reward




Outline
• Background
 ▫ Integration of emotion and cognition
 ▫ Integration of emotion and reinforcement learning
 ▫ Implementation in Soar
• Learning task
• Results



Appraisal Theories of Emotion
 • A situation is evaluated along a number of appraisal
   dimensions, many of which relate the situation to
   current goals
   ▫ Novelty, goal relevance, goal conduciveness, expectedness,
     causal agency, etc.
 • Appraisals influence emotion
 • Emotion can then be coped with (via internal or
   external actions)
[Figure: cycle linking Situation and Goals, Appraisals, Emotion, and Coping]


Appraisals to Emotions (Scherer 2001)

Appraisal                      Joy                  Fear           Anger
Suddenness                     High/medium          High           High
Unpredictability               High                 High           High
Intrinsic pleasantness                              Low
Goal/need relevance            High                 High           High
Cause: agent                                        Other/nature   Other
Cause: motive                  Chance/intentional                  Intentional
Outcome probability            Very high            High           Very high
Discrepancy from expectation                        High           High
Conduciveness                  Very high            Low            Low
Control                                                            High
Power                                               Very low       High
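
As a rough illustration of how the profiles in the table above could be used, the Python sketch below scores an observed set of appraisals against each emotion column and returns the best match. The dictionary keys and the functions `match_score` and `closest_emotion` are hypothetical names, and exact-match counting is an assumption made for illustration, not a scoring rule taken from Scherer (2001) or from the authors' model.

```python
from typing import Dict

# Appraisal profiles transcribed from the table above; blank cells are omitted.
PROFILES: Dict[str, Dict[str, str]] = {
    "joy": {
        "suddenness": "high/medium",
        "unpredictability": "high",
        "goal_relevance": "high",
        "cause_motive": "chance/intentional",
        "outcome_probability": "very high",
        "conduciveness": "very high",
    },
    "fear": {
        "suddenness": "high",
        "unpredictability": "high",
        "intrinsic_pleasantness": "low",
        "goal_relevance": "high",
        "cause_agent": "other/nature",
        "outcome_probability": "high",
        "discrepancy": "high",
        "conduciveness": "low",
        "power": "very low",
    },
    "anger": {
        "suddenness": "high",
        "unpredictability": "high",
        "goal_relevance": "high",
        "cause_agent": "other",
        "cause_motive": "intentional",
        "outcome_probability": "very high",
        "discrepancy": "high",
        "conduciveness": "low",
        "control": "high",
        "power": "high",
    },
}


def match_score(observed: Dict[str, str], profile: Dict[str, str]) -> float:
    """Fraction of the profile's specified dimensions matched exactly (assumed rule)."""
    hits = sum(1 for dim, value in profile.items() if observed.get(dim) == value)
    return hits / len(profile)


def closest_emotion(observed: Dict[str, str]) -> str:
    """Return the emotion label whose profile has the highest match score."""
    return max(PROFILES, key=lambda name: match_score(observed, PROFILES[name]))
```

The model presented in the later slides works with appraisal values directly rather than with category labels, so a lookup like this is only a way to read the table, not a step in the agent.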



Cognitive Control: PEACTIDM (Newell 1990)
Perceive    Obtain raw perception
Encode      Create domain-independent representation
Attend      Choose stimulus to process
Comprehend  Generate structures that relate stimulus to tasks and can be
            used to inform behavior
Task        Perform task maintenance
Intend      Choose an action, create prediction
Decode      Decompose action into motor commands
Motor       Execute motor commands
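
The stages listed above can be written down as an ordered enumeration. The sketch below is a minimal Python rendering; the `Stage` enum and `next_stage` helper are hypothetical names, and stepping through the stages in a fixed, wrapping order is a simplifying assumption (the agent described later chooses among some of these steps).

```python
from enum import Enum


class Stage(Enum):
    """PEACTIDM stages, with the descriptions given on the slide."""
    PERCEIVE = "Obtain raw perception"
    ENCODE = "Create domain-independent representation"
    ATTEND = "Choose stimulus to process"
    COMPREHEND = "Generate structures that relate stimulus to tasks"
    TASK = "Perform task maintenance"
    INTEND = "Choose an action, create prediction"
    DECODE = "Decompose action into motor commands"
    MOTOR = "Execute motor commands"


def next_stage(stage: Stage) -> Stage:
    """Assumed fixed ordering: advance one stage, wrapping from Motor to Perceive."""
    stages = list(Stage)  # Enum iteration preserves definition order
    return stages[(stages.index(stage) + 1) % len(stages)]
```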



Unification of PEACTIDM and Appraisal Theories

[Figure: PEACTIDM cycle annotated with the appraisals it generates.
 Cycle: Perceive -> Raw Perceptual Information -> Encode -> Stimulus Relevance
 -> Attend -> Stimulus chosen for processing -> Comprehend -> Current Situation
 Assessment -> Intend -> Action -> Decode -> Motor Commands -> Motor ->
 Environmental Change -> Perceive.
 Appraisals shown: Suddenness, Unpredictability, Goal Relevance, Intrinsic
 Pleasantness; Causal Agent/Motive, Discrepancy, Conduciveness, Control/Power;
 Outcome Probability. Also labeled: Prediction.]




Distinction between emotion, mood, and feeling
(Marinier & Laird 2007)
  • Emotion: Result of appraisals
    ▫ Is about the current situation
  • Mood: “Average” over recent emotions
    ▫ Provides historical context
  • Feeling: Emotion “+” Mood
    ▫ What agent actually perceives
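
To make the three terms concrete, the Python sketch below treats emotion, mood, and feeling as vectors of appraisal values. The slide puts “Average” and “+” in quotes, so the actual functions are presumably more involved; the exponential moving average in `update_mood`, its `rate` parameter, and the clipped sum in `feeling` are all assumptions chosen for simplicity, not the authors' definitions.

```python
from typing import List

Vector = List[float]


def update_mood(mood: Vector, emotion: Vector, rate: float = 0.1) -> Vector:
    """Assumed 'average' over recent emotions: move mood a fraction toward emotion."""
    return [m + rate * (e - m) for m, e in zip(mood, emotion)]


def feeling(emotion: Vector, mood: Vector) -> Vector:
    """Assumed 'emotion + mood': element-wise sum clipped to [-1, 1]."""
    return [max(-1.0, min(1.0, e + m)) for e, m in zip(emotion, mood)]


# Usage: when the current emotion is neutral, the feeling still carries the mood.
mood = update_mood([0.0, 0.0], [1.0, -1.0], rate=0.5)
print(feeling([0.0, 0.0], mood))
```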

 Intrinsically Motivated Reinforcement Learning
 (Sutton & Barto 1998; Singh et al. 2004)
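[Figure: two agent-environment diagrams. Standard view: the Environment contains
 the Critic, which sends Rewards and States to the Agent and receives Actions.
 Intrinsically motivated view: the External Environment exchanges Actions and
 Sensations with an Internal Environment inside the “Organism”; there the
 Appraisal Process takes the place of the Critic and sends Rewards (+/- Feeling
 Intensity) and States to the Agent, which returns Decisions.]

• Reward = Intensity * Valence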
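
The reward formula above can be dropped into an ordinary temporal-difference update. The Python sketch below is an assumption-laden illustration: `feeling_reward` and `q_update` are hypothetical names, intensity is assumed to lie in [0, 1] and valence in [-1, 1], and the update is generic tabular Q-learning rather than Soar's own reinforcement learning mechanism.

```python
from typing import Dict, Hashable, Sequence, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def feeling_reward(intensity: float, valence: float) -> float:
    """Reward = Intensity * Valence (value ranges are assumed, see lead-in)."""
    return intensity * valence


def q_update(q: QTable, state: Hashable, action: Hashable, reward: float,
             next_state: Hashable, actions: Sequence[Hashable],
             alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Generic tabular Q-learning step; unseen state-action pairs default to 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


# Usage: a mildly intense, positive feeling rewards the action just taken.
q: QTable = {}
q_update(q, "s0", "south", feeling_reward(0.8, 1.0), "s1", ["north", "south"])
```

Because the reward comes from the agent's own appraisals, every decision can receive a nonzero signal, not only the one that reaches the goal.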


Extending Soar with Emotion
(Marinier & Laird 2007)
[Figure: Soar architecture. Symbolic Long-Term Memories (Procedural, Semantic,
 Episodic) with their learning mechanisms (Reinforcement Learning, Chunking,
 Semantic Learning, Episodic Learning); Short-Term Memory holding Situation and
 Goals; Appraisal Detector; Decision Procedure; Perception, Visual Imagery, and
 Action, connected to the Body.]


Extending Soar with Emotion
(Marinier & Laird 2007)
[Figure: the same Soar architecture with the Appraisal Detector expanded.
 Emotion (.5,.7,0,-.4,.3,…) and Mood (.7,-.2,.8,.3,.6,…) feed into
 Feeling (.9,.6,.5,-.1,.8,…); Feelings appear in Short-Term Memory alongside
 Situation and Goals. Legend: Knowledge, Architecture.]



Learning task


[Figure: task environment with the Start and Goal locations marked]



Learning task: Encoding
Direction  Passable  On path  Progress
North      false     false    true
East       false     true     true
South      true      true     true
West       false     false    true



Learning task: Encoding & Appraisal
Direction  Intrinsic Pleasantness  Goal Relevance  Unpredictability
North      Low                     Low             High
East       Low                     High            High
South      Neutral                 High            Low
West       Low                     Low             High
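
Taken together, the Encoding slide and the appraisal values above suggest a mapping from encoded features to appraisals. The Python sketch below shows one rule that happens to reproduce the four directions on these slides, assuming both slides describe the same situation. The `Encoding` class and `appraise` function are hypothetical, and the rule is inferred from these four examples only; the agent's actual appraisal knowledge is not given in the slides and may differ.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Encoding:
    """Domain-independent features of one direction, as on the Encoding slide."""
    passable: bool
    on_path: bool
    progress: bool


def appraise(enc: Encoding) -> Dict[str, str]:
    """Assumed rule, inferred from the four examples on the slides."""
    return {
        "intrinsic_pleasantness": "neutral" if enc.passable else "low",
        "goal_relevance": "high" if enc.on_path else "low",
        "unpredictability": "low" if enc.passable else "high",
    }


# The four directions as encoded on the slide.
DIRECTIONS = {
    "north": Encoding(passable=False, on_path=False, progress=True),
    "east": Encoding(passable=False, on_path=True, progress=True),
    "south": Encoding(passable=True, on_path=True, progress=True),
    "west": Encoding(passable=False, on_path=False, progress=True),
}

for name, enc in DIRECTIONS.items():
    print(name, appraise(enc))
```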


Learning task: Attending, Comprehending & Appraisal

South
Intrinsic Pleasantness: Neutral
Goal Relevance: High
Unpredictability: Low
Conduciveness: High
Control: High …



Learning task: Tasking
[Figure: Optimal Subtasks]




What is being learned?
• When to Attend vs. Task
• If Attending, what to Attend to
• If Tasking, which subtask to create
• When to Intend vs. Ignore
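
Each item in the list above is a choice among alternatives whose values are learned. The Python sketch below shows one common way to make such a choice, epsilon-greedy selection over learned values. The slides do not state the exploration policy, so epsilon-greedy, the function name `epsilon_greedy`, and the example values are all assumptions.

```python
import random
from typing import Dict


def epsilon_greedy(q_values: Dict[str, float], epsilon: float,
                   rng: random.Random) -> str:
    """With probability epsilon pick a random option, otherwise the best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q_values))  # sorted for a reproducible order
    return max(q_values, key=q_values.get)


# Usage: hypothetical learned values for the "Attend vs. Task" decision.
rng = random.Random(0)
choice = epsilon_greedy({"attend": 0.2, "task": 0.7}, epsilon=0.0, rng=rng)
```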


Learning Results
[Figure: line chart of Median Processing Cycles (y-axis, 0 to 12000) by Episode
 (x-axis, 1 to 15) for three conditions: Standard RL, Feeling=Emotion, and
 Feeling=Emotion+Mood]




Results: With and without mood
[Figure: line chart of Median Processing Cycles (y-axis, 240 to 300) by Episode
 (x-axis, 8 to 15) for Feeling=Emotion and Feeling=Emotion+Mood, with the
 Optimal level shown for reference]




Discussion
• Agent learns both internal (tasking) and external
  (movement) actions
• Emotion provides more frequent rewards, so the
  agent learns faster than with standard RL
• Mood “fills in the gaps”, allowing for even faster
  learning and less variability




Conclusion & Future Work
• Demonstrated computational model that integrates
  emotion and cognitive control
• Confirmed emotion can drive reinforcement learning
• We have already successfully demonstrated similar
  learning in a more complex domain
• Would like to explore multi-agent scenarios

Histor y of HAM Radio presentation slide
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Marinier Laird Cogsci 2008 Emotionrl Pres

  • 1. Emotion-Driven Reinforcement Learning Bob Marinier & John Laird University of Michigan, Computer Science and Engineering CogSci’08
  • 2. Introduction
      • Interested in the functional benefits of emotion for a cognitive agent
          ▫ Appraisal theories of emotion
          ▫ PEACTIDM theory of cognitive control
      • Use emotion as a reward signal to a reinforcement learning agent
          ▫ Demonstrates a functional benefit of emotion
          ▫ Provides a theory of the origin of intrinsic reward
  • 3. Outline
      • Background
          ▫ Integration of emotion and cognition
          ▫ Integration of emotion and reinforcement learning
          ▫ Implementation in Soar
      • Learning task
      • Results
  • 4. Appraisal Theories of Emotion
      • A situation is evaluated along a number of appraisal dimensions, many of which relate the situation to current goals
          ▫ Novelty, goal relevance, goal conduciveness, expectedness, causal agency, etc.
      • Appraisals influence emotion
      • Emotion can then be coped with (via internal or external actions)
      [Figure: cycle linking Situation and Goals, Appraisals, Emotion, and Coping]
  • 5. Appraisals to Emotions (Scherer 2001)

      Appraisal                      Joy                 Fear          Anger
      Suddenness                     High/medium         High          High
      Unpredictability               High                High          High
      Intrinsic pleasantness         -                   Low           -
      Goal/need relevance            High                High          High
      Cause: agent                   -                   Other/nature  Other
      Cause: motive                  Chance/intentional  -             Intentional
      Outcome probability            Very high           High          Very high
      Discrepancy from expectation   -                   High          High
      Conduciveness                  Very high           Low           Low
      Control                        -                   -             High
      Power                          -                   Very low      High
  • 6. Cognitive Control: PEACTIDM (Newell 1990)
      Perceive: Obtain raw perception
      Encode: Create domain-independent representation
      Attend: Choose stimulus to process
      Comprehend: Generate structures that relate stimulus to tasks and can be used to inform behavior
      Task: Perform task maintenance
      Intend: Choose an action, create prediction
      Decode: Decompose action into motor commands
      Motor: Execute motor commands
  • 7. Unification of PEACTIDM and Appraisal Theories
      [Figure: the PEACTIDM cycle annotated with the data passed between steps and the appraisals produced along the way. Perceive → Raw Perceptual Information → Encode → Stimulus Relevance (Suddenness, Unpredictability, Goal Relevance, Intrinsic Pleasantness) → Attend → Stimulus chosen for processing → Comprehend → Current Situation Assessment (Causal Agent/Motive, Discrepancy, Conduciveness, Control/Power) → Intend → Action and Prediction (Outcome Probability) → Decode → Motor Commands → Motor → Environmental Change → Perceive]
  • 8. Distinction between emotion, mood, and feeling (Marinier & Laird 2007)
      • Emotion: Result of appraisals
          ▫ Is about the current situation
      • Mood: “Average” over recent emotions
          ▫ Provides historical context
      • Feeling: Emotion “+” Mood
          ▫ What agent actually perceives
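The three-way distinction above can be sketched in code. This is a minimal illustration only: the slide puts both “Average” and “+” in quotation marks, so the exact functions are not given here. The sketch assumes a plain arithmetic mean over a fixed window and a clipped element-wise sum over appraisal values in [-1, 1]. The names `MoodTracker`, `window`, and `feeling` are hypothetical.

```python
from collections import deque


class MoodTracker:
    """Keeps a sliding window of recent emotion vectors (assumed design)."""

    def __init__(self, window=5):
        # Only the last `window` emotions contribute to the mood.
        self.recent = deque(maxlen=window)

    def update(self, emotion):
        """Record the latest emotion and return the current mood."""
        self.recent.append(list(emotion))
        n = len(self.recent)
        # Mood: element-wise mean of the recent emotions (assumption).
        return [sum(dim) / n for dim in zip(*self.recent)]


def feeling(emotion, mood):
    """Combine emotion and mood; a clipped sum stands in for the slide's "+"."""
    return [max(-1.0, min(1.0, e + m)) for e, m in zip(emotion, mood)]
```

With this reading, an agent whose current emotion is neutral still perceives a non-neutral feeling while its recent history is non-neutral, which is the "historical context" role the slide gives to mood.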
  • 9. Intrinsically Motivated Reinforcement Learning (Sutton & Barto 1998; Singh et al. 2004)
      [Figure: agent-environment diagrams. Labels: Environment, Critic, and Agent, with Actions, Rewards, and States flowing between them; and External Environment, Internal Environment (containing the Critic), and Agent inside an “Organism”, with Actions, Sensations, Decisions, Rewards, and States. Overlaid labels: Appraisal Process, +/- Feeling Intensity]
      • Reward = Intensity * Valence
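The reward rule on this slide, Reward = Intensity * Valence, can be shown feeding a learning update. The sketch below is an assumption-laden illustration: the value ranges (intensity in [0, 1], valence in [-1, 1]) are assumed, the slides do not say how intensity and valence are computed from a feeling, and the update shown is generic tabular Q-learning used as a stand-in rather than the mechanism used in Soar. `feeling_reward` and `q_update` are hypothetical names.

```python
def feeling_reward(intensity, valence):
    """Reward = Intensity * Valence, as stated on the slide.

    Assumed ranges: intensity in [0, 1], valence in [-1, 1].
    """
    return intensity * valence


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (a stand-in, not the Soar mechanism)."""
    # Value of the best action available in the next state (0.0 if unseen).
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

Because a feeling is available on every cycle, a reward computed this way can be supplied at every step rather than only when the task ends.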
  • 10. Extending Soar with Emotion (Marinier & Laird 2007)
      [Figure: Soar architecture diagram. Symbolic Long-Term Memories (Procedural, Semantic, Episodic) with their learning mechanisms (Reinforcement Learning, Chunking, Semantic Learning, Episodic Learning); Short-Term Memory holding Situation and Goals; Decision Procedure; Appraisal Detector; Visual Imagery; Perception, Action, Body]
  • 11. Extending Soar with Emotion (Marinier & Laird 2007)
      [Figure: the same Soar architecture diagram, with the emotion components filled in. The Appraisal Detector produces an Emotion (.5,.7,0,-.4,.3,…), which is combined with Mood (.7,-.2,.8,.3,.6,…) into a Feeling (.9,.6,.5,-.1,.8,…); Feelings appear in Short-Term Memory alongside Situation and Goals. A legend distinguishes Knowledge from Architecture]
  • 13. Learning task: Encoding

      Direction  Passable  On path  Progress
      North      false     false    true
      East       false     true     true
      West       false     false    true
      South      true      true     true
  • 14. Learning task: Encoding & Appraisal

      Direction  Intrinsic Pleasantness  Goal Relevance  Unpredictability
      North      Low                     Low             High
      East       Low                     High            High
      West       Low                     Low             High
      South      Neutral                 High            Low
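Comparing the encoded features on the previous slide with the appraisals on this one suggests a simple correspondence: directions that are not passable receive low intrinsic pleasantness, and directions on the path receive high goal relevance. The sketch below encodes that reading. The rule is inferred from the four examples on the slides and is not stated by the authors, and unpredictability is passed in directly because the slides do not show how it is derived. `Stimulus` and `appraise` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Stimulus:
    """Encoded features for one direction, as listed on the Encoding slide."""
    direction: str
    passable: bool
    on_path: bool
    progress: bool


def appraise(stimulus, unpredictability):
    """Map encoded features to appraisal labels (rule inferred, not confirmed)."""
    return {
        # Blocked directions are appraised as unpleasant; open ones as neutral.
        "intrinsic_pleasantness": "Neutral" if stimulus.passable else "Low",
        # Directions on the path to the goal are appraised as relevant.
        "goal_relevance": "High" if stimulus.on_path else "Low",
        # Supplied by the caller; its derivation is not shown on the slides.
        "unpredictability": unpredictability,
    }
```

For example, `appraise(Stimulus("South", True, True, True), "Low")` yields neutral pleasantness and high goal relevance, matching the South entry on the slide.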
  • 15. Learning task: Attending, Comprehending & Appraisal
      South
          Intrinsic Pleasantness: Neutral
          Goal Relevance: High
          Unpredictability: Low
          Conduciveness: High
          Control: High
          …
  • 17. Learning task: Tasking
      [Figure: Optimal Subtasks]
  • 18. What is being learned?
      • When to Attend vs. Task
      • If Attending, what to Attend to
      • If Tasking, which subtask to create
      • When to Intend vs. Ignore
  • 19. Learning Results
      [Figure: line chart of Median Processing Cycles (y-axis, 0 to 12000) by Episode (x-axis, 1 to 15) for three conditions: Standard RL, Feeling=Emotion, and Feeling=Emotion+Mood]
  • 20. Results: With and without mood
      [Figure: line chart of Median Processing Cycles (y-axis, 240 to 300) by Episode (x-axis, 8 to 15) for Feeling=Emotion and Feeling=Emotion+Mood, with the Optimal level marked]
  • 21. Discussion
      • Agent learns both internal (tasking) and external (movement) actions
      • Emotion provides more frequent rewards, so the agent learns faster than with standard RL
      • Mood “fills in the gaps”, allowing for even faster learning and less variability
  • 22. Conclusion & Future Work
      • Demonstrated computational model that integrates emotion and cognitive control
      • Confirmed emotion can drive reinforcement learning
      • We have already successfully demonstrated similar learning in a more complex domain
      • Would like to explore multi-agent scenarios