Hierarchical POMDP Planning and Execution Joelle Pineau Machine Learning Lunch November 20, 2000
Partially Observable MDP (diagram labels: S1, S2, S3)
The problem
Proposed Approach (action hierarchy diagram labels: Act, InvestigateHealth, Move, Navigate, CheckPulse, AskWhere, Left, Right, Up, Down, CheckMeds)
Hierarchical POMDP Planning
Example
State diagram (labels: M, B, K, E; transition probabilities 0.1 and 0.8)
POMDP:
S_o = {Meds, Kitchen, Bedroom}
A_o = {ClarifyTask, CheckMeds, GoToKitchen, GoToBedroom}
O_o = {Noise, Meds, Kitchen, Bedroom}
Value Function (figure: regions GoToKitchen, ClarifyTask, GoToBedroom, CheckMeds over MedsState, KitchenState, BedroomState)
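As a rough illustration of how a controller for a POMDP like this one tracks its uncertainty, the sketch below applies the standard Bayesian belief update, b'(s') ∝ Pr(o | s', a) Σ_s Pr(s' | s, a) b(s), over the three states of the example. The function name `belief_update`, the identity transition model, and the observation likelihoods are assumptions made for illustration; none of the numbers come from the slides.

```python
# Minimal belief-update sketch for the three-state example.
# All probabilities below are hypothetical placeholders.
STATES = ["Meds", "Kitchen", "Bedroom"]


def belief_update(belief, trans, obs_lik):
    """Return the posterior belief after one action and one observation.

    belief:  {s: Pr(s)} before acting
    trans:   {s: {s2: Pr(s2 | s, action)}} for the action taken
    obs_lik: {s2: Pr(observation | s2, action)} for the observation received
    """
    # Prediction step: push the belief through the transition model.
    predicted = {
        s2: sum(belief[s] * trans[s][s2] for s in STATES) for s2 in STATES
    }
    # Correction step: weight by the observation likelihood, then normalize.
    unnormalized = {s2: obs_lik[s2] * predicted[s2] for s2 in STATES}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s2: value / total for s2, value in unnormalized.items()}


# Assumption: ClarifyTask leaves the state unchanged (identity transitions).
identity = {s: {s2: (1.0 if s == s2 else 0.0) for s2 in STATES} for s in STATES}
uniform = {s: 1.0 / 3.0 for s in STATES}
# Assumption: likelihood of hearing the "Kitchen" observation in each state.
heard_kitchen = {"Meds": 0.25, "Kitchen": 0.5, "Bedroom": 0.25}

posterior = belief_update(uniform, identity, heard_kitchen)
```

Starting from a uniform belief, a single noisy "Kitchen" observation shifts probability mass toward Kitchen without making the controller certain, which is the situation in which a clarification action can be worth its cost.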
Hierarchical POMDP Action Partitioning: Act Move CheckMeds ClarifyTask ClarifyTask GoToKitchen GoToBedroom
Local Value Function and Policy - Move Controller (figure: regions ClarifyTask, GoToKitchen, GoToBedroom over MedsState, KitchenState, BedroomState)
Modeling Abstract Actions
Problem: Need parameters for abstract action Move
Solution: Use the local policy of the corresponding low-level controller
General form: $\Pr(s_j \mid s_i, a_k^{\text{abstract}}) = \Pr(s_j \mid s_i, \text{Policy}(a_k^{\text{abstract}}, s_i))$
Example: $\Pr(s_j \mid \text{MedsState}, \text{Move}) = \Pr(s_j \mid \text{MedsState}, \text{ClarifyTask})$
Policy(Move, s_i): (figure: regions ClarifyTask, GoToKitchen, GoToBedroom over MedsState, KitchenState, BedroomState)
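The general form above can be sketched in a few lines: the transition row of an abstract action in state s_i is copied from the primitive action that the low-level controller's local policy selects in s_i. In the sketch below, only the mapping MedsState -> ClarifyTask is taken from the slide's example; the helper names (`stay`, `go_to`, `abstract_transition`), the other policy entries, and every probability are hypothetical. The policy is indexed by state, as in the slide's formula, rather than by belief.

```python
# Sketch: deriving an abstract action's transition model from a local policy.
# All probabilities and the non-Meds policy entries are hypothetical.
STATES = ["MedsState", "KitchenState", "BedroomState"]


def stay(state):
    """Transition row that remains in `state` with probability 1."""
    return {s: (1.0 if s == state else 0.0) for s in STATES}


def go_to(target, p=0.75):
    """Rows that reach `target` with probability p and otherwise stay put."""
    rows = {}
    for state in STATES:
        if state == target:
            rows[state] = stay(state)
        else:
            rows[state] = {
                s: (p if s == target else (1.0 - p) if s == state else 0.0)
                for s in STATES
            }
    return rows


# Primitive model: T[action][s_i][s_j] = Pr(s_j | s_i, action).
T = {
    "ClarifyTask": {s: stay(s) for s in STATES},
    "GoToKitchen": go_to("KitchenState"),
    "GoToBedroom": go_to("BedroomState"),
}

# Local policy of the Move controller: Policy(Move, s_i).
LOCAL_POLICY = {
    "Move": {
        "MedsState": "ClarifyTask",     # from the slide's example
        "KitchenState": "GoToBedroom",  # hypothetical
        "BedroomState": "GoToKitchen",  # hypothetical
    }
}


def abstract_transition(abstract_action, s_i):
    """Pr(. | s_i, abstract) = Pr(. | s_i, Policy(abstract, s_i))."""
    primitive = LOCAL_POLICY[abstract_action][s_i]
    return T[primitive][s_i]


move_from_meds = abstract_transition("Move", "MedsState")
move_from_kitchen = abstract_transition("Move", "KitchenState")
```

With this construction the higher-level Act controller can treat Move as if it were an ordinary action with its own transition parameters.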
Local Value Function and Policy - Act Controller (figure: regions Move, CheckMeds over MedsState, KitchenState, BedroomState)
Comparing Policies: Hierarchical Policy vs. Optimal Policy (figure legend: ClarifyTask, CheckMeds, GoToKitchen, GoToBedroom)
Bounding the value of the approximation
A real dialogue management example
Actions: AskGoWhere, GoToRoom, GoToKitchen, GoToFollow, VerifyRoom, VerifyKitchen, VerifyFollow, GreetGeneral, GreetMorning, GreetNight, RespondThanks, AskWeatherTime, SayCurrent, SayToday, SayTomorrow, StartMeds, NextMeds, ForceMeds, QuitMeds, AskCallWho, CallHelp, CallNurse, CallRelative, VerifyHelp, VerifyNurse, VerifyRelative, AskHealth, OfferHelp, SayTime
Subtasks (from the hierarchy diagram): Act, CheckHealth, Phone, DoMeds, CheckWeather, Move, Greet
Results:
Final words


Editor's notes

  1. Talk to you about my recent work on ...