16. Deep Reinforcement Learning that Matters
• ICML 2017 reproducibility workshop: "Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control"
• Accepted at AAAI 2018
17. Deep Reinforcement Learning that Matters
• Algorithms evaluated:
– ACKTR (Wu et al. 2017)
– PPO (Schulman et al. 2017)
– DDPG (Lillicrap et al. 2015)
– TRPO (Schulman et al. 2015)
• ACKTR, PPO
• DDPG, TRPO baseline
18. Deep Reinforcement Learning that Matters
• Network Architecture
• Reward Scale
• Random Seeds and Trials
• Environments
• Codebases
• Reporting Evaluation Metrics
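As a rough illustration of the "Random Seeds and Trials" and "Reporting Evaluation Metrics" items, the sketch below reports a mean return together with a percentile-bootstrap confidence interval over several runs. The function name `bootstrap_mean_ci` and the return values are hypothetical placeholders, not results from the paper; this is one common way to report uncertainty, not necessarily the authors' exact procedure.

```python
import random
import statistics


def bootstrap_mean_ci(returns, n_resamples=10_000, confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `returns`.

    `returns` holds one final average return per independent run (seed).
    """
    rng = random.Random(seed)  # fixed seed so the interval itself is reproducible
    n = len(returns)
    # Resample the runs with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(returns, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return statistics.fmean(returns), lower, upper


# Hypothetical final returns from five runs that differ only in random seed.
final_returns = [3120.0, 2874.5, 3410.2, 1985.7, 3056.3]
mean, lower, upper = bootstrap_mean_ci(final_returns)
print(f"mean return {mean:.1f}, 95% bootstrap CI [{lower:.1f}, {upper:.1f}]")
```

Reporting the interval alongside the mean makes it visible when a handful of seeds is too few to separate two methods.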
19. Deep Reinforcement Learning that Matters
• Network Architecture
• Reward Scale
• Random Seeds and Trials
• Environments
• Codebases
• Reporting Evaluation Metrics
Extrinsic factors
20. Deep Reinforcement Learning that Matters
• Network Architecture
• Reward Scale
• Random Seeds and Trials
• Environments
• Codebases
• Reporting Evaluation Metrics
Intrinsic factors
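To make the "Random Seeds and Trials" item concrete, here is a minimal sketch that splits runs of the same algorithm and hyperparameters into two groups by seed and computes Welch's t statistic between them. The helper `welch_t` and all numbers are hypothetical and chosen only for illustration; a large statistic for two groups that differ only in seed suggests that the number of trials is too small to support comparisons between algorithms.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    var_a = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    var_b = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t, dof


# Hypothetical returns: same algorithm and settings, two disjoint sets of seeds.
group_1 = [3120.0, 2874.5, 3410.2, 3056.3, 2990.1]
group_2 = [1985.7, 2210.4, 2450.9, 1875.2, 2105.6]
t, dof = welch_t(group_1, group_2)
print(f"Welch t = {t:.2f} with about {dof:.1f} degrees of freedom")
```

Turning the statistic into a p-value requires a t-distribution CDF, which the standard library does not provide, so that step is left out here.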
44. Deep Reinforcement Learning that Matters
•
– hyperparameter-agnostic algorithms
• “There is often no clear winner among all benchmark environments.”