Model-based Reinforcement Learning with Neural Networks on Hierarchical Dynamic System


This document discusses model-based reinforcement learning using neural networks for hierarchical dynamic systems. It proposes using stochastic neural networks to model subsystem dynamics and handle uncertainty. Stochastic differential dynamic programming is also introduced to deal with simulation biases from learned models. Experiments show deep neural networks with differential dynamic programming worked better than other methods for learning a pouring task with a robot.

Model-based Reinforcement Learning
with Neural Networks
on Hierarchical Dynamic System
Akihiko Yamaguchi and Christopher G. Atkeson
Robotics Institute, Carnegie Mellon University http://akihikoy.net/

http://reflectionsintheword.files.wordpress.com/2012/08/pouring-water-into-glass.jpg
http://schools.graniteschools.org/edtech-canderson/files/2013/01/heinz-ketchup-old-bottle.jpg
http://old.post-gazette.com/images2/20021213hosqueeze_230.jpg
http://img.diytrade.com/cdimg/1352823/17809917/0/1292834033/shampoo_bottle_bodywash_bottle.jpg
http://www.nescafe.com/upload/golden_roast_f_711.png
My pizza demonstration: https://youtu.be/Wgj32blPGiE

Pouring: Manipulation of a Deformable Object
Planning actions
Planning parameters of actions
= Dynamic Programming (Opt ctrl, MPC, …)
Dynamics are partially unknown
→ Reinforcement Learning Problem
RL in pouring
Adaptation: not very hard
Generalization: hard
Is Deep NN useful in this problem? (How to use in RL framework?)

Remarks on Reinforcement Learning
Good to think about model-free RL vs. model-based RL
Successful robot-learning RL is model-free (direct policy search) [cf. Kober et al. 2013]
Good at fine-tuning, less computation cost (at execution)
Robust to POMDP
Model-based: Simulation biases
Model-based:
1. Generalization ability
2. Sharable / Reusable
3. Adaptable to reward changes
2 and 3: Thanks to symbolic (hierarchical) representation
Figure: FK ANN (input, hidden, output; update of u) [Magtanong et al. 2012]
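
Point 3 above (a model-based learner can cope with reward changes) can be illustrated with a toy sketch. Everything here is assumed for illustration: `predict_amount` is a hypothetical stand-in for an already learned model mapping a tilt parameter to a poured amount, and planning is a plain grid search rather than the dynamic programming used in the talk. The model stays fixed; only the reward function changes between the two planning calls.

```python
import numpy as np

def predict_amount(tilt):
    """Hypothetical learned model: tilt parameter in [0, 1] -> poured amount."""
    return 1.0 - np.exp(-3.0 * tilt)

def plan(reward, candidates):
    """Pick the candidate action parameter with the highest predicted reward."""
    scores = [reward(predict_amount(t)) for t in candidates]
    return candidates[int(np.argmax(scores))]

candidates = np.linspace(0.0, 1.0, 1001)

# Same model, two different rewards (target amounts 0.3 and 0.6).
tilt_small = plan(lambda amount: -(amount - 0.3) ** 2, candidates)
tilt_large = plan(lambda amount: -(amount - 0.6) ** 2, candidates)
print(tilt_small, tilt_large)
```

No new data is collected when the target changes, which is the property a model-free policy tuned to one reward does not have by default.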

How to deal with simulation biases?
Do not learn dx/dt = F(x,u) (dt: small, e.g. xx ms)
Learn (sub)task-level dynamics
Parameters → F_grasp → Grasp result
Parameters → F_flow_ctrl → Flow ctrl result
Use stochastic models
Gaussian → F → Gaussian
Stochastic Neural Networks [Yamaguchi, Atkeson, ICRA 2016]
Use stochastic dynamic programming
Stochastic Differential Dynamic Programming
[Yamaguchi, Atkeson, Humanoids 2015]
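
The "Gaussian → F → Gaussian" idea can be sketched by chaining two subtask models and pushing a mean and covariance through each. This is a minimal sketch under stated assumptions, not the authors' implementation: `f_grasp` and `f_flow_ctrl` are hypothetical toy functions standing in for learned models, the propagation uses a first-order (linearized) approximation with a finite-difference Jacobian, and the `model_noise` values are made up.

```python
import numpy as np

def jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x."""
    x = np.asarray(x, dtype=float)
    y0 = np.asarray(f(x), dtype=float)
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (np.asarray(f(x + dx)) - np.asarray(f(x - dx))) / (2 * eps)
    return J

def propagate(f, mean, cov, model_noise):
    """First-order propagation: N(mean, cov) -> N(f(mean), J cov J^T + Q)."""
    J = jacobian(f, mean)
    return f(mean), J @ cov @ J.T + model_noise

# Hypothetical stand-ins for learned subtask models (not from the slides).
def f_grasp(p):
    # p = [grasp height, grip width] -> [grasp quality]
    return np.array([np.tanh(p[0]) - 0.5 * p[1] ** 2])

def f_flow_ctrl(s):
    # s = [grasp quality] -> [poured amount]
    return np.array([0.8 * s[0] + 0.1])

mean0 = np.array([0.5, 0.2])
cov0 = np.diag([0.01, 0.01])
mean1, cov1 = propagate(f_grasp, mean0, cov0, np.array([[0.001]]))
mean2, cov2 = propagate(f_flow_ctrl, mean1, cov1, np.array([[0.002]]))
print(mean2, cov2)
```

Because each stage outputs a Gaussian, uncertainty from an early subtask is carried into later ones instead of being dropped, which is what lets a planner account for model error.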

Stochastic Neural Networks
Propagation of probability distribution from input to output
Gradients of output expectation w.r.t. an input
Difficulty: Nonlinear activation functions
ReLU (f(x)=max(0,x))
Figure: mean model and error model sharing the same input
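
To make the ReLU difficulty concrete, here is a sketch for a single unit with a scalar Gaussian input. It uses the standard closed-form moments of max(0, x) for x ~ N(mu, sigma^2); whether the cited ICRA 2016 paper uses this exact form or an approximation is not stated on the slide, so treat it as an illustration only. The function names are hypothetical.

```python
import math

def relu_gaussian_moments(mu, sigma):
    """Mean and variance of max(0, x) for x ~ N(mu, sigma^2), sigma > 0."""
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    mean = mu * cdf + sigma * pdf
    second = (mu * mu + sigma * sigma) * cdf + mu * sigma * pdf
    return mean, second - mean * mean

def relu_mean_grad(mu, sigma):
    """d E[max(0, x)] / d mu, which equals the standard normal CDF at mu / sigma."""
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))

print(relu_gaussian_moments(0.0, 1.0), relu_mean_grad(0.0, 1.0))
```

The output of a ReLU is no longer Gaussian, so a layer-by-layer scheme of this kind has to re-approximate the result as Gaussian before the next layer.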

Use Case
Independent neural networks for each (sub)dynamical system

Stochastic Differential Dynamic Programming

Results of Experiments
DNN+DDP was better than LWR+DDP
Using redundant features did not affect the learning performance
Worked in pouring with PR2 robot
Video: https://youtu.be/aM3hE1J5W98

More Information
http://akihikoy.net/
https://www.youtube.com/AkihikoYamaguchi
Akihiko Yamaguchi and Christopher G. Atkeson: Neural Networks and Differential Dynamic Programming for Reinforcement Learning Problems, in Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA2016), Stockholm, Sweden, May, 2016.
https://www.researchgate.net/publication/294729454
Akihiko Yamaguchi and Christopher G. Atkeson: Differential Dynamic Programming with Temporally Decomposed Dynamics, in Proceedings of the 15th IEEE-RAS International Conference on Humanoid Robots (Humanoids2015), pp. 696-703, Seoul, 2015.
https://www.researchgate.net/publication/282157952
Akihiko Yamaguchi, Christopher G. Atkeson, and Tsukasa Ogasawara: Pouring Skills with Planning and Learning Modeled from Human Demonstrations, International Journal of Humanoid Robotics, Vol.12, No.3, pp.1550030, July, 2015.
https://www.researchgate.net/publication/280733055

Slide 3 video: https://youtu.be/GjwfbOur3CQ
