Abstract

Robots with nonlinear and stochastic dynamics challenge optimal control methods that rely on an analytical model. Model-free reinforcement learning algorithms have shown their potential for robot learning control without an analytical or statistical dynamics model, but their need for numerous samples hinders practical application. Model-based reinforcement learning, which combines dynamics model learning with model predictive control, offers a promising way to control robots with complex dynamics. Robot exploration generates diverse data for dynamics model learning, while model predictive control exploits the approximated model to select optimal actions, creating a dilemma between exploration and exploitation. Uncertainty can direct the robot's exploration, yielding a better exploration-exploitation trade-off. In this paper, we propose Model Predictive Control with Posterior Sampling (PSMPC) to enable the robot to learn control efficiently. PSMPC approximately samples from the posterior over dynamics models and applies model predictive control under the sampled model, achieving uncertainty-directed exploration. To reduce the computational complexity of the resulting controller, we also propose a PSMPC-guided policy optimization algorithm. Simulation results in the high-fidelity simulator MuJoCo show the effectiveness of the proposed robot learning control scheme.
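To make the idea concrete, the sketch below illustrates one plausible reading of a PSMPC step: a model is drawn from an approximate posterior (here, a bootstrapped ensemble, a common approximation; the paper's exact posterior is not specified in the abstract) and a random-shooting model predictive controller plans under that sampled model. All names (`EnsembleDynamics`, `psmpc_action`, `reward_fn`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class EnsembleDynamics:
    """Bootstrapped ensemble of dynamics models as a stand-in posterior.

    Assumption: each member was fit on a bootstrap resample of the replay
    data, so drawing a random member approximates sampling a dynamics
    model from the posterior (Thompson-sampling style).
    """

    def __init__(self, models):
        self.models = models  # each model: f(state, action) -> next_state

    def sample_model(self, rng):
        # Posterior-sampling step: pick one plausible dynamics model.
        return self.models[rng.integers(len(self.models))]


def psmpc_action(state, ensemble, reward_fn, action_dim, rng,
                 horizon=15, n_candidates=500):
    """One PSMPC step: sample a model, then random-shooting MPC under it.

    `reward_fn(state, action)` is an assumed known reward function. The
    first action of the best simulated sequence is returned and the rest
    is discarded (receding-horizon control).
    """
    model = ensemble.sample_model(rng)
    # Candidate open-loop action sequences, uniform in [-1, 1].
    plans = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, action_dim))
    returns = np.zeros(n_candidates)
    for i, plan in enumerate(plans):
        s = state
        for a in plan:
            returns[i] += reward_fn(s, a)
            s = model(s, a)  # roll out under the sampled model
    return plans[np.argmax(returns), 0]
```

Because the model is resampled at each decision step, actions are optimistic with respect to one coherent hypothesis about the dynamics rather than an averaged model, which is what drives the uncertainty-directed exploration described above.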
