Abstract
The inverted pendulum (IP) is a complex nonlinear system characterized by asymmetry and instability. In this paper, the IP system is controlled by a learned deep neural network (DNN) that maps the system states directly to control commands in an end-to-end manner. Building on deep reinforcement learning (DRL), a detailed reward function (DRF) is designed to guide the DNN in learning the control strategy, which makes the control more targeted and flexible. In addition, a two-phase learning protocol (an offline learning phase followed by an online learning phase) is proposed to address the “real gap” problem of the IP system, that is, the mismatch between the simplified model and the real platform. First, the DNN learns an offline control strategy based on a simplified IP dynamic model and the DRF. Then, a security controller is designed and used on the IP platform so that the DNN can be optimized online. The experimental results demonstrate that, after this secondary learning on the platform, the DNN is robust to model errors. When the length of the pendulum is reduced or increased by 25%, the steady-state error of the pendulum angle remains below 0.05 rad, which is within the allowable range, so the DNN is robust to changes in pendulum length. The DRF and the two-phase learning protocol improve the adaptability of the controller to the complex and variable characteristics of the real platform and provide a reference for other learning-based robot control problems.
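The abstract does not give the form of the DRF or of the security controller, so the following is only a minimal sketch of how the two ideas could look in code. Everything in it is an assumption made for illustration: the function names `detailed_reward` and `safe_action`, the quadratic penalty terms, the weights, the angle and position limits, the fallback gains, and the sign convention of the fallback law are all hypothetical and are not taken from the paper.

```python
def detailed_reward(theta, theta_dot, x, x_dot, force,
                    w_theta=1.0, w_theta_dot=0.1,
                    w_x=0.5, w_x_dot=0.05, w_u=0.001):
    """Hypothetical shaped reward (not the paper's DRF).

    Penalizes pendulum angle, angular velocity, cart position,
    cart velocity, and control effort with separate weights, so
    each aspect of the behavior can be tuned on its own.
    """
    cost = (w_theta * theta ** 2
            + w_theta_dot * theta_dot ** 2
            + w_x * x ** 2
            + w_x_dot * x_dot ** 2
            + w_u * force ** 2)
    return -cost


def safe_action(dnn_action, theta, x,
                theta_limit=0.3, x_limit=0.4,
                k_theta=20.0, k_x=2.0, max_force=10.0):
    """Hypothetical safety gate for online learning on hardware.

    Inside the assumed safe region the DNN command is used;
    outside it, a simple linear fallback law takes over. The
    result is always clipped to the assumed actuator range.
    """
    if abs(theta) < theta_limit and abs(x) < x_limit:
        u = dnn_action
    else:
        # Assumed sign convention: positive force corrects positive angle.
        u = k_theta * theta + k_x * x
    return max(-max_force, min(max_force, u))
```

In a setup like this, the reward would be evaluated at every step during both learning phases, while the gate would wrap the DNN output only during the online phase, so that exploratory commands cannot drive the real pendulum or cart outside the region the hardware tolerates.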