Abstract
This paper proposes a deep deterministic policy gradient (DDPG) algorithm based on clipped double Q-learning and a long short-term memory (LSTM) neural network for the trajectory tracking control of wheeled mobile robots (WMRs). First, to address the overestimation of state-action values in the DDPG algorithm, a double critic network is introduced to approximate the value function, and the minimum of the two estimates is used to compute the target value for the update, which helps prevent the algorithm from falling into a local optimum. Then, to overcome the difficulty caused by partial observability in reinforcement learning control, the actor generates the current action from historical state information, using an LSTM neural network to capture the temporal relations among those historical states. Finally, a reinforcement learning agent based on the improved LSTM-DDPG algorithm is designed as the kinematic controller of the WMR and provides reference values for the dynamic controller (two independent PI controllers). Simulation results verify the effectiveness of the proposed control scheme for WMR trajectory tracking, even in the presence of external disturbances and model parameter uncertainties.
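To make the two key ingredients of the abstract concrete, the following is a minimal sketch (not the authors' code) of an LSTM-based actor that maps a history of observations to a continuous action, and of the clipped double-Q target that takes the minimum of two target critics to curb overestimation. Network sizes, the history length, and all variable names (obs_dim, act_dim, etc.) are illustrative assumptions rather than values from the paper.

```python
import torch
import torch.nn as nn

class LSTMActor(nn.Module):
    """Actor that encodes a sequence of past observations with an LSTM."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Sequential(nn.Linear(hidden, act_dim), nn.Tanh())

    def forward(self, obs_seq):              # obs_seq: (batch, T, obs_dim)
        out, _ = self.lstm(obs_seq)          # process temporal relations in the history
        return self.head(out[:, -1])         # action from the last hidden state

class Critic(nn.Module):
    """Standard state-action value network Q(s, a)."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

def clipped_double_q_target(reward, done, next_obs_seq, target_actor,
                            target_q1, target_q2, gamma=0.99):
    """Target y = r + gamma * min(Q1', Q2')(s', pi'(history)) to reduce overestimation."""
    with torch.no_grad():
        next_act = target_actor(next_obs_seq)
        next_obs = next_obs_seq[:, -1]       # critics evaluated at the latest state (an assumption)
        q1 = target_q1(next_obs, next_act)
        q2 = target_q2(next_obs, next_act)
        return reward + gamma * (1.0 - done) * torch.min(q1, q2)
```

In a training loop, both critics would be regressed toward this single clipped target, while the LSTM actor is updated by ascending one critic's estimate; the exact loss weights, exploration noise, and WMR-specific reward shaping are design choices not specified here.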