Abstract
Reinforcement learning (RL) is a promising direction for automated parking systems (APSs), since integrating planning and tracking control with RL can potentially maximize overall performance. However, commonly used model-free RL requires many interactions to achieve acceptable performance, and model-based RL in APSs cannot learn continuously. In this paper, a data-efficient RL method is constructed that learns from data via a model-based approach. The proposed method uses a truncated Monte Carlo tree search to evaluate parking states and select moves. Two artificial neural networks, trained on self-generated data, provide the search probability of each tree branch and the final reward for each state. Data efficiency is enhanced by weighting exploration with parking-trajectory returns, an adaptive exploration scheme, and experience augmentation with imaginary rollouts. A novel training pipeline is also used to train the initial action-guidance network and the state-value network without human demonstrations. Compared with path-planning and path-following methods, the proposed integrated method can flexibly coordinate longitudinal and lateral motion to park in a smaller parking space in one maneuver. Its adaptability to changes in the vehicle model is verified by joint CarSim and MATLAB simulation, demonstrating that the algorithm converges within a few iterations. Finally, experiments on a real vehicle platform further verify the effectiveness of the proposed method: compared with obtaining rewards from simulation alone, it achieves a better final parking attitude and success rate.
Highlights
Automated parking systems (APSs) are important due to their great potential to reduce accidents in narrow urban parking spaces and increase parking space use [1,2]
The Monte Carlo tree search (MCTS) algorithm is used to implement this method; its data efficiency is improved by an adaptive exploration-encouragement factor and weighted policy learning, which steer network updates toward trajectories with high return
A novel design for a reinforcement learning algorithm composed of Monte Carlo tree search and two neural networks was proposed for data-efficient automatic parking
Summary
Automated parking systems (APSs) are important due to their great potential to reduce accidents in narrow urban parking spaces and increase parking-space utilization [1,2]. Model-free RL has achieved acceptable control performance for APSs [8]: the algorithm learns to steer by directly trying actions in an attempt to attain the maximum accumulative reward. However, this approach requires thousands of real-time interactions in application, and a large number of trials are necessary to obtain and verify the vehicle model. It also cannot continuously learn from the limited parking data collected on a controlled object with an unknown model to further improve its ability. The MCTS algorithm is used to implement the proposed method; its data efficiency is improved by an adaptive exploration-encouragement factor and weighted policy learning, which steer network updates toward trajectories with high return.
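The combination of a tree search guided by an action-guidance (policy) network and a state-value network described above is commonly realized with PUCT-style selection and value backup. The sketch below is a minimal, hypothetical illustration of that pattern, not the paper's implementation; the class names, the exploration constant `c_explore`, and the toy action set are assumptions for the example.

```python
import math

class Node:
    """A search-tree node holding visit statistics and a policy prior."""
    def __init__(self, prior):
        self.prior = prior      # P(s, a): branch probability from the policy network
        self.visits = 0         # N(s, a): visit count
        self.value_sum = 0.0    # W(s, a): cumulative backed-up value
        self.children = {}      # action -> Node

    def q(self):
        """Mean value Q(s, a); zero before the first visit."""
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_explore=1.25):
    """PUCT selection: exploit high mean value Q while exploring branches
    that the policy network rates highly but that have few visits."""
    total = sum(ch.visits for ch in node.children.values())
    def puct(ch):
        u = c_explore * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)
        return ch.q() + u
    return max(node.children.items(), key=lambda kv: puct(kv[1]))

def backup(path, value):
    """Propagate a value-network estimate (stand-in for the final reward
    of a truncated search) along the visited root-to-leaf path."""
    for node in path:
        node.visits += 1
        node.value_sum += value
```

With no visits yet, selection is driven purely by the policy priors, so the search initially follows the action-guidance network; as visits accumulate, the backed-up values take over, which is what lets the truncated search refine the network's suggestions.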