Abstract

Reinforcement learning (RL) is a promising direction for automated parking systems (APSs), since integrating planning and tracking control with RL can potentially maximize overall performance. However, commonly used model-free RL requires many interactions to reach acceptable performance, and existing model-based RL in APSs cannot learn continuously. In this paper, a data-efficient RL method is constructed that learns from data using a model-based approach. The proposed method uses a truncated Monte Carlo tree search to evaluate parking states and select moves. Two artificial neural networks, trained on self-generated data, provide the search probability of each tree branch and the final reward for each state. Data efficiency is enhanced by weighting exploration with parking-trajectory returns, an adaptive exploration scheme, and experience augmentation with imaginary rollouts. A novel training pipeline is also used to train the initial action-guidance network and the state-value network without human demonstrations. Compared with path-planning and path-following methods, the proposed integrated method can flexibly coordinate longitudinal and lateral motion to park in a smaller parking space in one maneuver. Its adaptability to changes in the vehicle model is verified by joint CarSim and MATLAB simulation, demonstrating that the algorithm converges within a few iterations. Finally, experiments on a real vehicle platform further verify the effectiveness of the proposed method: compared with obtaining rewards in simulation, it achieves a better final parking attitude and success rate.

Highlights

  • Automated parking systems (APSs) are important due to their great potential to reduce accidents in narrow urban parking spaces and increase parking space use [1,2]

  • A Monte Carlo tree search (MCTS) algorithm implements the method; its data efficiency is improved by an adaptive exploration encouragement factor and by weighted policy learning, which steer network updates toward trajectories with high return

  • A novel design for a reinforcement learning algorithm composed of Monte Carlo tree search and two neural networks was proposed for data-efficient automatic parking


Summary

Introduction

Automated parking systems (APSs) are important due to their great potential to reduce accidents in narrow urban parking spaces and increase parking space use [1,2]. Model-free RL has achieved acceptable control performance for APSs [8]: the algorithm learns to steer by directly trying actions in an attempt to attain the maximum accumulated reward. However, this method requires thousands of real-time interactions before it can be applied, and a large number of trials are necessary to obtain and verify the vehicle model. It also cannot continuously learn from the limited parking data collected on a controlled object with an unknown model to further improve its ability. An MCTS algorithm is used to implement the proposed method; its data efficiency is improved by designing an adaptive exploration encouragement factor and weighted policy learning, which steer network updates toward trajectories with high return.
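The two data-efficiency ideas named above can be sketched in code. The following is a minimal illustration, not the paper's implementation: the PUCT-style score, the schedule of the exploration factor `c_explore`, the `baseline` term, and all function names are assumptions introduced here for clarity.

```python
import math

def puct_score(child_value, child_visits, parent_visits, prior, c_explore):
    """Upper-confidence score balancing the child's estimated value against
    prior-guided exploration; c_explore plays the role of the adaptive
    exploration encouragement factor (its schedule is not shown here)."""
    return child_value + c_explore * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select_action(children, c_explore):
    """Select the tree branch with the highest PUCT score.
    children: list of dicts with keys 'value', 'visits', 'prior'."""
    parent_visits = sum(ch["visits"] for ch in children)
    return max(
        range(len(children)),
        key=lambda i: puct_score(
            children[i]["value"], children[i]["visits"],
            parent_visits, children[i]["prior"], c_explore,
        ),
    )

def weighted_policy_target(visit_counts, trajectory_return, baseline):
    """Return-weighted policy-learning target (assumed form): the normalized
    MCTS visit counts are scaled up when the trajectory's return exceeds a
    baseline, so network updates lean toward high-return trajectories."""
    total = sum(visit_counts)
    weight = 1.0 + max(trajectory_return - baseline, 0.0)
    return [weight * n / total for n in visit_counts]
```

For example, a rarely visited branch with a large prior can outscore a well-explored branch once `c_explore` is large enough, which is exactly the exploration/exploitation trade-off the encouragement factor adapts over training.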

Environmental Perception
Motion Generation
Structure of Automatic Parking System
Problem Definition of Parking
Data-Efficient RL Algorithm Design
Approximate Modified Policy Iteration
Truncated MCTS Guided by Artificial Neural Networks
Data-Efficient Promotion Methods for RL
Policy Learning by Weighting Exploration with Trajectory Returns
Experience Augmentation with Imagination Rollouts
Warm Start with Pre-Trained RL Model
Simulations
Feasibility of the Learning Algorithm
Model Pre-Trained with the Policy Network and MCTS
Complete Training of RL Model
Comparison with Curve-Based Path Planning Method
Data Efficiency Verification During Adaptability to Changes in Vehicle Model
Real Vehicle Experiments
Findings
Conclusions
