Abstract

Decision-making is a key component of the autonomous driving pipeline of perception, decision-making, planning, and control. Once the environmental state has been sensed, the decision-making module determines the high-level behaviors of the ego vehicle (such as lane changing and car following), which are then passed to the downstream planning and control module for low-level execution. Using deep reinforcement learning, specifically the deep Q-network (DQN) and its variants, this paper proposes an integrated lateral and longitudinal decision-making model for autonomous driving in a multilane highway environment shared by autonomous driving vehicles (ADVs) and manual driving vehicles (MDVs). The classic MOBIL and IDM models govern the lateral and longitudinal decisions of MDVs (i.e., lane changing and car following, respectively), while those of ADVs are made by deep reinforcement learning models. A nonlinear kinematic bicycle model and a two-point visual control model provide low-level control for both MDVs and ADVs. With suitable state and action definitions and a suitable reward function, extensive simulation experiments are carried out on the proposed model in a three-lane road environment. The results show that, in this scenario, the proposed model performs well in terms of both driving safety and travel efficiency. Compared with the classical rule-based decision-making model (MOBIL&IDM), it achieves significantly higher episode rewards once training has stabilized. In addition, extensive hyperparameter tuning experiments are used to compare and analyze the performance of the DQN, DDQN, and dueling DQN models under different hyperparameter configurations, which provides a useful reference for applying these models to specific scenarios.
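
As a rough illustration of how the three compared variants differ, the sketch below contrasts the bootstrapped target used by DQN with the one used by DDQN, and shows the value/advantage aggregation used by a dueling architecture. It is a minimal sketch based on the standard textbook definitions of these methods, not the paper's implementation: the function names, the discount factor, and the three-action example (imagined here as "change left", "keep lane", "change right") are assumptions made purely for illustration.

```python
from typing import List, Sequence


def dqn_target(reward: float, next_q_target: Sequence[float],
               gamma: float = 0.99, done: bool = False) -> float:
    """Standard DQN target: r + gamma * max_a Q_target(s', a).

    The target network both selects and evaluates the next action.
    """
    if done:  # terminal transition: no bootstrapping
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward: float, next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.99, done: bool = False) -> float:
    """Double DQN target: the online network selects the next action,
    and the target network evaluates it, which reduces overestimation.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


if __name__ == "__main__":
    # Hypothetical Q-values for three high-level actions in the next state.
    q_online = [3.0, 0.1, 0.2]
    q_target = [0.5, 2.0, 1.0]
    print(dqn_target(1.0, q_target, gamma=0.9))
    print(double_dqn_target(1.0, q_online, q_target, gamma=0.9))
    print(dueling_q_values(1.0, [1.0, 0.0, -1.0]))
```

In this toy example the two targets differ because the online and target networks disagree about which next action is best. Decoupling action selection from action evaluation in this way is what distinguishes DDQN from DQN.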
