Nash double Q-based multi-agent deep reinforcement learning for interactive merging strategy in mixed traffic

Lin Li, Wanzhong Zhao, Chunyan Wang, Abbas Fotouhi, Xuze Liu

doi:10.1016/j.eswa.2023.121458

Abstract

The interaction between ramp and mainline vehicles plays a crucial role in merging areas, especially in mixed-traffic environments. The driving behaviours of human drivers are uncertain and diverse, and this uncertainty makes it harder for connected automated vehicles (CAVs) to plan trajectories and merge into the mainline. To overcome this problem, an interactive merging strategy based on multi-agent deep reinforcement learning (MADRL) is designed, enabling the ramp vehicle (a CAV) to consider the dynamic reaction of mainline vehicles. There are two agents in our interactive strategy: one predicts and analyses the behaviour of mainline vehicles (human-driven vehicles, HDVs, or non-connected vehicles), and the other explores optimal merging actions for the ramp vehicle. Firstly, game theory is used to model the competitive behaviours between ramp and mainline vehicles, and the Nash equilibrium of joint actions guides the ramp vehicle to learn the best response to the mainline vehicle. Secondly, the Nash double Q algorithm is developed to ensure that the outputs of the Q networks are trained to converge efficiently to the Nash equilibrium point. The trained Q networks are then used for online control. Finally, our strategy is compared with single-agent RL and existing MADRL algorithms in real on-ramp scenarios. Simulations show that, by analysing human drivers, our strategy successfully coordinates both vehicles, improving driving performance in terms of global safety, efficiency, and comfort.
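The idea of guiding learning by the Nash equilibrium of joint actions, combined with a double-Q style update, can be illustrated with a minimal sketch. This is not the authors' implementation: small NumPy tables stand in for the deep Q networks, only pure-strategy equilibria are searched, and the function names, the tie-breaking rule, and the fallback used when no pure equilibrium exists are all assumptions made for illustration.

```python
import numpy as np

def pure_nash(q_ramp, q_main):
    """List pure-strategy Nash equilibria (i, j) of a two-player game.

    q_ramp[i, j] and q_main[i, j] are the values each agent assigns to the
    joint action where the ramp vehicle plays i and the mainline vehicle plays j.
    """
    # Ramp vehicle's best responses to each mainline action (per column).
    best_ramp = q_ramp == q_ramp.max(axis=0, keepdims=True)
    # Mainline vehicle's best responses to each ramp action (per row).
    best_main = q_main == q_main.max(axis=1, keepdims=True)
    # A joint action is an equilibrium when both are best responses.
    return [tuple(map(int, ij)) for ij in np.argwhere(best_ramp & best_main)]

def select_joint_action(q_ramp, q_main):
    """Pick one joint action; tie-break and fallback are illustrative assumptions."""
    equilibria = pure_nash(q_ramp, q_main)
    if equilibria:
        # Assumption: among several equilibria, prefer the highest summed value.
        return max(equilibria, key=lambda ij: q_ramp[ij] + q_main[ij])
    # Assumption: with no pure equilibrium, fall back to the best summed value.
    flat = int(np.argmax(q_ramp + q_main))
    return tuple(int(k) for k in np.unravel_index(flat, q_ramp.shape))

def nash_double_q_targets(r_ramp, r_main, online_next, target_next, gamma=0.95):
    """Double-Q style targets: the online tables choose the equilibrium joint
    action at the next state, and the target tables evaluate it."""
    i, j = select_joint_action(*online_next)
    target_ramp, target_main = target_next
    return (r_ramp + gamma * target_ramp[i, j],
            r_main + gamma * target_main[i, j])

# Hypothetical next-state values: action 0 = yield, action 1 = go.
online_ramp = np.array([[2.0, 1.0], [3.0, -4.0]])
online_main = np.array([[2.0, 3.0], [2.0, -4.0]])
target_ramp = np.array([[0.0, 0.0], [10.0, 0.0]])
target_main = np.array([[0.0, 0.0], [4.0, 0.0]])

joint = select_joint_action(online_ramp, online_main)
targets = nash_double_q_targets(
    1.0, 0.5, (online_ramp, online_main), (target_ramp, target_main), gamma=0.9
)
```

Separating action selection (online tables) from evaluation (target tables) is the standard double Q-learning device for reducing overestimation; here it is applied to the equilibrium joint action rather than to a single agent's greedy action.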

Full Text