Cooperation at unsignalized intersections in mixed traffic environments, where Connected and Autonomous Vehicles (CAVs) and Manually Driving Vehicles (MVs) coexist, holds promise for improving safety, efficiency, and energy savings. However, mixed traffic at unsignalized intersections presents major challenges, including the uncertainty of MV behavior, chain reactions among vehicles, and diverse interactions. Following the idea of situation-aware cooperation, this paper proposes a Reasoning Graph-based Reinforcement Learning (RGRL) method, which integrates a Graph Neural Network (GNN) based policy with an environment that provides mixed traffic with uncertain behaviors. First, the observed scenario is represented graphically as a situation, using an interaction graph with connected but uncertain (bi-directional) edges. The situation reasoning process is formulated as a Reasoning Graph-based Markov Decision Process, which infers the vehicle sequence stage by stage so as to depict the entire situation sequentially. Then, a GNN-based policy is constructed, which uses Graph Convolutional Networks (GCN) to capture the interrelated chain reactions and Graph Attention Networks (GAT) to weigh the diverse interactions. Furthermore, an environment block is developed for training the policy; it provides trajectory generators for both CAVs and MVs, together with a reward function that considers social compliance, collision avoidance, efficiency, and energy savings. Finally, three Reinforcement Learning methods, D3QN, PPO, and SAC, are implemented for comparative tests to explore the applicability and strengths of the framework. The test results show that D3QN outperformed the other two methods, reaching a larger converged reward at a similar convergence speed. Compared to multi-agent RL (MARL), the RGRL approach showed statistically superior performance, reducing the number of severe conflicts by 77.78–94.12 %. RGRL also reduced average and maximum travel times by 13.62–16.02 % and fuel consumption by 3.38–6.98 % at medium or high Market Penetration Rates (MPRs). Hardware-in-the-loop (HIL) and Vehicle-in-the-loop (VehIL) experiments were conducted to validate the effectiveness of the model.
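The stage-by-stage reasoning over an interaction graph can be illustrated with a minimal sketch. It is not the paper's implementation: the class `InteractionGraph`, the function `reason_sequence`, the vehicle names, and the arrival-time values are all hypothetical, and a simple hand-written scoring function stands in for the learned GNN-based policy. The sketch only shows how uncertain (bi-directional) edges could become directed passing-order edges as one vehicle is placed in the sequence per stage.

```python
from dataclasses import dataclass, field


@dataclass
class InteractionGraph:
    """Hypothetical container for a situation at the intersection."""
    vehicles: list
    # Uncertain (bi-directional) edges: the two vehicles conflict,
    # but their passing order has not been decided yet.
    uncertain: set = field(default_factory=set)
    # Resolved directed edges (first, second): `first` passes before `second`.
    resolved: list = field(default_factory=list)

    def add_conflict(self, a, b):
        self.uncertain.add(frozenset((a, b)))

    def commit(self, vehicle, remaining):
        """Orient every uncertain edge between `vehicle` and the vehicles still waiting."""
        for other in remaining:
            pair = frozenset((vehicle, other))
            if pair in self.uncertain:
                self.uncertain.remove(pair)
                self.resolved.append((vehicle, other))


def reason_sequence(graph, score):
    """Pick one vehicle per stage until the whole passing sequence is fixed.

    `score` is a stand-in for the learned policy (an assumption of this sketch):
    it maps (vehicle, graph, sequence_so_far) to a number, higher meaning "go next".
    """
    remaining = list(graph.vehicles)
    sequence = []
    while remaining:
        chosen = max(remaining, key=lambda v: score(v, graph, sequence))
        remaining.remove(chosen)
        graph.commit(chosen, remaining)
        sequence.append(chosen)
    return sequence


# Illustrative scenario with made-up estimated arrival times (seconds).
eta = {"CAV_1": 3.0, "MV_1": 2.0, "CAV_2": 4.5}
graph = InteractionGraph(vehicles=list(eta))
graph.add_conflict("CAV_1", "MV_1")
graph.add_conflict("MV_1", "CAV_2")

# Placeholder policy: the vehicle with the earliest arrival goes first.
sequence = reason_sequence(graph, lambda v, g, seq: -eta[v])
print(sequence)
print(graph.resolved)
```

In this toy setting the placeholder score depends only on arrival time; in the method described above, the choice at each stage would instead come from a policy that reads the graph itself.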