Abstract

Reinforcement Learning (RL) has been recognized as one of the most effective methods for optimizing traffic signal control. However, because their RL elements (i.e., reward and state) are poorly suited to complex traffic dynamics, existing RL-based approaches converge slowly to optimal traffic signal plans. Meanwhile, to simplify traffic modeling, most optimization methods assume that the phase duration of traffic signals is constant, which strongly limits the ability of RL to find traffic signal control policies with shorter average vehicle travel time and better greenwave control. To address these issues, this paper proposes IPDALight, a novel intensity- and phase-duration-aware RL-based method for optimizing traffic signal control. Inspired by the Max Pressure (MP)-based traffic control strategy used in the transportation field, we introduce a new concept named intensity, which ensures that our reward design and state representation accurately reflect the status of vehicles. By taking the coordination of neighboring intersections into account, our approach fine-tunes the phase duration of traffic signals to adapt to dynamic traffic situations. Comprehensive experimental results on both synthetic and real-world traffic scenarios show that, compared with state-of-the-art RL methods, IPDALight not only achieves better average vehicle travel time and greenwave control in various multi-intersection scenarios, but also converges to optimal solutions much faster.

