Abstract

Model-free reinforcement learning frees the traffic signal control problem from complex traffic modeling and can learn a reasonable traffic light control policy from virtual simulation. However, intrinsic characteristics of the traffic flow may help in learning a more suitable traffic signal control policy. Hence, in this paper, we investigate the performance of training a reinforcement learning agent under different traffic conditions and propose a framework that combines prior traffic knowledge with deep reinforcement learning. The proposed network structure, named the Mixed Q-network (MQN), contains a simple Softmax classification branch and a Q-value network branch, and is trained by Q-learning with a memory palace that maintains separate replay buffers for different traffic classifications. The comparative deep Q-network (DQN) is trained by Q-learning with experience replay. In the experiments, both the DQN and MQN methods learn reasonable traffic signal timings and achieve lower average delay, shorter queue length, and less waiting time than fixed-time control. Moreover, the MQN can classify the traffic environment and select the corresponding reinforcement learning controller at the same time.
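The memory palace described above, as far as the abstract specifies it, amounts to keeping one replay buffer per traffic classification and sampling minibatches only from the buffer matching the current class. A minimal sketch of that idea, with a hypothetical `MemoryPalace` class (the names, capacity, and transition format are assumptions, not the paper's actual implementation):

```python
import random
from collections import deque


class MemoryPalace:
    """Hypothetical sketch: one replay buffer per traffic classification,
    so experience from different traffic conditions is kept separate."""

    def __init__(self, num_classes, capacity=10000):
        # One bounded FIFO buffer per traffic class.
        self.buffers = [deque(maxlen=capacity) for _ in range(num_classes)]

    def store(self, traffic_class, transition):
        # transition: (state, action, reward, next_state, done)
        self.buffers[traffic_class].append(transition)

    def sample(self, traffic_class, batch_size):
        # Draw a minibatch only from the buffer matching the
        # classifier's predicted traffic condition.
        buf = self.buffers[traffic_class]
        return random.sample(buf, min(batch_size, len(buf)))
```

During training, the Softmax branch would pick the traffic class for the current episode, and the Q-value branch would be updated from `sample(traffic_class, batch_size)` rather than from a single shared buffer as in standard experience replay.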
