Abstract

Recent research on intelligent traffic signal control (TSC) has focused mainly on leveraging deep reinforcement learning (DRL) due to its proven capability and performance. DRL-based TSC frameworks fall into either discrete or continuous control. In discrete control, the DRL agent selects the appropriate traffic light phase from a finite set of phases, whereas in the continuous control approach, the agent decides the appropriate duration of each signal phase within a predetermined sequence of phases. Among the existing works, no prior approach proposes a flexible framework combining both discrete and continuous DRL approaches to traffic signal control. Our objective in this paper is therefore to propose an approach capable of simultaneously deciding the proper phase and its associated duration. Our contribution resides in adapting a hybrid deep reinforcement learning method that handles discrete and continuous decisions at the same time. Specifically, we customize a Parameterized Deep Q-Network (P-DQN) architecture that enables a hierarchical decision-making process: it first decides the traffic light's next phase and then specifies the associated timing. Evaluation results using Simulation of Urban MObility (SUMO) show that our approach outperforms the benchmarks. The proposed framework reduces the average vehicle queue length and the average travel time by 22.20% and 5.78%, respectively, compared with alternative DRL-based TSC systems.
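For concreteness, the following is a minimal sketch of P-DQN-style hybrid action selection, assuming a PyTorch actor-critic pair; the network sizes, the duration bounds min_dur/max_dur, and all identifiers are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch of P-DQN-style hybrid action selection (PyTorch).
# All names (ParamNet, QNet, state_dim, n_phases, min_dur, max_dur) are
# illustrative assumptions, not the paper's actual implementation.
import torch
import torch.nn as nn

class ParamNet(nn.Module):
    """Actor: maps a state to one duration parameter per phase, in (0, 1)."""
    def __init__(self, state_dim, n_phases):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_phases), nn.Sigmoid())

    def forward(self, state):
        return self.net(state)

class QNet(nn.Module):
    """Critic: Q(s, x) -> one Q-value per discrete phase, where x stacks
    the continuous duration parameters of all phases."""
    def __init__(self, state_dim, n_phases):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_phases, 64), nn.ReLU(),
            nn.Linear(64, n_phases))

    def forward(self, state, params):
        return self.net(torch.cat([state, params], dim=-1))

def select_action(state, actor, critic, min_dur=5.0, max_dur=60.0):
    """Hierarchical decision: pick the phase with the highest Q-value,
    then read off the duration the actor proposed for that phase."""
    with torch.no_grad():
        params = actor(state)             # duration parameter for every phase
        q_values = critic(state, params)  # Q(s, x_k) for each phase k
        phase = int(q_values.argmax())
        duration = min_dur + (max_dur - min_dur) * float(params[phase])
    return phase, duration
```

For example, with state_dim=8 and n_phases=4, calling select_action(torch.zeros(8), ParamNet(8, 4), QNet(8, 4)) returns a (phase, duration) pair; the actor proposes a continuous parameter for every discrete action, and the Q-network ranks the resulting pairs, so the phase choice and its timing come out of a single forward pass.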

Highlights

  • Traffic congestion is one of the biggest issues in most of today’s cities, causing significant delays and subsequent economic losses [1].

  • The problem being tackled is usually formulated as a Markov Decision Process (MDP) [28], characterized by the tuple ⟨S, P, A, R, γ⟩ (a toy illustration of the discount factor follows this list).

  • The resulting smoothed training curves of the proposed framework are illustrated in Figure 6. It can be noticed from the learning curves that the training suffers from what is known as the “cold start” [34] problem at early stages, due to the exploration of an unfamiliar environment in which the agent applies random actions.
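
To make the role of the discount factor γ in that tuple concrete, here is a toy computation of the discounted return over a short reward sequence; the reward values and γ = 0.95 are arbitrary examples, not the paper's settings.

```python
# Toy illustration of the MDP tuple <S, P, A, R, gamma>: the discounted
# return of a reward sequence. The rewards and gamma are arbitrary examples.
def discounted_return(rewards, gamma=0.95):
    """G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ..."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# e.g. negative queue lengths as rewards over four control steps
print(discounted_return([-12.0, -9.0, -7.0, -4.0]))  # ≈ -30.297
```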

Summary

Introduction

Traffic congestion is one of the biggest issues in most of today’s cities, causing significant delays and subsequent economic losses [1]. To tackle this issue, several research efforts in the transportation field have attempted to develop intelligent transportation systems (ITS) aimed at overcoming traffic congestion and improving traffic flow. In DRL-based traffic light controllers, the objective of the DRL agent is to decide the optimal action, i.e., the one that improves the TSC’s performance. The DRL agent may select any phase from a finite set of phases without being limited to a predefined sequence of phases [8]. At time step t, the agent observes the environment state st ∈ S and selects an action at ∈ A according to its policy π.
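As a concrete sketch of this observe-act loop, the snippet below drives a single SUMO intersection through the TraCI API; the config file name, traffic-light and lane IDs, and the choose_action stub are placeholders we introduce for illustration (a real run also requires SUMO to be installed), and the queue-based state and reward are simplified stand-ins for the paper's designs.

```python
# Sketch of the DRL observe-act loop against SUMO via TraCI.
# "cross.sumocfg", "tls0", the lane IDs, and choose_action() are
# placeholders; the paper's actual state and reward designs differ.
import random
import traci

LANES = ["n_in_0", "s_in_0", "e_in_0", "w_in_0"]  # hypothetical incoming-lane IDs

def observe():
    """State s_t: number of halted vehicles on each incoming lane."""
    return [traci.lane.getLastStepHaltingNumber(lane) for lane in LANES]

def choose_action(state, n_phases=4):
    """Stand-in for the policy pi: replace with the trained DRL agent."""
    return random.randrange(n_phases), 15.0   # (phase index, duration in s)

traci.start(["sumo", "-c", "cross.sumocfg"])  # placeholder config file
for step in range(1000):
    state = observe()                         # observe s_t
    phase, duration = choose_action(state)    # a_t ~ pi(s_t)
    traci.trafficlight.setPhase("tls0", phase)
    traci.trafficlight.setPhaseDuration("tls0", duration)
    traci.simulationStep()
    reward = -sum(observe())                  # e.g. negative total queue length
traci.close()
```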
