In this paper, we examine the practical problem of minimizing the delay in traffic networks that are controlled at each intersection independently, without a centralized supervisory computer and with limited communication bandwidth. We find that existing learning algorithms have lackluster performance or are too computationally complex to be implemented in the field. Instead, we introduce a simple yet efficient and effective approach using multi-agent reinforcement learning (MARL) that applies the Deep Q-Network (DQN) learning algorithm in a fully decentralized setting. First, we decouple the DQN into per-intersection Q-networks and then transmit the output of each Q-network’s hidden layer to its intersection neighbors. We show that our method is computationally efficient compared with other MARL methods, with minimal additional overhead compared with a naive isolated learning approach with no communication. This property enables our method to be implemented in real-world scenarios with less computation power. Finally, we conduct experiments for both synthetic and real-world scenarios and show that our method achieves better performance in minimizing intersection delay than other methods.
Read full abstract