Multi-Agent Reinforcement Learning Based on Representational Communication for Large-Scale Traffic Signal Control

Rohit Bokade, Xiaoning Jin, Christopher Amato

doi:10.1109/access.2023.3275883

Open Access

https://doi.org/10.1109/access.2023.3275883

Journal: IEEE Access
Publication Date: Jan 1, 2023
Citations: 6
License type: CC BY 4.0

Affiliation: Northeastern University, Boston University

Abstract

Traffic signal control (TSC) is a challenging problem within intelligent transportation systems and has been tackled using multi-agent reinforcement learning (MARL). While centralized approaches are often infeasible for large-scale TSC problems, decentralized approaches provide scalability but introduce new challenges, such as partial observability. Communication plays a critical role in decentralized MARL, as agents must learn to exchange information using messages to better understand the system and achieve effective coordination. Deep MARL has been used to enable inter-agent communication by learning communication protocols in a differentiable manner. However, many deep MARL communication frameworks proposed for TSC allow agents to communicate with all other agents at all times, which can add to the existing noise in the system and degrade overall performance. In this study, we propose a communication-based MARL framework for large-scale TSC. Our framework allows each agent to learn a communication policy that dictates “which” part of the message is sent “to whom”. In essence, our framework enables agents to selectively choose the recipients of their messages and exchange variable-length messages with them. This results in a decentralized and flexible communication mechanism in which agents can effectively use the communication channel only when necessary. We designed two networks, a synthetic 4 × 4 grid network and a real-world network based on the Pasubio neighborhood in Bologna. Our framework achieved the lowest network congestion compared to related methods, with agents utilizing ~47-65% of the communication channel. Ablation studies further demonstrated the effectiveness of the communication policies learned within our framework.
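
The idea of a policy that decides “which” part of a message goes “to whom” can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gated_messages`, the per-recipient, per-dimension gate logits, the 0.5 threshold, and the definition of channel utilization as the fraction of open (sender, recipient, dimension) slots are all assumptions made here for illustration.

```python
import numpy as np


def gated_messages(messages, gate_logits, threshold=0.5):
    """Mask each agent's message per recipient and per message dimension.

    Hypothetical illustration only, not the paper's method.
    messages:    array of shape (n_agents, msg_dim)
    gate_logits: array of shape (n_agents, n_agents, msg_dim), indexed as
                 (sender, recipient, dimension)
    Returns the masked messages, shape (n_agents, n_agents, msg_dim), and
    the fraction of sender-to-recipient slots that are actually used.
    """
    n_agents, msg_dim = messages.shape
    # Sigmoid turns logits into per-slot send probabilities.
    probs = 1.0 / (1.0 + np.exp(-gate_logits))
    mask = probs > threshold
    # An agent never sends a message to itself.
    idx = np.arange(n_agents)
    mask[idx, idx, :] = False
    # Broadcast each sender's message across recipients, then gate it.
    sent = messages[:, None, :] * mask
    # Assumed definition: used slots divided by all off-diagonal slots.
    total_slots = n_agents * (n_agents - 1) * msg_dim
    utilization = mask.sum() / total_slots
    return sent, utilization
```

In a learned system the gate logits would come from each agent's communication policy rather than being supplied by hand; under this assumed definition, a utilization of about 0.5 would mean roughly half of the available sender-to-recipient message slots are in use.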

Full Text