This study addresses the multifaceted challenge of maintaining regular bus service, minimizing bus bunching, and facilitating synchronized bus connections across routes. An enhanced multi-agent reinforcement learning algorithm, the Multi-Agent Deep Deterministic Policy Gradient (MADDPG), is proposed to implement real-time control strategies that address these issues simultaneously. The merit of the modified MADDPG algorithm lies in its ability to learn continuously while adeptly handling the non-stationary nature of bus network operations. A case study of a bus corridor is used to train and test the algorithm, and four scenarios with varying degrees of travel time and dwell time variability are designed to assess the algorithm’s robustness. Results indicate that the MADDPG algorithm can double or triple the likelihood of synchronized bus transfers across multiple routes while maintaining service reliability on each route. Moreover, the flexibility of the MADDPG algorithm in training bus-control policies allows it to adapt effectively to variations of up to 90% in bus travel times and demand, even amid disruptive events in real-world scenarios.
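To illustrate the structure behind the abstract's claim about non-stationarity, the following is a minimal sketch of MADDPG's centralized-training, decentralized-execution pattern, not the paper's implementation: each agent (bus) acts from its own local observation, while a critic used only during training sees the joint observations and actions of all agents. All dimensions, agent counts, and linear function approximators here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_BUSES = 4   # hypothetical number of agents (buses) under control
OBS_DIM = 3   # per-bus local observation, e.g. headway deviations (assumed)
ACT_DIM = 1   # continuous control action, e.g. holding time at a stop

# Decentralized actors: each bus maps only its LOCAL observation to an action.
actors = [rng.normal(scale=0.1, size=(OBS_DIM, ACT_DIM)) for _ in range(N_BUSES)]

# Centralized critic: during training it conditions on ALL observations and
# actions, which is what lets MADDPG cope with the non-stationarity that
# other learning agents introduce into the environment.
critic = rng.normal(scale=0.1, size=(N_BUSES * (OBS_DIM + ACT_DIM), 1))

def act(obs):
    """Execution is decentralized: agent i uses only obs[i]."""
    return [np.tanh(obs[i] @ actors[i]) for i in range(N_BUSES)]

def critic_value(obs, actions):
    """Training is centralized: Q is a function of the joint obs-action vector."""
    joint = np.concatenate(
        [np.concatenate([obs[i], actions[i]]) for i in range(N_BUSES)]
    )
    return float(joint @ critic)

obs = [rng.normal(size=OBS_DIM) for _ in range(N_BUSES)]
actions = act(obs)
q = critic_value(obs, actions)
```

At deployment only the per-agent actors are needed, so each bus can select its holding action in real time from local information alone.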