Abstract

Solving the Bus Scheduling Problem (BSP) is vital to reducing operational costs and ensuring service quality. Existing approaches typically generate a bus scheduling scheme offline and then dispatch vehicles according to that scheme. In practice, uncertain events such as traffic congestion occur frequently and may render the originally planned scheme infeasible. This study proposes a Reinforcement Learning-based Bus Scheduling Approach (RL-BSA) for online bus scheduling. In RL-BSA, each departure time in a bus timetable is treated as a decision point, at which an agent selects a vehicle to depart. The BSP is modeled as a Markov Decision Process (MDP) for the first time in the literature. The state features comprise real-time vehicle information: remaining working time, remaining driving time, rest time, number of executed trips, and vehicle type. A reward function combining a final reward and a step-wise reward is devised, and an invalid action masking approach prevents the agent from selecting vehicles that violate constraints. The agent is trained by interacting with a simulation environment, after which it can schedule vehicles online. Experiments on real-world BSP instances show that RL-BSA significantly reduces the number of vehicles used compared with manual scheduling and Adaptive Large Neighborhood Search (ALNS). Under uncertainty, RL-BSA covers all departure times in the timetable without increasing the number of vehicles used.
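To make the MDP formulation concrete, the sketch below illustrates the per-vehicle state features, the invalid action mask, and the two-part reward described in the abstract. It is a minimal illustration under assumed details: the class and function names (`Vehicle`, `action_mask`, `step_reward`, `final_reward`), the feasibility rules, and all numeric weights are hypothetical placeholders, not the paper's actual implementation.

```python
import numpy as np

class Vehicle:
    """Tracks the real-time information used as per-vehicle state features."""
    def __init__(self, vehicle_type: int):
        self.remaining_working_time = 480.0  # minutes left in the shift (assumed limit)
        self.remaining_driving_time = 240.0  # continuous-driving limit (assumed)
        self.rest_time = 0.0                 # minutes rested since last trip
        self.executed_trips = 0
        self.vehicle_type = vehicle_type     # e.g. 0 = standard bus, 1 = large bus

    def features(self) -> np.ndarray:
        # The five state features listed in the abstract, for one vehicle.
        return np.array([
            self.remaining_working_time,
            self.remaining_driving_time,
            self.rest_time,
            self.executed_trips,
            self.vehicle_type,
        ], dtype=np.float32)


def action_mask(vehicles: list, trip_duration: float, min_rest: float) -> np.ndarray:
    """Invalid action masking: vehicles that cannot legally serve the next
    departure are masked out so the agent never selects them."""
    mask = np.zeros(len(vehicles), dtype=bool)
    for i, v in enumerate(vehicles):
        mask[i] = (
            v.remaining_working_time >= trip_duration
            and v.remaining_driving_time >= trip_duration
            and v.rest_time >= min_rest
        )
    return mask


def step_reward(chose_in_service_vehicle: bool) -> float:
    # Step-wise reward (illustrative): penalize dispatching a fresh vehicle,
    # encouraging reuse of vehicles already in service.
    return 0.0 if chose_in_service_vehicle else -1.0


def final_reward(num_vehicles_used: int, num_covered: int, num_departures: int) -> float:
    # Final reward at episode end (illustrative weights): full timetable
    # coverage is rewarded, and each vehicle used incurs a cost.
    coverage_bonus = 10.0 if num_covered == num_departures else -10.0
    return coverage_bonus - 0.5 * num_vehicles_used
```

In a typical policy-gradient setup, such a mask would be applied by setting the logits of infeasible actions to a large negative value before the softmax, so invalid vehicles receive zero selection probability during both training and online scheduling.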
