Railway operations are regularly affected by incidents such as disturbances and disruptions, which impose temporary operational restrictions on trains in the network. Unlike disturbances and disruptions that arise in real time, some of these incidents are known at short notice, e.g., 24–48 h beforehand, a situation known as Very-Short-Term-Planning in British rail operations. This paper presents a novel reinforcement learning based approach for rescheduling train services in a single-track corridor with bi-directional traffic. As an important subfield of machine learning, reinforcement learning offers an alternative strategy for tackling NP-hard train (re)scheduling problems and shows advantages in balancing computational efficiency and solution quality. We propose a Q-learning approach with a tiered rewarding strategy and a lightweight train representation in state vectors, which enables more efficient learning and knowledge sharing among homogeneous trains. Compared with an existing reinforcement learning approach, our proposed method finds better-quality solutions owing to its unique representation of state vectors and a novel tiered rewarding/punishing mechanism, overcoming certain disadvantages of existing approaches. Knowledge reusability is another advantage of the proposed approach: prior knowledge obtained from training on one instance can significantly enhance performance on another, potentially more challenging, instance on the same corridor, with substantially reduced computational time and algorithm development effort. We also discuss potential applications of the knowledge reusability feature inherent in reinforcement learning algorithms, which we believe will benefit the entire industry in addressing NP-hard problems through data-driven technologies.