Abstract

One of the main challenges in high-speed railway (HSR) operation is predicting how quickly a delayed train can recover. Accurate prediction of delay recovery at the downstream stations of an HSR line can help train dispatchers adjust timetables and inform passengers of the expected delay, improving service reliability and passenger satisfaction. In this paper, we present data-driven delay recovery prediction models developed from train operation records of the Centralized Traffic Control (CTC) system of the Wuhan-Guangzhou (W-G) HSR in the Guangzhou Railway Bureau. We first identified the main variables that contribute to delay, including total dwell (TD) time, running buffer (RB) time, magnitude of primary delay (PD), and the influence of individual sections. Two alternative models, multiple linear regression (MLR) and random forest regression (RFR), are calibrated and evaluated. The validation results on test datasets indicate that both models perform well, with the RFR model outperforming the MLR model in prediction accuracy. Specifically, when the prediction tolerance is less than 3 minutes, the RFR model achieves a prediction accuracy of up to 90.9%, compared with 84.4% for the MLR model.
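The tolerance-based accuracy reported above can be illustrated with a minimal sketch. It assumes the metric is the share of predictions whose absolute error falls below the tolerance; the abstract does not define it, so this reading is an assumption. The function name `accuracy_within_tolerance` and all sample values are hypothetical and are not taken from the W-G HSR data.

```python
def accuracy_within_tolerance(actual, predicted, tolerance_min=3.0):
    """Return the share of predictions with absolute error below tolerance_min.

    Assumed definition (not stated in the abstract): a prediction counts as
    correct when |actual - predicted| < tolerance_min, all values in minutes.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    hits = sum(
        1 for a, p in zip(actual, predicted) if abs(a - p) < tolerance_min
    )
    return hits / len(actual)


# Made-up delay recovery times in minutes, for illustration only.
observed = [4.0, 2.0, 7.5, 0.0, 5.0]
model_a = [3.0, 2.5, 3.0, 1.0, 9.0]   # hypothetical predictions from one model
model_b = [4.5, 1.0, 6.0, 0.5, 7.0]   # hypothetical predictions from another

acc_a = accuracy_within_tolerance(observed, model_a)
acc_b = accuracy_within_tolerance(observed, model_b)
print(f"model A: {acc_a:.1%}, model B: {acc_b:.1%}")
```

Under this definition, two models can be compared at the same tolerance simply by evaluating the function on their predictions for a shared test set.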
