Abstract

Shunting yards are one of the main areas affecting the reliability of rail freight networks, and delayed departures from shunting yards can also affect the punctuality of mixed-traffic networks. Methods for automatically detecting departures that are likely to be delayed can therefore help increase the reliability and punctuality of both freight and passenger services. In this paper, we compare the performance of tree-based methods (decision trees and random forests), which have been highly successful in a wide range of generic applications, in classifying the status (delayed, early, or on-time) of trains departing from shunting yards, focusing on delayed departures as the minority class. We use a total of 6,243 train connections (representing over 21,000 individual wagon connections) over a one-month period from the Hallsberg yard in Sweden, the largest shunting yard in Scandinavia. On this dataset, our results show a slight difference between decision trees and random forests in detecting delayed departures as the minority class. To improve the detection and assignment of delayed departures, we apply enhanced sampling of the minority class using the synthetic minority oversampling technique (SMOTE). Applying SMOTE improved the sensitivity, precision, and F-measure for delayed departures by 20% for decision trees and by 30% for random forests. Overall, random forests show relatively better performance in detecting all three departure classes, both before and after applying SMOTE. Although the preliminary results presented in this paper are encouraging, future studies are needed to investigate the computational performance of tree-based algorithms using larger datasets and additional predictors.
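To make the two ideas named in the abstract concrete, the following is a minimal sketch of how SMOTE generates synthetic minority samples and of how sensitivity, precision, and F-measure are computed for a single class. It is an illustration only, not the implementation used in the paper: the function names `smote` and `class_metrics`, the parameter `k`, and the toy feature vectors are all assumptions introduced here, and the data are made up.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def class_metrics(y_true, y_pred, label):
    """Sensitivity (recall), precision and F-measure for one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return sensitivity, precision, f_measure


# Hypothetical feature vectors for the "delayed" class (not data from the paper).
delayed = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
extra = smote(delayed, n_new=6, k=3)

# Hypothetical true and predicted labels for four departures.
y_true = ["delayed", "on-time", "delayed", "early"]
y_pred = ["delayed", "delayed", "on-time", "early"]
sens, prec, f1 = class_metrics(y_true, y_pred, "delayed")
```

Each synthetic point lies on the line segment between two real minority samples, so oversampling adds no values outside the range already observed for the delayed class. In practice, only the training split would be oversampled, so that the metrics are evaluated on untouched departures.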


Summary

Research Article

The Application of Tree-Based Algorithms on Classifying Shunting Yard Departure Status. Since the scope of this paper is departure delay prediction models, a brief overview of the most relevant works is presented below. Previous research in this area has mainly focused on estimating the arrival times of passenger trains using data-driven approaches. Oneto et al. [17,18,19,20] provided an extensive study of big data analytics implemented in a train delay prediction system for large-scale railway networks, using data from Italian railways. In their papers, they proposed applying shallow and deep extreme learning machines to train delay prediction. In Sweden, delayed departures from shunting yards were one of the five main causes of delays due to operator error [26].
