Abstract

Analyzing the relationship between track geometry defect occurrence and substructure condition can support track inspection and spot maintenance, which in turn contributes to better train operation quality. This paper develops a data-driven approach to estimate the occurrence of track geometry defects on concrete-tie track on one passenger railroad in the United States, using substructure data, rail seat abrasion data, infrastructure data, traffic data, track class information, and maintenance data. Feature extraction was used to generate input variables for the machine learning models. Recursive feature elimination (RFE) was applied to reduce data dimensionality by recursively considering smaller sets of features. Three data treatments (no resampling, undersampling, and oversampling) were used to address class imbalance. The models developed were logistic regression, an artificial neural network, and gradient boosting, with hyperparameters optimized using Bayesian optimization. Model performance was evaluated on a test dataset generated by random data partitioning. For the data collected from this railroad, gradient boosting with oversampling performed best in estimating the occurrence of geometry defects, with an F1-score of 0.662 and a G-Mean of 0.738. Feature importance analysis identifies surfacing, traffic, curvature, switch, and rail replacement as the top five factors influencing the predicted probability of track geometry defect occurrence. The proposed model can be used to prioritize maintenance at locations prone to track geometry defects and thus improve infrastructure safety under budgetary constraints.
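
For readers unfamiliar with the two reported metrics and with oversampling, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard binary definitions (F1 as the harmonic mean of precision and recall, G-Mean as the square root of sensitivity times specificity) and assumes simple random duplication of minority-class rows; the abstract does not state which oversampling variant was used. The function names `f1_and_gmean` and `random_oversample` are hypothetical.

```python
import math
import random


def f1_and_gmean(tp, fp, fn, tn):
    """Return (F1, G-Mean) from the four cells of a binary confusion matrix.

    Assumed standard definitions: F1 = 2PR / (P + R) and
    G-Mean = sqrt(sensitivity * specificity).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    gmean = math.sqrt(recall * specificity)
    return f1, gmean


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until both classes match in size.

    Assumption: labels are 0 (no defect) or 1 (defect). This is plain random
    oversampling, chosen here only for illustration.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        # Nothing to duplicate when one class is absent; return copies as-is.
        return list(rows), list(labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]
```

Under these assumptions, resampling would be applied to the training partition only, so that the test dataset keeps the original class ratio when F1 and G-Mean are computed. G-Mean is useful alongside F1 on imbalanced data because it drops toward zero if either class is poorly recognized.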
