Abstract

Highway-rail grade crossing (HRGC) crashes remain a major contributor to rail casualties in the United States and have been researched intensively. Data-mining models focus on prediction, whereas the dominant generalized linear models focus on model and data fit. Decision makers and traffic engineers rely on prediction models to examine at-grade crash frequency and make safety improvements. The gradient boosting (GB) model has gained popularity in many research areas. In this study, to fully understand model performance on HRGC accident prediction, the GB model with a functional gradient descent algorithm is selected to analyze crashes at HRGCs and to identify contributing factors. Moreover, contributors’ importance and partial-dependence relations are generated to further understand the relationship between the identified contributors and HRGC crash likelihood, and to overcome the “black box” issues that most machine learning methods face. Furthermore, to fully demonstrate the model’s prediction performance, a comprehensive assessment of prediction power based on six measurements is conducted, and the prediction performance of the GB model is verified and compared with a decision tree model as a reference, owing to their popularity and comparable data availability. The GB model is shown to produce better prediction accuracy and to reveal nonlinear relationships between contributors and crash likelihood. In general, HRGC crash likelihood is significantly affected by several traffic exposure factors, including highway traffic volume, railway traffic volume, and train travel speed.

Highlights

  • Crashes between motor vehicles and trains at highway-rail grade crossings (HRGCs) often have severe consequences [1]

  • Of all crashes at HRGCs in the U.S. from 2000 to 2014, 12% resulted in fatalities [2]

  • As indicated by Lu and Tolliver [5] and Oh et al. [6], HRGC crash data often show underdispersion, where the sample variance is less than the sample mean, and less common generalized linear models (GLMs) are suitable for such datasets
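The underdispersion condition in the last highlight (sample variance below sample mean) can be checked with a simple dispersion index. The following is a minimal sketch, not taken from the study: the function name `dispersion_index` and the crash counts are hypothetical and invented purely for illustration.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance divided by sample mean.

    A value below 1 suggests underdispersion; a value above 1 suggests
    overdispersion relative to a Poisson distribution.
    """
    return variance(counts) / mean(counts)


# Hypothetical crash counts per crossing (illustrative only, not study data).
crash_counts = [0, 1, 1, 1, 0, 1, 2, 1, 1, 0, 1, 1]

index = dispersion_index(crash_counts)
label = "underdispersed" if index < 1 else "not underdispersed"
print(f"mean={mean(crash_counts):.3f}  variance={variance(crash_counts):.3f}  "
      f"index={index:.3f}  ({label})")
```

An index well below 1, as in this made-up sample, is the situation in which a standard Poisson GLM would be a questionable choice.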

Summary

Introduction

Crashes between motor vehicles and trains at highway-rail grade crossings (HRGCs) often have severe consequences [1]. Numerous models have been developed to identify major contributing factors and to explore relationships between crashes and explanatory variables, so that safety performance is better understood and effective countermeasures can be applied to reduce crash rates at HRGCs. Since crash data are random, discrete, and nonnegative, generalized linear models (GLMs) [3] have commonly been selected to investigate the relationship between crashes and contributing factors. To fully demonstrate the model application and its capabilities to analyze safety data, a robust data-mining technique, the gradient boosting (GB) model, is selected to analyze crashes at HRGCs. Unlike GLMs, it requires no predefined underlying relationship between dependent and independent variables. Thus, underdispersed HRGC data are not an issue. To better understand the model's forecasting performance, a comprehensive forecasting accuracy evaluation system comprising six measurements is proposed and evaluated.
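To illustrate how gradient boosting fits a relationship without a predefined functional form, the following is a minimal sketch in plain Python. It is not the study's implementation: it assumes a squared-error loss (for which the negative gradient is simply the residual), a single explanatory variable, one-split regression trees as weak learners, and synthetic data. All names (`fit_stump`, `gradient_boost`, `mse`) and parameter values are hypothetical.

```python
import random


def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    # Candidate thresholds: every distinct x except the largest,
    # so that both sides of the split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boost stumps by repeatedly fitting the negative gradient of the loss."""
    base = sum(ys) / len(ys)          # initial constant model
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Assumed squared-error loss: negative gradient equals the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict


def mse(model, xs, ys):
    """Mean squared error of a model over paired observations."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Synthetic data (illustrative only): a nonlinear response that levels off.
random.seed(0)
xs = [i / 10 for i in range(100)]
ys = [min(x, 6.0) ** 0.5 + random.gauss(0, 0.1) for x in xs]

overall_mean = sum(ys) / len(ys)
baseline_mse = mse(lambda x: overall_mean, xs, ys)
boosted_mse = mse(gradient_boost(xs, ys), xs, ys)
print(f"baseline MSE={baseline_mse:.4f}  boosted MSE={boosted_mse:.4f}")
```

The sketch reports training error only. A real assessment would use held-out data and several accuracy measures, as the study does with its six measurements.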
