Abstract

Classifying vehicular crashes by severity is crucial because crashes differ in their financial and injury costs. In addition, accurate prediction modeling makes it possible to identify the factors that influence crashes and thereby help avoid them. Crash severity analysis therefore requires prediction models that are both accurate and time-efficient. Moreover, statistical models are incapable of identifying the potential severity of crashes with respect to the influencing factors incorporated in the models. Unlike previous research efforts, which applied data mining models to a limited set of crash severity classes (property damage only (PDO), fatality, and injury), the present study sought to predict crash frequency according to five severity levels: PDO, fatality, severe injury, other visible injuries, and complaint of pain. The multinomial logistic regression (MLR) model and data mining approaches, including an artificial neural network-multilayer perceptron (ANN-MLP) and two decision tree techniques (the Chi-square automatic interaction detector (CHAID) and C5.0), were applied to traffic crash records for State Highways in California, USA. A comparison of the relative importance of the ten qualitative and ten quantitative independent variables incorporated in CHAID and C5.0 indicated that the cause of the crash (X1) and the number of vehicles (X5) were the most influential variables. However, the ANN-MLP model identified the cause of the crash (X1) and weather (X2) as the most contributing variables. In addition, the MLR model showed that the driver’s age (X11) accounts for a larger proportion of traffic crash severity. The sensitivity analysis demonstrated that C5.0 had the best performance for predicting road crash severity. Not only did C5.0 take less time (0.05 s) than CHAID, MLP, and MLR, but it also achieved the highest accuracy rate for the training set: its overall prediction accuracy on the training data was approximately 88.09%, compared with 77.21% and 70.21% for the CHAID and MLP models, respectively. In general, the findings of this study revealed that C5.0 can be a promising tool for predicting road crash severity.
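The overall prediction accuracy quoted above is the share of crashes whose predicted severity class matches the recorded one. A minimal sketch of that calculation follows, assuming a five-class confusion matrix; the counts are hypothetical placeholders rather than data from the study, and `overall_accuracy` is an illustrative helper name, not something taken from the paper.

```python
# The five severity classes named in the abstract; the order is arbitrary.
SEVERITY_CLASSES = [
    "PDO",
    "fatality",
    "severe injury",
    "other visible injuries",
    "complaint of pain",
]


def overall_accuracy(confusion):
    """Return the fraction of correctly classified cases.

    confusion[i][j] is the number of crashes whose true class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Correct predictions sit on the main diagonal.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical counts for illustration only (NOT results from the study).
example_confusion = [
    [50, 2, 1, 4, 3],
    [1, 8, 1, 0, 0],
    [2, 1, 12, 3, 2],
    [3, 0, 2, 20, 5],
    [4, 0, 1, 5, 30],
]

accuracy = overall_accuracy(example_confusion)
print(f"Overall accuracy: {accuracy:.2%}")
```

A single overall figure can hide poor performance on rare classes such as fatalities, so per-class rates read from the same matrix are a useful companion.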

Highlights

  • This study mainly aimed to investigate five classes of crash severity (property damage only (PDO), fatality, severe injury, other visible injuries, and complaint of pain) based on Highway Safety Information System (HSIS) data for all state highways in California, USA, from 2012 to 2014

  • The results indicated that the multinomial logistic regression model is appropriate for both non-interstate and interstate crashes involving traffic barriers

  • The present study considers a comprehensive classification of crash severity based on the HSIS, covering PDO, fatality, severe injury, other visible injuries, and complaint of pain

Summary

Introduction

Each year, more than 1.3 million people die and as many as 50 million are injured in road crashes worldwide. According to official statistics from the World Health Organization [1], traffic crashes are projected to be the fifth leading cause of death in the world by 2030. Traffic crashes impose tremendous costs on people and governments worldwide in terms of human casualties, suffering, and economic losses [2,3,4]. According to the HSIS, there were 3898 fatal crashes in California in 2017, an increase of 34.29% since 2012.
