Abstract

Crash severity prediction helps safety planners take precautionary measures and helps first aiders stay prepared to assist the injured. Existing literature in the field focuses mostly on generating the attributes used to predict severity. In reality, however, not all features are discriminating, and certain classes are challenging to detect even when the entire feature set is used. Although several techniques have been developed in the Machine Learning (ML) literature to tackle these problems, their application to crash severity prediction, and an optimal strategy for combining features or classifiers to achieve high accuracy, remain less studied. To address these problems, this work first compares widely used classifiers for predicting crash severity and then proposes a novel classification framework, OvR consensus learning (OvRCL), which combines class-wise majority voting with the One-vs-Rest (OvR) approach. The proposed method uses a feature selection technique, mutual information (MI), to obtain the feature set most relevant to the output class (i.e., severity). Moreover, to differentiate each class in the multi-class data, OvRCL iteratively runs ML algorithms as binary classifiers in an ensemble framework to improve classification performance. In our experiments, we use four classifiers, namely k-Nearest Neighbors (k-NN), Support Vector Machines (SVM), Random Forest (RF), and a Bagging classifier, to reach the consensus. The analysis uses a real crash dataset obtained from an open data source of Leeds City Council. Four years of crash data (2015–2018) are used for training, and OvRCL is tested on the 2019 data. To further validate the performance of OvRCL, this study also uses two more datasets with high class imbalance. Our experiments show that, compared with conventional ML algorithms, OvRCL is an effective method for forecasting crash severity levels on the data under test.
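
The abstract does not spell out how the class-wise votes are combined, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each of the four base classifiers has already been trained as a binary "this class vs. rest" model for every severity class and has produced a 0/1 decision for a single crash record. The function name `ovr_consensus`, the severity labels, the strict-majority threshold, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence


def ovr_consensus(votes: Dict[str, Sequence[int]]) -> str:
    """Combine One-vs-Rest binary decisions by class-wise majority voting.

    `votes` maps each class label to the 0/1 decisions of the base binary
    classifiers for that class (1 means "the record belongs to this class").
    This rule is an assumption for illustration, not the paper's exact method.
    """
    if not votes:
        raise ValueError("votes must contain at least one class")

    # Number of base classifiers that voted "yes" for each class.
    positives = {label: sum(decisions) for label, decisions in votes.items()}

    # Classes accepted by a strict majority of their own binary classifiers.
    accepted: List[str] = [
        label
        for label, decisions in votes.items()
        if 2 * positives[label] > len(decisions)
    ]

    # If no class reaches a majority, fall back to all classes.
    candidates = accepted or list(votes)

    # Pick the candidate with the most "yes" votes; ties go to the class
    # that appears first in `votes` (an arbitrary, hypothetical tie-break).
    return max(candidates, key=lambda label: positives[label])


# Hypothetical decisions from k-NN, SVM, RF, and Bagging (in that order)
# for one crash record; the severity labels are illustrative only.
example_votes = {
    "slight": [1, 1, 0, 1],
    "serious": [1, 0, 0, 1],
    "fatal": [0, 0, 0, 0],
}
predicted = ovr_consensus(example_votes)
```

In this sketch only "slight" is accepted by a majority of its binary classifiers, so it becomes the consensus label. A real pipeline would first rank features by mutual information with the severity label and train the binary models on the selected subset; that step is omitted here.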
