A novel one-vs-rest consensus learning method for crash severity prediction

Syed Fawad Hussain, Muhammad Mansoor Ashraf

doi:10.1016/j.eswa.2023.120443

Abstract

Research in crash severity prediction is necessary to allow safety planners to take precautionary measures and to enable first aiders to remain prepared to assist the injured. Existing literature in the field is mostly focused on generating the attributes for predicting severity. In reality, however, not all features are discriminating, and certain classes are challenging to detect even when the entire feature set is employed. Although several techniques have been developed in the Machine Learning (ML) literature to tackle these problems, their application to crash severity prediction, and an optimal strategy for combining features or classifiers to achieve high accuracy, remain less studied. To address these problems, this work first provides a comparison of widely used classifiers for predicting crash severity; second, by combining class-wise majority voting with the One-vs-Rest (OvR) approach, it proposes a novel classification framework named OvR consensus learning (OvRCL). The proposed method uses a feature selection technique, mutual information (MI), to acquire the feature set most relevant to the output class (i.e., severity). Moreover, to differentiate each class in the multi-class data, OvRCL iteratively runs ML algorithms as binary classifiers in an ensemble framework to significantly improve classification performance. In our experiments, we use four classifiers, namely k-Nearest Neighbors (k-NN), Support Vector Machines (SVM), Random Forest (RF), and the Bagging classifier, to reach the consensus. The analysis was done using a real crash dataset obtained from an open data source of Leeds City Council. Four years of crash data (2015–2018) are used for training, and OvRCL is tested on the 2019 data. Moreover, to validate the performance of OvRCL, this study also utilizes two more datasets with high class imbalance. Compared with conventional ML algorithms, our experiments show that OvRCL is a potent method for forecasting crash severity levels on the data under test.
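
To make the two building blocks named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) mutual information between a discrete feature and the class label, which could be used to rank features, and (b) one possible way to turn binary one-vs-rest votes into a single label by class-wise vote counting. The function names, the severity labels in the usage lines, and the tie-breaking rule (earlier class wins) are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from math import log


def mutual_information(feature, labels):
    """Mutual information (in nats) between two discrete sequences.

    A higher value means the feature tells us more about the label.
    """
    n = len(labels)
    joint = Counter(zip(feature, labels))  # counts of (x, y) pairs
    px = Counter(feature)                  # marginal counts of x
    py = Counter(labels)                   # marginal counts of y
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def ovr_consensus(votes, classes):
    """Pick one label from binary one-vs-rest votes.

    votes[c] is a list of 0/1 outputs, one per base classifier, for the
    binary problem "class c vs. rest". The class with the most positive
    votes wins. Ties go to the class listed first in `classes`; this
    tie-break is an assumption of this sketch.
    """
    counts = {c: sum(votes[c]) for c in classes}
    # max() returns the first maximal element, which gives the tie-break.
    return max(classes, key=lambda c: counts[c])


# Usage with hypothetical severity labels and four base classifiers.
classes = ["slight", "serious", "fatal"]
votes = {
    "slight": [1, 1, 0, 1],   # three of four classifiers say "slight"
    "serious": [0, 1, 0, 0],
    "fatal": [0, 0, 0, 0],
}
predicted = ovr_consensus(votes, classes)
```

In this sketch each entry of `votes` would come from one of the four base classifiers trained on a single "class vs. rest" problem, after the feature set has been reduced using the mutual information scores.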

Full Text