Analysis and Prediction of Severity of United States Countrywide Car Accidents Based on Machine Learning Techniques

Lahiru S Boyagoda, Lakshika S Nawarathna

doi:10.1109/icitr57877.2022.9993371

Abstract

The number of vehicles on the road and the volume of road transportation are increasing rapidly, and the frequency of road accidents and crashes is rising with them. Analysing traffic accidents is therefore an essential concern worldwide. Because these accidents cause a considerable number of casualties and fatalities, taking the necessary actions to reduce them is a vital public safety challenge. Various statistical methods and techniques are used to address this issue, with applications such as extracting cause-and-effect relationships and predicting accidents in real time. This study used a United States (US) countrywide car accident data set consisting of about 1.5 million accident records, together with 45 other relevant measurements related to those accidents. The work aims to develop classification models that predict the likelihood that an accident is severe. In addition, the study includes a descriptive analysis to recognise the key features affecting accident severity. Supervised machine learning methods such as Decision tree, K-nearest neighbour, and Random forest were used to create the classification models. The results show that the Random forest model achieves an accuracy of 83.95% on the train set and 80.69% on the test set, indicating that the Random forest model performs better in accurately detecting the most relevant factors describing road accident severity.
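
The evaluation protocol described above (fit a classifier on a train set, then report accuracy on both the train set and a held-out test set) can be illustrated with a minimal sketch. This is not the authors' pipeline: the records are synthetic, the two numeric features are made up, the 80/20 split and k = 5 are assumptions, and only K-nearest neighbour is shown because it fits in a few lines of standard-library Python. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def make_toy_records(n, seed=0):
    """Generate hypothetical (features, label) pairs; label 1 stands for 'severe'."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Two made-up numeric features whose means shift with the label.
        centre = 2.0 if label == 1 else 0.0
        features = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
        records.append((features, label))
    return records


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and return (train, test) lists."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, point, k=5):
    """Predict the majority label among the k training records nearest to point."""
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, evaluation, k=5):
    """Fraction of evaluation records whose label the classifier predicts correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in evaluation)
    return correct / len(evaluation)


records = make_toy_records(500)
train_set, test_set = train_test_split(records)
train_acc = accuracy(train_set, train_set)
test_acc = accuracy(train_set, test_set)
print(f"train accuracy: {train_acc:.2%}, test accuracy: {test_acc:.2%}")
```

Reporting both figures, as the abstract does, makes overfitting visible: a train accuracy far above the test accuracy suggests the model has memorised the training records. On real data of this size, a library implementation of the three classifiers would be the practical choice over this brute-force sketch.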

Full Text