An Accident Prediction Model Based on ARIMA in Kuala Lumpur, Malaysia, Using Time Series of Actual Accidents and Related Data

Boon Chong Choo, Mohd Zahirasri Mohd Tohir, Syafiie Syam, Musab Abdul Razak, Dayang Radiah Awang Biak

doi:10.47836/pjst.32.3.07

Abstract

Recently, there has been an emerging trend to analyse time series data and to use sophisticated tools for fitting time series models optimally. To date, Malaysian industrial accident data remain underutilised and lack informative records. This paper therefore investigates the Malaysian accident database and evaluates the optimal forecasting models for accident prediction. The model input was based on data available from the Department of Occupational Safety and Health, Malaysia (DOSH), from 2018 until 2021, with 80% of the dataset used to train the models and the remaining 20% used for validation. The negative binomial and Poisson distribution predictions showed mean absolute percentage errors (MAPE) of 33% and 51%, respectively, indicating that the negative binomial distribution performed better than the Poisson distribution in predicting accident frequency. The four years of time series accident data were checked for stationarity with the Augmented Dickey-Fuller test in RStudio. The ARIMA models were considered after the data showed autocorrelation. The best model, ARIMA(2,0,2)(2,0,0)(12), was selected on the basis of the lowest Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and other error values. The MAPE values for ARIMA in R and for the manual time series were 40% and 49%, respectively. Therefore, accident prediction using RStudio would outperform the manually computed negative binomial and Poisson distribution predictions. Based on the findings, industrial safety practitioners should report accidents to DOSH truthfully in the era of digitalisation, which could enable future data-driven accident predictions to be carried out.
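As an illustration of the evaluation described in the abstract, the following minimal Python sketch shows how a chronological 80/20 split and the MAPE could be computed. It is not the authors' code: the monthly counts, the function names `chronological_split` and `mape`, and the naive mean forecast are all assumptions made for the example, and the study itself reports using R.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series in order: first part for training, rest for validation."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Observations with an actual value of zero are skipped, because the
    percentage error is undefined for them (an assumption of this sketch).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to evaluate")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


# Hypothetical monthly accident counts (not DOSH data).
counts = [12, 15, 9, 14, 11, 13, 10, 16, 12, 14]

train, test = chronological_split(counts)

# Placeholder forecast: the training mean repeated for each validation month.
# A fitted ARIMA or count-distribution model would supply these values instead.
forecast = [sum(train) / len(train)] * len(test)

error = mape(test, forecast)
print(f"train size = {len(train)}, validation size = {len(test)}")
print(f"MAPE = {error:.2f}%")
```

The split is made in time order rather than at random, so the validation period always follows the training period, as a forecasting evaluation requires.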
