Predicting incident duration using random forests

Khaled Hamad, Rami Al-Ruzouq, Waleed Zeiada, Saleh Abu Dabous, Mohamad Ali Khalil

doi:10.1080/23249935.2020.1733132

Abstract

This paper presents the development of a new model for predicting traffic incident duration using random forests (RFs), a data-driven machine learning technique. Using an extensive dataset of over 140,000 incident records and 52 variables, the developed models were optimized by fine-tuning their parameters. The best-performing RF model achieved a mean absolute error (MAE) of 36.652 min, which is acceptable given the wide range of incident durations considered (1–1,440 min). Another set of models was developed for a shorter range of incident durations, 5 to 120 min. The best short-range models performed significantly better: the MAE decreased to 14.979 min, about 41% of the full-range value (a reduction of roughly 59%). In comparison, artificial neural network (ANN) models developed using the same dataset slightly outperformed their RF counterparts (by only 0.24%); nevertheless, the RF models gave more stable results with a smaller error range. Further analysis confirmed that the number of variables used could be reduced substantially at the cost of only a slight loss in prediction accuracy.
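
For readers unfamiliar with the metric, the following is a minimal sketch of how an MAE is computed and how the two reported MAE values compare. The function name `mean_absolute_error` and the toy duration lists are illustrative assumptions, not the authors' code or data; only the two MAE figures (36.652 and 14.979 min) come from the abstract.

```python
from statistics import fmean


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over paired observations."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return fmean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical durations in minutes, for illustration only
# (NOT taken from the paper's dataset).
actual_min = [30.0, 45.0, 120.0, 15.0]
predicted_min = [40.0, 35.0, 90.0, 20.0]
toy_mae = mean_absolute_error(actual_min, predicted_min)

# MAE values reported in the abstract.
mae_full_range = 36.652   # durations of 1 to 1,440 min
mae_short_range = 14.979  # durations of 5 to 120 min

# Short-range MAE as a fraction of the full-range MAE,
# and the corresponding relative reduction.
ratio = mae_short_range / mae_full_range
relative_reduction = 1.0 - ratio

print(f"toy MAE: {toy_mae:.2f} min")
print(f"short/full MAE ratio: {ratio:.3f}")
print(f"relative reduction: {relative_reduction:.3f}")
```

Under these figures the short-range MAE is about 0.41 of the full-range MAE. Note that the two models are evaluated over different duration ranges, so the comparison reflects the narrower target range as well as any change in model quality.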

Full Text