Abstract

This research proposes a hybrid approach for predicting incident duration that integrates the salient features of factorial design of experiments (DOE) and machine learning (ML). The study compares DOE with another widely used feature selection technique, forward sequential feature selection (FSFS). To confirm the effectiveness and robustness of the proposed approach, multiple ML techniques are employed: linear regression, decision trees, support vector machines (SVM), ensemble trees, Gaussian process regression, and artificial neural networks. The results are validated using data from the Houston TranStar incidents archive, which contains over 90,000 records. The accuracy of the developed predictive models is compared across three approaches: ML with no feature selection, FSFS–ML, and DOE–ML. The significant factors affecting incident duration identified by both DOE and FSFS are the type of vehicles involved, the type of lanes affected, the number of vehicles involved, the number of emergency responses dispatched, the incident severity level, and the day of the week. Among the feature selection and modeling approaches tested, the hybrid DOE–ML approach performed best. The best-performing model under the DOE–ML approach was the SVM with a cubic kernel. It reduced the modeling time by 83.8% while increasing the prediction error by merely 0.02%, which is not significant. A slight loss in prediction accuracy can therefore be accepted in return for a substantial reduction in the number of variables used, yielding substantial savings in modeling time and in the size of the required dataset.

