Abstract

Conventional traffic crash analysis methods often rely on highly aggregated data, which makes it difficult to understand the effects of time-varying factors on crash occurrence. This study investigated the combined effect of roadway geometry, speed distribution, and weather conditions on crash occurrence and severity using short-duration, daily-level crash data. Data were collected from four different sources for rural two-lane roadways in Texas. A machine learning method, XGBoost (eXtreme Gradient Boosting), was trained on the data. To mitigate the data imbalance problem, the synthetic minority oversampling technique (SMOTE) was applied. The XGBoost model was trained separately on all crash occurrences and on severe crash occurrences. Finally, an explainable artificial intelligence (AI) technique, SHAP (SHapley Additive exPlanations), was applied to investigate the contribution of each variable to the model’s output. The results show that annual average daily traffic has a significant impact on both all crash occurrences and severe crash (fatal and incapacitating injury) occurrences on rural two-lane roadways. Moreover, weather factors, including daily precipitation, average visibility, and the standard deviation of visibility, are associated with high crash occurrence. The short-duration crash prediction models developed in this study can provide further insight into the relationships among crashes, geometric variables, traffic exposure, weather, and operating speed.
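
The abstract does not describe how the oversampling step was implemented. As a rough illustration of the idea behind SMOTE only, the sketch below creates synthetic minority-class rows by interpolating between a randomly chosen minority row and one of its nearest minority neighbours. It is a simplified, standard-library version and not the study's code; the function name `smote`, the parameters `n_new`, `k`, and `seed`, and the example feature values are all hypothetical.

```python
import random
from math import dist


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic rows interpolated between minority rows.

    Simplified illustration of SMOTE; not the implementation used in the study.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority rows
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        # Random position on the segment from `base` towards `neighbour`
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical crash-day rows: (AADT in thousands, daily precipitation in inches)
crash_days = [(4.2, 0.0), (5.1, 0.8), (3.9, 1.2), (6.0, 0.3)]
new_rows = smote(crash_days, n_new=6)
```

Because each synthetic row lies on a line segment between two observed minority rows, its feature values stay within the range spanned by the observed crash days. In practice, a maintained library implementation would normally be preferred over a hand-written one.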
