Abstract

Exploring and analyzing safety influencing factors can guide targeted traffic safety management. Traditional traffic safety models typically address specific data problems by adjusting the model structure; they place little emphasis on predictive ability and yield limited information about influencing factors. In recent years, machine-learning methods have opened new avenues in modeling: they offer higher prediction accuracy, can identify complex nonlinear relationships, and can overcome over- and under-dispersion and correlation. Machine-learning methods, however, suffer from limited interpretability. The interpretable machine-learning framework SHAP (SHapley Additive exPlanations) offers an effective solution: it can both reflect the influence of each feature in each sample and generate a global interpretation. This study established gradient boosting models, namely CatBoost and XGBoost, as traffic safety models and compared them with a traditional negative binomial (NB) regression model and a zero-inflated negative binomial (ZINB) regression model. SHAP was used to analyze several safety influencing factors, including geometric design features, traffic operation characteristics, time of day, and land use. Results confirmed that the CatBoost model has better prediction ability and is a more suitable traffic safety model than the traditional NB regression model. Among the key findings were that ramp type is the most important factor in freeway crash frequency; curve presence has a strong positive impact, while truck proportion has a strong negative impact; and traffic volume is highly correlated with truck proportion. These findings can provide theoretical support for safety operation management and targeted improvement measures for freeways.
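
The two levels of interpretation attributed to SHAP above (per-sample feature influence and a global summary) can be illustrated with a minimal sketch that computes exact Shapley values by enumerating feature coalitions. Everything in it is an assumption made for illustration: `toy_model`, its coefficients, the three feature names, the sample values, and the use of a single baseline point instead of a background dataset. It does not reproduce the study's CatBoost or XGBoost models, its data, or the SHAP library's tree-based algorithm.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy "crash frequency" model; coefficients are illustrative only
# and are NOT taken from the study.
def toy_model(x):
    curve, truck_share, volume = x
    return 2.0 + 1.5 * curve - 3.0 * truck_share + 0.5 * volume + 1.0 * curve * volume

def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline point.

    Features outside a coalition are replaced by their baseline values
    (a simplification; SHAP usually averages over a background dataset).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi

# Hypothetical road segments: (curve present, truck proportion, scaled traffic volume)
samples = [(1, 0.10, 2.0), (0, 0.30, 1.0), (1, 0.20, 3.0)]
# Baseline: feature-wise mean of the samples (an assumption of this sketch)
baseline = tuple(sum(col) / len(samples) for col in zip(*samples))

# Local interpretation: one attribution vector per sample
local = [shapley_values(toy_model, x, baseline) for x in samples]

# Global interpretation: mean absolute attribution per feature
global_importance = [
    sum(abs(phi[i]) for phi in local) / len(local) for i in range(len(baseline))
]

for x, phi in zip(samples, local):
    print(x, [round(v, 3) for v in phi])
print("global importance:", [round(v, 3) for v in global_importance])
```

For each sample, the attributions sum to the difference between the model's prediction for that sample and its prediction at the baseline, which is what makes the per-sample values additive. Averaging their absolute values across samples gives one simple global ranking of features.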
