Abstract

Addressing the global challenge of traffic crashes requires moving beyond traditional statistical models, which often fail to fully capture the interactions among the factors that cause crashes. This limitation restricts the predictive accuracy and adaptability of current methodologies. In addition, little research has examined the links between behavior-cause relationships and crash injury severity. Our study applies Natural Language Processing (NLP) and the Frequent Pattern (FP) growth algorithm to mine crash narratives for behavior-cause connections. Combined with the predictive strength of eXtreme Gradient Boosting (XGBoost) and the interpretive clarity of SHapley Additive exPlanations (SHAP), our approach not only predicts crash injury severity with satisfactory precision but also explains the influence of specific behavior-cause relationships and environmental conditions on crash outcomes. The integration of NLP and XGBoost, complemented by SHAP insights, achieves an accuracy of 0.79, outperforming traditional discrete choice models and competing closely with other machine learning approaches, including Support Vector Machines, Random Forest, Categorical Boosting (CatBoost), and Light Gradient Boosting Machine (LightGBM). Through detailed textual analysis, we establish a behavior-cause matrix that links five broad crash causes to 141 specific crash causes with associated behaviors, and we uncover critical patterns such as the prominence of distracted driving in severe crashes. This comprehensive approach fills a critical research gap by linking behavior-cause relationships with injury severity and sets the stage for developing targeted interventions to enhance road safety.
