Abstract

Crash severity has been studied extensively, and numerous methods have been developed to investigate the relationship between crash outcome and explanatory variables. Crash severity data are often highly imbalanced: most crashes fall in the Property-Damage-Only (PDO) category, and severe crashes make up only a small fraction of the observations. Many methods perform better on the outcome categories with the most observations than on the others, which often leads to high modeling accuracy for PDO crashes but poor accuracy for the other severity categories. This research introduces two ensemble methods, AdaBoost and Gradient Boosting, to model imbalanced crash severity data. It also adopts the F1 score, a performance metric better suited to imbalanced data than overall accuracy, for model selection. AdaBoost and Gradient Boosting are found to outperform the benchmark methods and to generate more balanced prediction accuracies across severity categories. Additionally, a global sensitivity analysis is used to determine the individual and joint impacts of explanatory factors on crash severity outcome. Vertical curve, seat belt use, accident type, road characteristics, and truck percentage are found to be the most influential factors. Finally, a simulation-based approach is used to study how the impact of a particular factor may vary across different value ranges.
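
To illustrate why overall accuracy can mislead on imbalanced severity data and why the F1 score is more informative, the following minimal sketch compares the two on a toy example. The class counts (90 PDO, 8 injury, 2 severe), the label names, and the helper functions `accuracy` and `per_class_f1` are hypothetical and are not taken from the paper. The abstract does not state how F1 is aggregated across categories, so the unweighted (macro) average used here is an assumption.

```python
def accuracy(y_true, y_pred):
    """Share of observations whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred):
    """One-vs-rest F1 for every label, using F1 = 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never correct gets F1 = 0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical imbalanced sample: most crashes are PDO, few are severe.
y_true = ["PDO"] * 90 + ["injury"] * 8 + ["severe"] * 2

# A trivial model that always predicts the majority category.
y_majority = ["PDO"] * len(y_true)

acc = accuracy(y_true, y_majority)
f1 = per_class_f1(y_true, y_majority)
macro_f1 = sum(f1.values()) / len(f1)  # assumed aggregation: unweighted mean

print(f"accuracy = {acc:.3f}")
print(f"per-class F1 = {f1}")
print(f"macro F1 = {macro_f1:.3f}")
```

In this toy case the always-PDO model reaches 90% accuracy while scoring an F1 of zero on both the injury and severe categories, which is the kind of imbalance in per-category performance that selecting models on F1 is meant to expose.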
