The growing presence of autonomous vehicles (AVs) in transportation, driven by advances in AI and robotics, requires a strong focus on safety in mixed-traffic environments to promote sustainable transportation systems. This study analyzes AV crashes in California using machine learning to identify patterns among crash factors. The main objective is to explore AV crash mechanisms by extracting association rules and developing a decision tree model that captures interactions among pre-crash conditions, driving states, crash types, severity, locations, and other variables. A multi-faceted approach combining statistical analysis, data mining, and machine learning was used to model crash types. The SMOTE method was used to address data imbalance, and the analysis applied CART, random forest (RF), and XGBoost (XGB) models together with the Apriori algorithm, SHAP, and Pearson’s test. Findings reveal that rear-end crashes are the most common, making up over 50% of incidents. Side crashes at night are also frequent, while angular and head-on crashes tend to be more severe. The study identifies high-risk locations, such as complex unsignalized intersections, and highlights the need for improved AV sensor technology, AV–infrastructure coordination, and driver training. Technologies such as vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication are suggested as having the potential to significantly reduce the number and severity of specific crash types, thereby enhancing the overall safety and sustainability of transportation systems.
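To make the association-rule step concrete, the following minimal sketch computes the support, confidence, and lift of a single candidate rule, which are the quantities an Apriori-style search typically thresholds on. It is an illustration only, not the study's pipeline: the records are toy, hypothetical values (not data from the study), and the attribute labels and the helper names `support` and `rule_metrics` are assumptions introduced here.

```python
# Toy, hypothetical crash records (NOT data from the study).
# Each record is a set of "attribute=value" items.
records = [
    {"mode=autonomous", "location=intersection", "type=rear_end"},
    {"mode=autonomous", "location=intersection", "type=rear_end"},
    {"mode=conventional", "location=intersection", "type=side"},
    {"mode=autonomous", "location=street", "type=rear_end"},
    {"mode=conventional", "location=street", "type=side"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def rule_metrics(antecedent, consequent, records):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_a = support(a, records)
    s_c = support(c, records)
    s_ac = support(a | c, records)
    # Confidence: how often the consequent holds when the antecedent holds.
    confidence = s_ac / s_a if s_a else 0.0
    # Lift: confidence relative to the consequent's baseline frequency.
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Hypothetical rule: autonomous mode -> rear-end crash.
metrics = rule_metrics({"mode=autonomous"}, {"type=rear_end"}, records)
print(metrics)
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than they would if they were independent; a full Apriori run would enumerate candidate itemsets and keep only the rules that pass minimum support and confidence thresholds.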