Articles published on Crash prediction
595 Search results
- Research Article
- 10.1016/j.aap.2025.108352
- Mar 1, 2026
- Accident; analysis and prevention
- Junda Huang + 5 more
Real-time highway crash prediction based on SHAP-RFECV and electronic toll collection data: a new feature selection strategy.
- Research Article
- 10.1016/j.aap.2025.108385
- Mar 1, 2026
- Accident; analysis and prevention
- Amin Keramati + 2 more
Time-to-event crash severity prediction at highway-rail grade crossings with monotonic neural networks.
- Research Article
- 10.1016/j.aap.2025.108387
- Mar 1, 2026
- Accident; analysis and prevention
- Ling Deng + 5 more
Spatial-temporal gated transformer network for freeway secondary crash prediction considering the impact of class imbalance.
- Research Article
- 10.1016/j.trc.2025.105472
- Feb 1, 2026
- Transportation Research Part C: Emerging Technologies
- Samgyu Yang + 2 more
Crash prediction under limited CV coverage: an ensemble deep learning model integrating multi-source traffic data
- Research Article
- 10.3390/systems14010104
- Jan 19, 2026
- Systems
- Zhe Zhang + 6 more
The rapid advancement of autonomous vehicle systems (AVS) has introduced complex challenges to road safety. While some studies have investigated the contribution of factors influencing AV-involved crashes, few have focused on the impact of vehicle-specific factors within AVS on crash outcomes, a focus that gains importance due to the absence of a human driver. To address this gap, the advanced machine learning algorithm, LightGBM (v4.4.0), is employed to quantify the potential effects of vehicle factors on crash severity and collision types based on the Autonomous Vehicle Operation Incident Dataset (AVOID). The joint effects of different vehicle factors and the interactive effects of vehicle factors and environmental factors are studied. Compared with other frequently utilized machine learning techniques, LightGBM demonstrates superior performance. Furthermore, the SHapley Additive exPlanation (SHAP) approach is employed to interpret the results of LightGBM. The analysis of crash severity revealed the importance of investigating the vehicle characteristics of AVs. Operator type is the most predictive factor. For road types, highways and streets show a positive association with the model’s prediction of serious crashes. Crashes involving vulnerable road users can be attributed to different factors. The road type is the most significant factor, followed by precrash speed and mileage. This study identifies key predictive associations for the development of safer AV systems and provides data-driven insights to support regulatory strategies for autonomous driving technologies.
- Research Article
- 10.1080/19427867.2026.2613231
- Jan 11, 2026
- Transportation Letters
- Jingyang Li + 4 more
Traffic crash analysis of mountainous freeways frequently relies on imbalanced data, hindering effective prediction of fatal crashes. Previous research has not fully explored performance improvement for fatal crash prediction, leading to insufficient model performance. This study proposes a two-stage prediction (TSP) framework based on Stacked Sparse Autoencoder (SSAE), which sequentially classifies each injury severity to enhance prediction performance. First, K-means is used for unsupervised clustering to improve data homogeneity. The Adaptive Synthetic Sampling Approach (ADASYN) balances the dataset by increasing sample size. Then, LightGBM with Partial Dependence Plot (PDP) analysis identifies key features and reveals their nonlinear relationships with injury severity. The TSP-SSAE model is compared with Support Vector Machine (SVM), LightGBM, and Deep Neural Network (DNN). Results show TSP-SSAE achieves higher accuracy, precision, recall, and F1-score. It effectively handles extreme data imbalance and improves predictive performance, particularly enhancing fatal crash prediction accuracy, thereby providing insights for traffic safety management.
- Research Article
- 10.3390/safety11040121
- Dec 9, 2025
- Safety
- Naima Goubraim + 3 more
Road traffic crashes are a major global challenge, resulting in significant loss of life, economic burden, and societal impact. This study seeks to enhance the precision of traffic accident prediction using advanced machine learning techniques. This study employs an ensemble learning approach combining the Random Forest, the Bagging Classifier (Bootstrap Aggregating), the Extreme Gradient Boosting (XGBoost) and the Light Gradient Boosting Machine (LightGBM) algorithms. To address class imbalance and feature relevance, we implement feature selection using the Extra Trees Classifier and oversampling using the Synthetic Minority Over-sampling Technique (SMOTE). Rigorous hyperparameter tuning is applied to optimize model performance. Our results show that the ensemble approach, coupled with hyperparameter optimization, significantly improves prediction accuracy. This research contributes to the development of more effective road safety strategies and can help to reduce the number of road accidents.
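The oversampling step named in this abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is not the authors' code. The function name `smote_sketch`, the toy points, and the parameters `k` and `seed` are assumptions made for illustration only.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration; real studies typically use a
    library implementation rather than this simplified version.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        partner = rng.choice(neighbours)
        # Random position on the segment between base and partner
        gap = rng.random()
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, partner)))
    return synthetic


# Toy minority-class feature vectors (assumed values, not from the study)
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6)
```

Because every synthetic point is an interpolation between two existing minority samples, it always stays inside the bounding box of the minority class.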
- Research Article
- 10.55329/mpaf3385
- Nov 7, 2025
- Traffic Safety Research
- Armin Kollascheck + 5 more
Turbo roundabouts are a relatively new design. Thus far, research on their safety has focused on comparisons with other types of intersection and the effects of physical lane dividers. This study investigates crash patterns at German turbo roundabouts, based on a largely complete sample of such roundabouts and detailed data on infrastructure characteristics, traffic volumes and crashes. We calculate the crash rates for turbo roundabouts as a whole, as well as for their individual elements, including entries and exits, and the circulatory roadway. In addition, crash constellations are analysed and crash prediction models are computed for all crashes and for the two most relevant crash constellations: right-of-way and rear-end crashes. The results confirm earlier research findings that turbo roundabouts effectively combine high capacity with high safety levels. The results add new insights, thanks to the more detailed analysis. Entries and right-of-way crashes are most relevant, followed by rear-end crashes, which mainly occur on the circulatory roadway. Crash constellations differ significantly between the different elements of the roundabouts, and traffic volumes increase crash numbers for all designs and elements. We find no significant effects of the different types of marked lane dividers in our sample; this suggests that drivers do not respect solid lines and that physical dividers are potentially needed to prevent drivers from changing lanes in the circulatory roadway. The detailed analysis of crash patterns in this study enables specific recommendations to be made for improving safety at turbo roundabouts further and exploiting their potential more effectively.
- Abstract
- 10.1093/eurpub/ckaf165.073
- Nov 1, 2025
- The European Journal of Public Health
- P Koliou + 1 more
Background: The surge in smartphone-based telematics has opened new avenues in traffic safety research, enabling granular, real-time monitoring of driving behavior.
Objectives: This study investigates the relationship between unsafe driving events (particularly harsh braking and acceleration) and crash occurrences in urban environments. By integrating data from smartphone applications, traffic control systems, and digital maps, the research aims to identify high-risk urban junctions, understand key behavioral and infrastructural risk factors, and inform targeted safety interventions.
Methods: Data were collected via the OSeven smartphone application from over 300 drivers operating across Mesogeion and Vouliagmenis Avenues in Athens, capturing more than 10,000 harsh acceleration and braking events. This behavioral data was enriched with traffic metrics (e.g., speed, occupancy, flow) from 26 sensor loops, and road infrastructure characteristics (e.g., lane counts, entrance/exit presence) from Google Maps. The study employed a multi-method framework, including:
• Clustering Analysis: K-Means, DBSCAN, Hierarchical Clustering, and Gaussian Mixture Models (GMM) to identify patterns of unsafe driving.
• Machine Learning: A Random Forest Regressor and SHAP values for feature importance and crash risk prediction.
• Statistical Analysis: Generalized Linear Models (GLM) and descriptive statistics to understand correlations.
• Dimensionality Reduction: Principal Component Analysis (PCA) for model simplification and insight clarity.
• Spatial Analysis: Local Moran’s I and Geary’s C to identify high-risk clusters and outlier locations.
To address class imbalance in crash prediction, SMOTE (Synthetic Minority Over-Sampling Technique) was applied.
Results: Clustering revealed three dominant junction profiles, with one cluster (Cluster 1) showing high frequencies of harsh braking and high crash incidence, indicating urgent need for intervention. Random Forest models identified mean speed difference, braking probability (Prob_Brk), and braking frequency (Mod_Freq_Brk) as the strongest predictors of crash likelihood, collectively accounting for over 85% of model importance. Spatial analysis using Moran’s I and Geary’s C highlighted specific junctions (e.g., JM1, JM14, JM17) as hotspots of unsafe events, while also revealing spatial outliers with abnormal patterns. Multicollinearity analysis flagged high VIF scores for braking-related features (>75), suggesting strong internal correlation and justifying the use of PCA to ensure robust modeling.
Conclusions: This research demonstrates the value of smartphone-derived telematics in enhancing urban traffic safety. By combining advanced statistical and machine learning techniques, it presents a scalable, real-time solution for identifying high-risk locations and understanding the behavioral underpinnings of road crashes.
Key messages:
• Driver behavior, especially harsh braking, is a more significant crash predictor than road design or traffic volume alone.
• Clustering and spatial analysis enable a proactive approach to safety interventions, rather than waiting for crash reports.
• The methodologies used can guide the design of smarter infrastructure, driver feedback systems, and targeted enforcement strategies.
Future work should incorporate additional data sources (e.g., weather, lighting conditions), expand spatial scope, and experiment with deep learning methods for more nuanced behavioral insights.
Topic: Telematics Data, Crash Prediction, Urban Traffic Safety.
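The spatial step in this abstract relies on Local Moran’s I. A minimal sketch of that statistic is given below, assuming the common form I_i = (z_i / m2) · Σ_j w_ij z_j, where z is the deviation from the mean and m2 is the mean squared deviation. The junction values and the binary chain adjacency are hypothetical and do not come from the study; the function name `local_morans_i` is likewise an assumption.

```python
def local_morans_i(values, weights):
    """Local Moran's I for each location.

    values  : list of observations (e.g. harsh-braking counts per junction)
    weights : square matrix, weights[i][j] > 0 if j is a neighbour of i
    Assumes the variant that divides by m2 = sum(z^2) / n.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    return [
        (z[i] / m2) * sum(weights[i][j] * z[j] for j in range(n) if j != i)
        for i in range(n)
    ]


# Hypothetical counts at four junctions along one corridor
counts = [10, 9, 1, 2]
# Binary adjacency: each junction neighbours the next one along the road
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
local_i = local_morans_i(counts, adjacency)
```

A positive value marks a location whose neighbours deviate from the mean in the same direction (a high-high or low-low cluster), while a negative value marks a spatial outlier.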
- Research Article
- 10.1186/s12245-025-01008-w
- Oct 14, 2025
- International Journal of Emergency Medicine
- Welawat Tienpratarn + 5 more
Background: Traumatic brain injury (TBI) is a significant health concern, with intracranial haemorrhage (ICH) being a common complication following injury. The CRASH prediction model plays a crucial role in clinical prognostication and decision-making within this patient group. However, external validation is critical to ensure the model’s validity and applicability across different populations and settings beyond those in which it was originally developed. This study aimed to validate the CRASH prediction model for 14-day mortality among TBI patients with ICH presenting to a Thai emergency department.
Methods: This retrospective study included adult TBI patients with ICH who visited the emergency department (ED) at Ramathibodi Hospital, Thailand, between 2020 and 2022. The Basic model, which incorporates age, Glasgow Coma Scale (GCS) score (3–15), pupillary reaction, and major extracranial injury, and the CT model, which extends the Basic model by including CT findings, were evaluated for their discriminative ability and calibration.
Results: A total of 232 patients were included in the validation dataset. Significant differences in clinical characteristics were observed between the datasets, including older age, predominance of mild TBI, subarachnoid hemorrhage, and non-evacuated hematoma in the validation dataset. The observed 14-day mortality rate in this cohort was 9.1%, compared to 20.7% in the development dataset. The area under the receiver operating characteristics curve (AuROC) was 0.92 (95% CI: 0.84, 1.00) for the Basic model and 0.93 (95% CI: 0.86, 1.00) for the CT model. However, the calibration for both models was fair. Recalibration achieved better predictive accuracy and reduced overestimation in high-risk groups.
Conclusion: The original CRASH prediction model demonstrates strong discriminative ability for predicting 14-day mortality in TBI patients; however, significant miscalibration was observed. Recalibration was therefore undertaken to improve the model’s generalisability to local populations. Nonetheless, further studies are warranted to confirm the consistency and applicability of the recalibrated models.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12245-025-01008-w.
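The distinction this abstract draws between discrimination and calibration can be made concrete with a small sketch. It computes the AUROC as the share of (death, survivor) pairs in which the death received the higher predicted risk, and a simple observed-to-expected ratio as one crude calibration summary. The outcome and risk values are invented for illustration, and both function names are assumptions rather than anything from the study.

```python
def auroc(outcomes, risks):
    """AUROC via pairwise comparison; ties count as half a win."""
    pos = [r for y, r in zip(outcomes, risks) if y == 1]
    neg = [r for y, r in zip(outcomes, risks) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def observed_expected_ratio(outcomes, risks):
    """Observed events divided by the sum of predicted risks.

    Values below 1 suggest the model overestimates risk on average.
    """
    return sum(outcomes) / sum(risks)


# Hypothetical patients: 1 = died within 14 days, with model-predicted risks
outcomes = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
auc = auroc(outcomes, risks)
oe = observed_expected_ratio(outcomes, risks)
```

A model can rank patients well (high AUROC) and still be miscalibrated, which is why the two measures are reported separately.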
- Research Article
- 10.1038/s41467-025-64574-w
- Oct 7, 2025
- Nature Communications
- Yang Zhao + 4 more
Predicting expected traffic crashes and designing targeted interventions are highly challenging due to the inherent complexity of crash data and persistent concerns over the prediction trustworthiness. We introduce SafeTraffic Copilot, which adapts Large Language Models (LLMs) to perform expected crash prediction as a text-reasoning task and then attributes critical features for targeted safety interventions. Within the Copilot, SafeTraffic LLM is customized and then fine-tuned on the textualized SafeTraffic Event dataset, which consists of 66,205 real-world crash cases with 14.5 million words from five U.S. states. Across multiple prediction tasks including crash type, severity, and number of injuries, SafeTraffic LLM demonstrates a 33.3% to 45.8% improvement in average F1-score over existing works. To interpret these results and inform safety interventions, we introduce SafeTraffic Attribution, a sentence-level feature-attribution framework enabling conditional “what-if” risk analysis. Findings reveal that alcohol-impaired driving is the leading factor for severe crashes, with impairment-related and aggressive behaviors contributing nearly three times more risk than other behaviors. Furthermore, SafeTraffic Attribution identifies critical features during fine-tuning, guiding crash data collection strategies for continual improvement. SafeTraffic Copilot enables prediction and reasoning of conditional crash risks through foundation models, thereby supporting traffic safety improvements and offering clear advantages in generalization, adaptation, and trustworthiness.
- Research Article
- 10.7717/peerj-cs.3131
- Oct 2, 2025
- PeerJ Computer Science
- Nusrat Jahan + 4 more
Road crashes have been viewed as one of the major issues leading to numerous economic losses, health problems, and fatalities, which are often due to driver actions (DA). Predicting effective DA for road crashes is crucial for developing effective intelligent transportation systems. The research community focused on transportation safety has made significant advancements in utilizing machine learning models to examine crash incidents in recent years. The application of various machine learning (ML) models has been widespread, but the specific focus on assessing DA has received relatively little attention. The article aims to propose a hybrid genetic algorithm combined with artificial neural network (GA-ANN) ML model to predict risky DA related to road accidents considering effective sampling strategies. This article also proposes a novel sampling strategy, named DBSTLink, that combines Density-Based Spatial Clustering of Applications with Noise (DBSCAN) with Synthetic Minority Oversampling Technique (SMOTE)-Tomek Links: DBSCAN purifies the dataset of noise and outliers, and SMOTE-Tomek Links balances the class distribution by oversampling minority classes and deleting overlaps, to enhance classifier accuracy. This method is then compared with other sampling strategies such as SMOTE, SMOTE-Tomek Links, and DBSM (DBSCAN with SMOTE). The objective of this study is to strengthen the existing knowledge of crash probability by examining the influence of various data balancing approaches, including the proposed one, on F1-score, Matthews correlation coefficient (MCC), and G-mean. The results demonstrate that DBSTLink gives higher performance than the other sampling strategies. The proposed hybrid GA-ANN machine learning model achieved an accuracy of 99%, an F1-score of 98%, and a recall of 99%. Additionally, it achieved a G-mean of 98% and an MCC of 96%. The research found the important attributes of DA that are responsible for road crashes.
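Two of the imbalance-aware metrics reported here, G-mean and MCC, can be computed directly from a confusion matrix. The sketch below assumes a binary classification setting, which may differ from the multi-class setup of the study; the counts are made up and the function names are hypothetical.

```python
import math


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity (binary case)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def mcc(tp, fn, tn, fp):
    """Matthews correlation coefficient (binary case)."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator


# Hypothetical confusion-matrix counts (not from the study)
tp, fn, tn, fp = 90, 10, 80, 20
g = g_mean(tp, fn, tn, fp)
m = mcc(tp, fn, tn, fp)
```

Unlike plain accuracy, both metrics drop sharply when a classifier ignores the minority class, which is why they are often preferred for imbalanced crash data.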
- Research Article
- 10.4314/njt.v44i2.5
- Sep 30, 2025
- Nigerian Journal of Technology
- O Bayode + 3 more
Road traffic crash prediction (RTCP) is a critical aspect of transportation safety, enabling the identification of high-risk locations and informing the implementation of proactive measures. This study explores the comparative performance of Machine Learning (ML) algorithms and traditional Safety Performance Functions (SPFs) to predict road traffic crashes along the Lagos-Ibadan Expressway, a major highway in Nigeria known for its high crash rates. To achieve this objective, SPFs estimated using Negative Binomial Regression (NBR) and ML regression models, namely Support Vector Machine (SVM), Random Forest (RF) and Extreme Gradient Boosting (XGBoost), were developed using historical crash data collected from the Federal Road Safety Commission (FRSC) of Nigeria for the 10-year period from 2014 to 2023, with traffic components and geometric design features as input variables. The study's findings indicate that ML algorithms outperform SPFs in terms of predictive accuracy and sensitivity to complex, non-linear relationships among crash-contributing factors, with R² values of 0.99, 0.97 and 0.84 on the training dataset and 0.93, 0.9 and 0.76 on the testing dataset for the three ML models. However, SPFs remain advantageous in interpretation and ease of implementation. The analysis also highlights the importance of feature selection, with variables such as traffic volume, traffic speed, road curvature and pavement width emerging as significant predictors. Furthermore, this study offers insights for policymakers, traffic engineers, and researchers seeking to improve road safety outcomes through data-driven crash prediction methods. The results emphasize the potential of integrating ML techniques with traditional methods to develop hybrid frameworks for enhanced crash prediction and prevention strategies on high-risk roadways.
- Research Article
- 10.1080/19439962.2025.2554097
- Sep 16, 2025
- Journal of Transportation Safety & Security
- Bo Yang + 5 more
Secondary crashes can lead to great casualties and economic losses. Although nonparametric models have achieved good results on crash prediction in previous studies, neither single nonparametric models nor hybrid nonparametric models have made full use of the different advantages of machine learning and deep learning methods. In this study, a hybrid model was developed by combining the Mixture of Experts (MoE) and the Light Gradient Boosting Machine (LightGBM). The original voting classifier cannot directly accept deep learning models as the base classifier. Thus, a new voting classifier was constructed using MoE. The traffic data were analyzed and modeled through the LightGBM, and the advanced features in the data were learned through the MoE. Results showed that the hybrid model developed in this study has better predictive performance in terms of AUC, specificity, and sensitivity than single machine learning models and other hybrid models. The results of the ablation experiment verified the validity of the submodels in the hybrid model, and the hybrid model obtained better prediction performance with the original sampling than with SMOTE sampling or undersampling. In addition, SHapley Additive exPlanation was used to analyze the secondary crash impact features.
- Research Article
- 10.24017/science.2025.2.10
- Sep 6, 2025
- Kurdistan Journal of Applied Research
- Mariwan Askander Abdulla + 1 more
Nowadays, highway safety is a vital issue because vehicle crashes cause tremendous human, economic, social, and environmental losses. This study assesses the safety performance of intersections in the Sulaimani urban street network, where the number of vehicles has been growing rapidly, as a case study. Crash prediction models were developed and applied to assess the safety performance of the intersections. The crash data, covering January 2020 to September 2024, were obtained from the Sulaimani traffic police station. Besides the crash prediction models given in the Highway Safety Manual (HSM), local crash prediction models were developed for each selected intersection type, and the models were then used as tools for assessing intersection safety performance. To determine the intersections' risk levels, five safety performance approaches were used, namely Level of Service of Safety, Excess Predicted Average Crash Frequency using Safety Performance Function, Expected Average Crash Frequency with Empirical Bayes (EB) Adjustment, Equivalent Property Damage Only with EB Adjustment, and Excess Expected Average Crash Frequency with EB Adjustments. The results indicate that the local prediction model has a higher R² than the HSM model, indicating a better fit to the local traffic and road conditions. Specifically, at four-leg signalized intersections, the local model achieved an R² value of 0.618, which is substantially higher than the 0.208 obtained from the HSM models. Moreover, the results show that four-leg signalized intersections have significantly higher crash rates, with 15 intersections identified as high-risk across both models. The findings offer practical insights for prioritizing safety improvements and resource allocation to enhance traffic safety in urban areas.
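Several of the screening measures listed in this abstract use the Empirical Bayes (EB) adjustment. A minimal sketch of the site-level form commonly associated with the HSM is shown below: the expected crash frequency is a weighted average of the SPF prediction and the observed count, with weight w = 1 / (1 + k · N_predicted), where k is the SPF overdispersion parameter and N_predicted is summed over the study period. The numeric inputs are invented and the function name is an assumption; the study's own parameters are not given in the abstract.

```python
def eb_expected_crashes(n_predicted, n_observed, overdispersion_k):
    """Empirical Bayes expected crash frequency for one site.

    n_predicted      : SPF-predicted crashes summed over the study period
    n_observed       : observed crashes over the same period
    overdispersion_k : overdispersion parameter of the SPF
    """
    weight = 1.0 / (1.0 + overdispersion_k * n_predicted)
    return weight * n_predicted + (1.0 - weight) * n_observed


# Hypothetical intersection: SPF predicts 2 crashes, 6 were observed, k = 0.5
expected = eb_expected_crashes(n_predicted=2.0, n_observed=6.0, overdispersion_k=0.5)
```

With these inputs the weight is 0.5, so the estimate falls halfway between the prediction and the observation. A larger overdispersion parameter shifts more weight toward the observed count.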
- Research Article
- 10.1016/j.cstp.2025.101530
- Sep 1, 2025
- Case Studies on Transport Policy
- Ali Ahmed Mohammed + 6 more
Management and prediction of traffic crashes on residential streets in Iraq using the expert system (MPTCRSI-ES)
- Research Article
- 10.1016/j.aap.2025.108169
- Sep 1, 2025
- Accident; analysis and prevention
- Ling Deng + 4 more
Interpretable multi-variable transformer network for regional-level short-term bicycle crash risk prediction.
- Research Article
- 10.3390/infrastructures10080216
- Aug 18, 2025
- Infrastructures
- Yubo Wang + 3 more
Identification of causal factors in traffic crashes has always been a significant challenge in road safety studies. Traditional crash prediction models are limited in elucidating the underlying causal mechanisms in road crashes. This research explores the application of three graphic models, namely, the Gaussian graphical model (GGM), causal Bayesian network (CBN) and graphic extreme gradient boosting (XGBoost), through a case study using highway–railroad-grade crossing (HRGC) inventory and collision data from Canada. The three modelling approaches have generally yielded consistent findings on various risk factors such as crossing control type, track angle, and exposure, showing their potential for identifying causal relationships through the interpretation of causal graphs. With the ability to make better causal inferences from crash data, the effectiveness of safety countermeasures could be more accurately and reliably estimated.
- Research Article
- 10.1016/j.aap.2025.108078
- Aug 1, 2025
- Accident; analysis and prevention
- Oluwaseun Olufowobi + 3 more
Safety effectiveness of forward collision warning systems in the vehicle fleet: A driving simulation study.
- Research Article
- 10.1016/j.aap.2025.108074
- Aug 1, 2025
- Accident; analysis and prevention
- Wei Zhang + 1 more
Interchange configurations safety comparison tool.