A novel generative adversarial network for improving crash severity modeling with imbalanced data

Junlan Chen, Ziyuan Pu, Nan Zheng, Xiao Wen, Hongliang Ding, Xiucheng Guo

doi:10.1016/j.trc.2024.104642

Abstract

Traffic crash data are often highly imbalanced: non-fatal crashes form the large majority, and fatal crashes make up only a small fraction. This imbalance poses a challenge for crash severity modeling, especially for classifying and interpreting fatal crashes from very limited samples. Data resampling techniques, such as under-sampling and over-sampling, are commonly used to rebalance the number of samples across the categories of a dataset. However, most traditional and deep learning-based resampling methods, e.g., the synthetic minority oversampling technique (SMOTE) and Generative Adversarial Networks (GAN), have difficulty handling both continuous and discrete risk factors in traffic crash datasets, because they are built upon smooth, continuous functions that are not applicable to discrete variables. Although some resampling methods can handle both continuous and discrete variables, they may suffer from mode collapse associated with sparse discrete risk factors: they oversample repetitive, similar samples and therefore fail to capture the diversity of the underlying data distribution. To address these issues, the current study proposes a traffic crash data generation method based on the Conditional Tabular GAN (CTGAN) that rebalances crash datasets to improve the performance of crash severity classification and interpretation. Experiments are designed to evaluate, for various resampling strategies, the contribution of the synthetic data to crash severity classification, the distribution consistency between synthetic and benchmark datasets, and parameter recovery (i.e., the accuracy of parameter estimation and probability prediction). A 4-year real-world dataset collected in Washington State, U.S., and Monte Carlo simulations are used in these experiments.
The results indicate that crash severity modeling using synthetic data generated by mixed resampling that combines CTGAN and random under-sampling (CTGAN-RU) outperforms all baseline methods. In addition, the proposed deep generative method maintains distribution consistency and achieves accurate parameter recovery. This study provides insights for traffic safety researchers and engineers into crash severity modeling, especially when handling imbalanced crash data of various types.
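
The mixed resampling idea described in the abstract (random under-sampling of the majority class combined with synthetic over-sampling of the minority class) can be sketched roughly as follows. This is a minimal illustration on toy data, not the authors' implementation: the function names, the column names, and the target class size are all assumptions, and `bootstrap_generator` is a placeholder that simply resamples existing rows where a trained CTGAN would generate new ones.

```python
import random
from collections import Counter


def bootstrap_generator(minority_rows, n, rng):
    """Hypothetical stand-in for a trained generative model such as CTGAN.

    It only copies existing minority rows (sampling with replacement), so it
    adds no new diversity; a real generator would synthesize new rows.
    """
    return [dict(rng.choice(minority_rows)) for _ in range(n)]


def mix_resample(rows, label_key, minority, target_per_class, synthesize, seed=0):
    """Rebalance a two-class dataset to roughly target_per_class rows per class."""
    rng = random.Random(seed)
    minority_rows = [r for r in rows if r[label_key] == minority]
    majority_rows = [r for r in rows if r[label_key] != minority]

    # Random under-sampling: keep a random subset of majority rows (no replacement).
    kept_majority = rng.sample(majority_rows, min(target_per_class, len(majority_rows)))

    # Over-sampling: ask the generator for as many rows as the minority class lacks.
    n_missing = max(0, target_per_class - len(minority_rows))
    synthetic = synthesize(minority_rows, n_missing, rng)

    return kept_majority + minority_rows + synthetic


# Toy crash records (assumed columns) mixing a continuous and a discrete risk factor.
data_rng = random.Random(42)
crashes = [
    {
        "severity": "fatal" if i < 5 else "non-fatal",
        "speed_mph": round(data_rng.uniform(20.0, 80.0), 1),
        "weather": data_rng.choice(["clear", "rain", "snow"]),
    }
    for i in range(100)
]

balanced = mix_resample(crashes, "severity", "fatal", 20, bootstrap_generator)
class_counts = Counter(r["severity"] for r in balanced)
print(class_counts)
```

In this sketch both classes end up with the same number of rows, and every original fatal record is retained. Swapping the placeholder for a generator that handles continuous and discrete columns jointly is the part the study addresses with CTGAN.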

More From: Transportation Research Part C

Similar Papers

Crash injury severity prediction considering data imbalance: A Wasserstein generative adversarial network with gradient penalty approach
Ye Li ... Haifei Yang
Accident Analysis & Prevention | VOL. 192
31 Aug 2023

Analysing the Severity and Frequency of Traffic Crashes in Riyadh City Using Statistical Models
Saleh Altwaijri ... Mohammed Quddus
International Journal of Transportation Science and Technology | VOL. 1
01 Dec 2012

Comparison of Cluster-Based Sampling Approaches for Imbalanced Data of Crashes Involving Large Trucks
Syed As-Sadeq Tahfim ... Yan Chen
Information | VOL. 15
05 Mar 2024

Crash sequence based risk matrix for motorcycle crashes
Kun-Feng Wu ... Sheng-Yin Chen
Accident Analysis and Prevention | VOL. 117
06 Apr 2018
