Crash data augmentation using variational autoencoder

Zubayer Islam, Mohamed Abdel-Aty, Qing Cai, Jinghui Yuan

doi:10.1016/j.aap.2020.105950

Abstract

In this paper, we present a data augmentation technique to reproduce crash data. Datasets comprising crash and non-crash events are extremely imbalanced. For instance, the dataset used in this paper consists of only 625 crash events against over 6.5 million non-crash events. As a result, learning algorithms tend to perform poorly on these datasets. We used a variational autoencoder (VAE) to encode all the events into a latent space. After training, the model could successfully separate crash and non-crash events. To generate data, we sampled from the latent space containing crash data. The generated data were compared with the real data from different statistical aspects. The t-test, Levene's test and the Kolmogorov-Smirnov test showed that the generated data were statistically similar to the real data. The generated data were also compared with data from minority oversampling techniques such as SMOTE and ADASYN, as well as from the GAN framework for generating data. Crash prediction models based on Logistic Regression (LR), Support Vector Machine (SVM) and Artificial Neural Network (ANN) were used to compare the generated data from the different oversampling techniques. Overall, the VAE showed excellent results compared to the other data augmentation methods. Specificity improved by 8% and 4% for VAE-LR and VAE-SVM, respectively, when compared to SMOTE, while sensitivity improved by 6% and 5% when compared to ADASYN. Moreover, VAE-generated data also help to overcome the overfitting problem in SMOTE and ADASYN, since there is flexibility in choosing the decision boundary.
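As an illustration of why such imbalance hurts learning algorithms and why the abstract reports sensitivity and specificity instead of accuracy, the following is a minimal sketch, not the authors' evaluation code. The helper `rates` and the rounded non-crash count of 6,500,000 are assumptions made for this example (the abstract only says "over 6.5 million"). It evaluates a trivial classifier that labels every event as non-crash.

```python
def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    Hypothetical helper for illustration; crash is treated as the positive class.
    """
    sensitivity = tp / (tp + fn)   # share of real crashes that were detected
    specificity = tn / (tn + fp)   # share of real non-crashes correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


N_CRASH = 625              # crash events, as stated in the abstract
N_NON_CRASH = 6_500_000    # assumption: "over 6.5 million" rounded down

# A trivial model that predicts "non-crash" for every event:
# all crashes become false negatives, all non-crashes become true negatives.
sens, spec, acc = rates(tp=0, fn=N_CRASH, tn=N_NON_CRASH, fp=0)

print(f"sensitivity={sens:.4f} specificity={spec:.4f} accuracy={acc:.6f}")
```

Under these assumed counts the trivial model is almost perfectly accurate yet detects no crashes at all, which is the failure mode that oversampling the minority class is meant to counter.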

Journal: Accident Analysis and Prevention
Publication Date: Dec 25, 2020
Citations: 115

Similar Papers

Motor Vehicle Occupant Injuries to Children in Crash and Noncrash Events
Phyllis A Agran ... Debora E Dunkle
Pediatrics | VOL. 70
01 Dec 1982

Motor Vehicle Occupant Injuries in Noncrash Events
Phyllis F Agran
Pediatrics | VOL. 67
01 Jun 1981

A comparison of data augmentation methods in voice pathology detection
Farhad Javanmardi ... Paavo Alku
Computer Speech & Language | VOL. 83
11 Aug 2023

Data Augmentation for Building Footprint Segmentation in SAR Images: An Empirical Study
Sandhi Wangiyana ... Artur Gromek
Remote Sensing | VOL. 14
22 Apr 2022
