Classification of ECG Signals Using GAN, SMOTE, and VAE Data Augmentation Methods: Synthetic vs. Real

Turgut Özseven

doi:10.17798/bitlisfen.1523524

Turgut Özseven

https://doi.org/10.17798/bitlisfen.1523524

Copy DOI

Export

Save

Cite

Abstract
Full-Text
Similar Papers

Abstract

Listen

Classification is separating data into predefined categories by obtaining descriptive features. In the classification process, both machine learning and deep learning algorithms assume that the class samples are evenly distributed. In particular, the dataset size used in deep learning is significant for classification success. However, obtaining balanced data distribution in real-life problems is very difficult. This negatively affects class-based accuracy. Various methods are used in the literature to overcome the unbalanced data problem. This study investigated the effects of GAN, SMOTE, and VAE methods on ECG data. For this purpose, the heartbeat signals in the MIT-BIH dataset were used. To test the performance of the methods, a performance comparison was made using real and synthetic data, and finally, the model trained with synthetic data was tested with real data. According to the results, 96.5% accuracy was obtained with the real data. The highest classification accuracy of 100.0% was obtained in VAE when using only synthetic data. In training with synthetic data and test results with real data, the highest classification success was 86.4% with SMOTE. When synthetic and real data sets are used together, the highest success rate is 98.6% with VAE. In addition, the accuracy of all classes is evenly distributed after data augmentation.

Full Text