Abstract
Classification assigns data to predefined categories on the basis of descriptive features. Both machine learning and deep learning algorithms assume that samples are evenly distributed across classes, and in deep learning the size of the dataset also strongly affects classification success. In real-life problems, however, a balanced data distribution is very difficult to obtain, which lowers class-based accuracy. Various methods have been proposed in the literature to overcome this imbalanced data problem. This study investigated the effects of three of them on electrocardiogram (ECG) data: generative adversarial networks (GAN), the synthetic minority oversampling technique (SMOTE), and variational autoencoders (VAE). The heartbeat signals in the MIT-BIH dataset were used for this purpose. The methods were compared by training and testing on real data, on synthetic data only, and on real and synthetic data combined; finally, a model trained on synthetic data was tested on real data. With real data alone, an accuracy of 96.5% was obtained. With synthetic data only, the highest classification accuracy was 100.0%, obtained with VAE. When the model was trained on synthetic data and tested on real data, the highest accuracy was 86.4%, obtained with SMOTE. When synthetic and real data were used together, the highest accuracy was 98.6%, obtained with VAE. In addition, accuracy was evenly distributed across all classes after data augmentation.
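To make the oversampling idea concrete, the following is a minimal sketch of the interpolation step that SMOTE is built on: a synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. It is not the implementation used in the study. The function name `smote_like_oversample`, its parameters, and the toy feature vectors are hypothetical and chosen only for illustration.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Hypothetical helper for illustration, not the study's code."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [s for s in minority if s is not base]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy feature vectors standing in for a rare heartbeat class (made-up numbers).
rare_beats = [[0.1, 0.9, 0.2], [0.2, 1.0, 0.3], [0.15, 0.8, 0.25], [0.3, 1.1, 0.2]]
new_beats = smote_like_oversample(rare_beats, n_new=6)
print(len(new_beats))
```

Because every synthetic sample is a convex combination of two real minority samples, it stays inside the region spanned by the minority class. GAN and VAE, by contrast, learn a generative model of the data and are not covered by this sketch.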