Abstract
Machine learning algorithms have been widely studied and applied in medical image classification. Most current algorithms assume balanced data, yet medical data in practice are often imbalanced. Traditional data augmentation approaches can rebalance the original data, but they cannot generate more effective features. Generative Adversarial Networks (GANs) can perform sample augmentation effectively; however, when trained on imbalanced data, GANs often suffer from intra-class mode collapse and generate noisy samples. This study proposes a GAN model, IBGAN, that focuses on generating intra-class sparse samples and class boundary samples. IBGAN addresses data imbalance in two stages. In the first stage, the Isolation Forest (iForest) algorithm and a boundary sample detection algorithm identify sparse-region samples and boundary samples within each class, so that the GAN can focus on generating such samples during training. The second stage refines the generated samples: we propose a sample evaluation method based on Support Vector Data Description (SVDD) that filters out noisy generated samples to ensure the quality of the generated data. We conducted extensive experiments on five real-world datasets. The results demonstrate that IBGAN generates high-quality and diverse augmented samples that improve classifier performance. We compared the proposed method with seven baseline methods, including traditional approaches and classic GAN models. Both the visualization of the generated data and the evaluation of classification performance consistently show that the proposed method achieves more competitive results.
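The two stages described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. Stage one is approximated by a small isolation forest that scores how easily each minority-class sample is isolated (higher score suggests a sparser region). Stage two swaps in a much simpler technique than SVDD: a hard hypersphere around the class centroid, which only mimics the idea of an enclosing boundary without a kernel or slack variables. The GAN training step and the boundary sample detection algorithm are omitted. All function names, parameters, and the toy data are assumptions made for illustration.

```python
import math
import random

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Grow one isolation tree using random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return ("leaf", len(points))
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return ("leaf", len(points))
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, p, depth=0):
    """Depth at which p lands in a leaf, plus a correction for the leaf size."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if p[dim] < split else right, p, depth + 1)

def sparsity_scores(points, n_trees=100, sample_size=64, seed=0):
    """Isolation-forest anomaly score in (0, 1]; higher means easier to isolate."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    m = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(m))
    trees = [build_tree(rng.sample(points, m), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(m)
    return [2.0 ** (-(sum(path_length(t, p) for t in trees) / n_trees) / norm)
            for p in points]

def sphere_filter(real, generated, quantile=0.95):
    """Simplified stand-in for SVDD: keep generated points that fall inside a
    sphere centred on the real-class centroid, with radius set by a quantile
    of the real samples' distances to that centroid."""
    dims = len(real[0])
    center = [sum(p[d] for p in real) / len(real) for d in range(dims)]
    dists = sorted(math.dist(center, p) for p in real)
    radius = dists[min(len(dists) - 1, int(quantile * len(dists)))]
    return [g for g in generated if math.dist(center, g) <= radius]

# Toy minority class (made-up data): a dense cluster plus one sample far from it.
data_rng = random.Random(42)
minority = [(data_rng.gauss(0.0, 0.3), data_rng.gauss(0.0, 0.3)) for _ in range(60)]
minority.append((2.5, 2.5))

# Stage 1: rank samples by sparsity; the highest-scoring ones would be emphasised.
scores = sparsity_scores(minority)
sparsest_index = scores.index(max(scores))

# Stage 2: filter hypothetical generator outputs against the class boundary.
candidates = [(0.1, 0.0), (5.0, 5.0)]
kept = sphere_filter(minority, candidates)
```

In this toy setup the isolated sample receives the highest sparsity score, and the candidate far from the class is discarded by the filter. A kernel-based one-class model would give a tighter, non-spherical boundary than the centroid sphere used here.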