Learning from class-imbalanced data using misclassification-focusing generative adversarial networks

Jaesub Yun, Jong-Seok Lee

doi:10.1016/j.eswa.2023.122288

Abstract

This paper presents a novel end-to-end oversampling-classification approach for imbalanced data classification, which we refer to as the imbalanced data-classifying generative adversarial network (ImbGAN). ImbGAN embeds a classifier within a GAN and consists of five components: (1) a generator, (2) a discriminator, (3) a classifier, (4) storage for misclassified minority class data, and (5) storage for artificial minority class data. Through iterative interaction with the embedded classifier, the generator and discriminator produce artificial minority class instances that are similar to the minority class instances misclassified by the classifier, so the three networks are trained iteratively and simultaneously. The misclassified and artificial minority class instances are stored in the fourth and fifth components, respectively, and both stores are updated as iterations proceed. Our method obtains the final classification model from a single learning process, whereas most artificial data generation methods for imbalanced data classification require an additional step to train a classifier after the data are generated. Numerical experiments on tabular, image, and text datasets confirm that the proposed method outperforms well-known synthetic sampling methods.
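
The data flow among the five components can be illustrated with a minimal sketch. It is not the authors' implementation: the generator and discriminator are replaced by a simple stand-in that adds Gaussian noise to stored misclassified minority points, the classifier is a plain logistic regression, and every name and parameter (`imbgan_style_loop`, `rounds`, `per_round`, `noise`), as well as the choice to initialize the misclassified store with all minority instances, is an assumption made for illustration. Only the loop structure and the two stores follow the description above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, steps=200, lr=0.1):
    """Gradient-descent logistic regression (stand-in for the classifier network)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = sigmoid(Xb @ w)
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (sigmoid(Xb @ w) >= 0.5).astype(int)

def imbgan_style_loop(X_maj, X_min, rounds=5, per_round=20, noise=0.3, seed=0):
    """Hypothetical single-process loop: generate, train classifier, update stores."""
    rng = np.random.default_rng(seed)
    n_features = X_min.shape[1]
    # Component (4): misclassified minority data. Assumption: start with all
    # minority instances, since no classifier exists before the first round.
    misclassified_store = X_min.copy()
    # Component (5): artificial minority data, initially empty.
    artificial_store = np.empty((0, n_features))
    w = None
    for _ in range(rounds):
        # Stand-in for components (1) and (2): NOT a GAN, just Gaussian jitter
        # around points currently held in the misclassified store.
        idx = rng.integers(0, len(misclassified_store), size=per_round)
        jitter = noise * rng.standard_normal((per_round, n_features))
        artificial_store = np.vstack([artificial_store, misclassified_store[idx] + jitter])
        # Component (3): retrain the classifier on real plus artificial data
        # (label 0 = majority, label 1 = minority).
        X = np.vstack([X_maj, X_min, artificial_store])
        y = np.concatenate([
            np.zeros(len(X_maj)),
            np.ones(len(X_min) + len(artificial_store)),
        ])
        w = fit_logistic(X, y)
        # Refresh component (4) with the real minority points still misclassified.
        wrong = predict(w, X_min) == 0
        if not wrong.any():
            break  # nothing left to focus on
        misclassified_store = X_min[wrong]
    return w, misclassified_store, artificial_store
```

The classifier returned by the loop is the final model, which mirrors the single learning process described above; in the actual method the jitter step would be replaced by adversarial training of the generator and discriminator.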

Full Text