Abstract

Class imbalance is a critical issue in customer classification, and many techniques have been proposed in the literature to address it. In particular, generative adversarial network (GAN)-based oversampling can learn the data distribution of minority class samples and generate new samples from it, and this approach has demonstrated an outstanding ability to address class imbalance. However, GAN-based oversampling suffers from class overlap. To address this, we propose a novel GAN-based hybrid sampling method. The new approach first uses GAN-based oversampling to generate an initial balanced dataset and then applies a novel adaptive neighborhood-based weighted undersampling method to remove generated instances and original majority class instances. This approach not only produces instances that fit the actual data distribution but also significantly reduces the influence of class overlap. Experimental results on artificial data and real-world customer datasets show that the proposed GAN-based hybrid sampling method outperforms the benchmark methods on both accuracy-based and profit-based evaluation metrics.
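
The two-stage idea described above (oversample the minority class, then remove generated and majority instances) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the GAN architecture or the adaptive weighting, so the sketch assumes a fixed neighborhood size `k` and a fixed removal `threshold`, and it uses a hypothetical `gaussian_stand_in` generator in place of a trained GAN. All function and parameter names are illustrative.

```python
import numpy as np


def knn_opposite_fraction(X, y, k=5):
    """Per row: fraction of its k nearest neighbours that carry a different label."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    return (y[nearest] != y[:, None]).mean(axis=1)


def gaussian_stand_in(X_min, n_new, seed=0):
    """Placeholder for a trained GAN generator (assumption): samples from a
    Gaussian fitted to the minority class."""
    rng = np.random.default_rng(seed)
    mean = X_min.mean(axis=0)
    cov = np.cov(X_min, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n_new)


def hybrid_sample(X, y, generate_minority=gaussian_stand_in, k=5, threshold=0.5):
    """Oversample to balance, then drop removable points in overlapping regions.

    Removable points are generated instances and original majority instances;
    original minority instances are always kept.
    """
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()

    # Stage 1: oversampling until both classes have the same size.
    X_new = generate_minority(X[y == minority], n_new)
    X_bal = np.vstack([X, X_new])
    y_bal = np.concatenate([y, np.full(n_new, minority)])
    is_generated = np.concatenate([np.zeros(len(y), bool), np.ones(n_new, bool)])

    # Stage 2: neighbourhood-based undersampling (fixed k and threshold here,
    # a simplification of the adaptive weighting named in the abstract).
    removable = is_generated | (y_bal != minority)
    overlap = knn_opposite_fraction(X_bal, y_bal, k)
    keep = ~(removable & (overlap > threshold))
    return X_bal[keep], y_bal[keep]
```

Calling `hybrid_sample(X, y)` on a two-class feature matrix `X` and label vector `y` returns a resampled dataset in which every original minority instance is retained. Swapping `generate_minority` for a function backed by a trained GAN generator would bring the sketch closer to the pipeline the abstract describes.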
