Abstract

Data quality is essential to a successful machine-learning project, and class imbalance is one of the biggest data-quality challenges across real-world application domains. This paper proposes a new framework for oversampling credit data that combines two deep learning techniques: autoencoders and generative adversarial networks. A trivial autoencoder (TAE) is used to change the data representation, and a modified generative adversarial network (GAN) is used to create new instances from random noise. Experiments on three different datasets show that the same classifier achieves a higher area under the receiver operating characteristic curve (AUC) on datasets augmented by the proposed framework than on datasets oversampled by other techniques. The results also show that datasets balanced by the new framework shift the classifier's distribution of prediction errors, significantly reducing false negatives, which are the more expensive type of misclassification in imbalanced learning. The improvements are significant and, given the change in error distribution, the proposed technique is an excellent complement to existing oversampling techniques.
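The two evaluation quantities named above, AUC and the number of false negatives, can be illustrated with a minimal sketch. This is not the paper's evaluation code: the function names `roc_auc` and `error_counts`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, with label 1 standing for the minority (positive) class.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive is scored above
    a random negative (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def error_counts(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[int, int]:
    """Return (false_negatives, false_positives) at the given threshold.
    The 0.5 default is an assumption, not a value from the paper."""
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return fn, fp


# Hypothetical toy data, not results from the paper.
y_true = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, scores)
fn, fp = error_counts(y_true, scores)
print(f"AUC={auc:.3f}, false negatives={fn}, false positives={fp}")
```

Running the same two functions on one classifier's scores, once for each oversampling method, is one way to compare both the AUC and the shift between false negatives and false positives.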
