Abstract

In this digital world, numerous credit card-based transactions take place all over the world. Concomitantly, gaps in process flows and technology result in many fraudulent transactions. Owing to the spurt in the number of reported fraudulent transactions, customers and credit card service providers incur significant financial and reputation losses respectively. Therefore, building a powerful fraud detection system is paramount. It is noteworthy that fraud detection datasets, by nature, are highly unbalanced. Consequently, almost all of the supervised classifiers, when built on the unbalanced datasets, yield high false negative rates. But, the extant oversampling methods while reducing the false negatives, increase the false positives. In this paper, we propose a novel data oversampling method using Generative Adversarial Network (GAN). We use GAN and its variant to generate synthetic data of fraudulent transactions. To evaluate the effectiveness of the proposed method, we employ machine learning classifiers on the data balanced by GAN. Our proposed GAN-based oversampling method simultaneously achieved high precision, F1-score and dramatic reduction in the count of false positives compared to the state-of-the-art synthetic data generation based oversampling methods such as Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN) and random oversampling. Moreover, an ablation study involving the oversampling based on the ensemble of SMOTE and GAN/WGAN generated datasets indicated that it is outperformed by the proposed methods in terms of F1 score and false positive count.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call