Abstract
Imbalanced classification has become a major issue in many fields, especially fraud detection. The transaction datasets available for training are extremely imbalanced, with fraudulent transaction logs heavily underrepresented. Moreover, because fraud samples are so few, feature learning methods cannot directly mine the discriminative feature information they contain. Both problems significantly reduce the effectiveness of fraud detection models. In this paper, we propose a Dual Autoencoders Generative Adversarial Network, which balances the majority and minority classes and learns feature representations of normal and fraudulent transactions to improve the accuracy of fraud detection. The model first trains a Generative Adversarial Network to output enough mimicked fraudulent transactions for autoencoder training. Then, two autoencoders are trained on the normal and fraud datasets, respectively. Each sample is encoded by both autoencoders to obtain two sets of features, which are combined to form the dual autoencoding features. Finally, the model detects fraudulent transactions with a neural network trained on the augmented training set. Experimental results show that the model outperforms a set of well-known classification methods, with sensitivity and precision in particular effectively improved.
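The feature-combination step described in the abstract can be illustrated with a minimal sketch. It assumes the two sets of features are joined by simple concatenation, which the abstract does not specify. The encoders here are hypothetical stand-ins (fixed random linear maps followed by tanh), not the trained autoencoders of the paper, and the names `make_encoder` and `dual_features` as well as all dimensions are illustrative only.

```python
import numpy as np

def make_encoder(in_dim, code_dim, seed):
    """Hypothetical stand-in for a trained encoder: a fixed linear map plus tanh."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(scale=0.1, size=(in_dim, code_dim))
    return lambda x: np.tanh(x @ weights)

def dual_features(x, encode_normal, encode_fraud):
    """Encode each sample with both encoders and concatenate the two codes.

    Concatenation is an assumption; the source only says the features are "combined".
    """
    return np.concatenate([encode_normal(x), encode_fraud(x)], axis=1)

# Toy batch: 5 transactions with 30 features each (sizes are arbitrary).
x = np.random.default_rng(0).normal(size=(5, 30))
enc_n = make_encoder(30, 8, seed=1)  # would be trained on normal transactions
enc_f = make_encoder(30, 8, seed=2)  # would be trained on the augmented fraud set
z = dual_features(x, enc_n, enc_f)
print(z.shape)  # (5, 16)
```

Under this assumption, every transaction is described by both a "normal" code and a "fraud" code, and the downstream classifier sees the two side by side.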
Highlights
With the continuous increase of online credit card transactions, fraudulent transactions are growing in number, bringing great losses to banks, merchants, and cardholders.
To make full use of the information in the dataset's samples and alleviate the class-imbalance problem, we propose the Dual Autoencoders Generative Adversarial Network (DAEGAN).
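To address the problem that the autoencoder cannot completely fit the fraud sample data, we propose to train the autoencoder AE_f on the augmented fraud training set x_f, which contains the real fraud samples and the fake fraud samples generated by the first WGAN: $AE_f = \arg\min_{\theta} \mathcal{L}\big(x_f,\, g_{AE_f}(f_{AE_f}(x_f))\big)$, where $f_{AE_f}$ and $g_{AE_f}$ are the encoder and decoder of AE_f, $\theta$ their parameters, and $\mathcal{L}$ a reconstruction loss.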
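The training objective in the last highlight can be sketched as follows. This is a toy illustration under loud assumptions, not the paper's implementation: the autoencoder is linear, the reconstruction loss is mean squared error, the optimizer is plain gradient descent, and the "fake" fraud samples are random noise standing in for WGAN output. All names (`train_autoencoder`, `x_real`, `x_fake`) and sizes are hypothetical.

```python
import numpy as np

def train_autoencoder(x, code_dim=4, lr=0.1, epochs=300, seed=0):
    """Fit a linear autoencoder by gradient descent on the MSE reconstruction loss."""
    rng = np.random.default_rng(seed)
    n_features = x.shape[1]
    w_enc = rng.normal(scale=0.1, size=(n_features, code_dim))  # encoder f
    w_dec = rng.normal(scale=0.1, size=(code_dim, n_features))  # decoder g
    losses = []
    for _ in range(epochs):
        code = x @ w_enc                 # f(x)
        x_hat = code @ w_dec             # g(f(x))
        err = x_hat - x
        losses.append(float(np.mean(err ** 2)))
        grad_out = 2.0 * err / err.size  # d(loss) / d(x_hat)
        grad_dec = code.T @ grad_out
        grad_enc = x.T @ (grad_out @ w_dec.T)
        w_dec -= lr * grad_dec
        w_enc -= lr * grad_enc
    return w_enc, w_dec, losses

rng = np.random.default_rng(42)
x_real = rng.normal(size=(20, 10))   # placeholder for real fraud samples
x_fake = rng.normal(size=(180, 10))  # placeholder for generated fraud samples
x_f = np.vstack([x_real, x_fake])    # augmented fraud training set

w_enc, w_dec, losses = train_autoencoder(x_f)
print(losses[0] > losses[-1])  # reconstruction loss should have decreased
```

In the setting described above, `x_fake` would come from the trained generator and the autoencoder would typically be a nonlinear network; the sketch only shows that the objective is an ordinary reconstruction loss evaluated on the union of real and generated fraud samples.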
Summary
With the continuous increase of online credit card transactions, fraudulent transactions are growing in number, bringing great losses to banks, merchants, and cardholders. In real fraud detection datasets, the positive and negative samples are highly imbalanced, and only an extremely small number of fraudulent transaction records are available. Such extremely imbalanced data may cause the classifier to produce biased results, because the classifier may sacrifice accuracy on the minority samples and treat them as noise [12]. To make full use of the information in the dataset's samples and alleviate the class-imbalance problem, we propose the Dual Autoencoders Generative Adversarial Network (DAEGAN). DAEGAN mines the feature information of fraud samples based on the augmented fraud dataset.