Abstract
Imbalanced datasets are a significant concern when building reliable credit card fraud (CCF) detection systems. In this work, we study and evaluate recent machine learning (ML) algorithms and deep reinforcement learning (DRL) for CCF detection on a dataset with fraud and non-fraud labels. SMOTE and ADASYN are used to resample the imbalanced CCF dataset under two resampling approaches. ML algorithms are then applied to the balanced dataset to build CCF detection systems. Next, DRL is employed to build detection systems directly on the imbalanced CCF dataset. A diverse set of classification metrics is used to thoroughly evaluate the performance of these ML and DRL models. Through empirical experiments, we assess how reliable the ML models under the two resampling approaches and the DRL models are for CCF detection. When SMOTE and ADASYN are used to resample the original CCF dataset before the training/test split, the ML models achieve very high results, above 99% accuracy. However, when these techniques are applied only to the training set, the ML models show lower results, most notably logistic regression, with 1.81% precision and a 3.55% F1 score when using ADASYN. Our work reveals that the DRL model is ineffective and achieves low performance, with only 34.8% accuracy.
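The difference between resampling before the training/test split and resampling only the training set can be illustrated with a minimal sketch. It is not the authors' pipeline and not the imbalanced-learn implementation: `smote_like` is a simplified, pure-Python SMOTE-style interpolation, and the function names, the toy data, and the 25% test fraction are all assumptions made for illustration. The point it shows is that synthetic fraud samples are generated from training rows only, so the test set keeps its original imbalance and contains no synthetic or interpolated points.

```python
import random

def smote_like(minority, n_new, k=3, rng=None):
    """Hypothetical SMOTE-style oversampler: each synthetic point lies on the
    segment between a minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Rank the other minority samples by squared Euclidean distance to `base`.
        neighbours = sorted(
            others, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base))
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def split_then_resample(X, y, test_fraction=0.25, seed=0):
    """Stratified split first, then oversample the fraud class in the training part only."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in (0, 1):
        idx = [i for i, lab in enumerate(y) if lab == label]
        rng.shuffle(idx)
        n_test = int(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]

    # Balance the training set only; the test set is left untouched.
    minority = [x for x, lab in zip(X_train, y_train) if lab == 1]
    n_new = y_train.count(0) - len(minority)
    if len(minority) >= 2 and n_new > 0:
        synthetic = smote_like(minority, n_new, rng=rng)
        X_train = X_train + synthetic
        y_train = y_train + [1] * len(synthetic)
    return X_train, y_train, X_test, y_test

# Toy data (assumed): 40 non-fraud rows and 8 fraud rows with two features each.
X = [[float(i), float(i % 7)] for i in range(40)] + [[100.0 + i, 50.0] for i in range(8)]
y = [0] * 40 + [1] * 8
X_train, y_train, X_test, y_test = split_then_resample(X, y)
print(len(X_train), y_train.count(0), y_train.count(1), len(X_test), sum(y_test))
```

If the oversampler were instead run on the full dataset before splitting, synthetic points interpolated from rows that later land in the test set could end up in the training set, which is one plausible reason for the gap between the two sets of results reported above.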
Highlights
In the fourth industrial revolution, the e-commerce platform has become the most extensive system for financial institutions
In [2], the SMOTE-edited nearest neighbor (ENN) method was found to be the best for detecting credit card fraud (CCF) compared with other classifiers among a set of oversampling approaches, and SMOTE-Tomek’s Links (TL) showed good outcomes among the set of under-sampling techniques
In order to provide various solutions to deal with the imbalanced CCF dataset, all results related to the machine learning (ML) algorithms and the deep reinforcement learning (DRL) approach based on the imbalanced CCF
Summary
In the fourth industrial revolution, the e-commerce platform has become the most extensive system for financial institutions. In the DRL approach, the agent eventually searches for an optimal classification policy over the dataset, guided by a specific reward function and a beneficial learning environment. We compare the performance of the DRL approach applied directly to the imbalanced CCF dataset with that of ML classifiers trained on the resampled CCF dataset, in order to analyze the contributions and limitations of these models for CCF detection systems. SMOTE and ADASYN are used to resample the imbalanced CCF dataset under two resampling approaches. Seven ML algorithms, i.e., KNN, LR, DT, RF, AdaBoost, XGBoost, and DNN, are applied to the balanced CCF dataset obtained from the two resampling approaches to establish the CCF detection systems. We propose suitable algorithms for dealing effectively with the imbalanced dataset in CCF detection systems.
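The summary does not state the reward function used by the DRL agent. As a purely illustrative sketch, the code below assumes one scheme that is common for imbalanced classification: decisions on fraud samples earn or cost 1, while decisions on non-fraud samples earn or cost a smaller weight `lam`, here set to the fraud/non-fraud ratio. The names `reward`, `episode_return`, and `lam`, as well as the toy sample counts, are assumptions and are not taken from the paper.

```python
def reward(action, label, lam):
    """Hypothetical reward: +/-1 for fraud samples (label 1), +/-lam for non-fraud (label 0)."""
    scale = 1.0 if label == 1 else lam
    return scale if action == label else -scale

def episode_return(policy, samples, lam):
    """Total reward a classification policy collects over one pass through the samples."""
    return sum(reward(policy(x), y, lam) for x, y in samples)

# Toy episode (assumed): 2 fraud and 98 non-fraud transactions, one feature each.
samples = [([1.0], 1)] * 2 + [([0.0], 0)] * 98
lam = 2 / 98  # imbalance ratio, used as the non-fraud reward weight

always_non_fraud = lambda x: 0      # 98% accurate, but never flags fraud
perfect = lambda x: int(x[0])       # flags exactly the fraud rows in this toy data

print(episode_return(always_non_fraud, samples, lam))
print(episode_return(perfect, samples, lam))
```

Under this assumed weighting, the policy that always predicts non-fraud collects a return of roughly zero despite its high accuracy, so the agent gains nothing from ignoring the minority class. Whether the evaluated DRL model uses a comparable weighting cannot be determined from the summary.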