Credit Card Fraud Detection under Extreme Imbalanced Data: A Comparative Study of Data-level Algorithms

Amit Singh, Ranjeet Kumar Ranjan, Abhishek Tiwari

doi:10.1080/0952813x.2021.1907795

Abstract

Credit card fraud is one of the biggest cybercrimes faced by users. Intelligent, machine-learning-based systems for detecting fraudulent transactions are very effective in real-world scenarios. However, when these systems are designed, machine learning approaches suffer from the problem of imbalanced data, i.e. an imbalanced class distribution in which fraudulent transactions are far rarer than legitimate ones. Balancing the dataset therefore becomes an imperative sub-task. An investigation of state-of-the-art approaches reveals the need for a systematic study of class imbalance handling strategies in order to design an intelligent and capable system for detecting fraudulent transactions. This work provides a comparative study of different class imbalance handling methods. To compare the effectiveness and efficiency of these methods in conjunction with state-of-the-art classification approaches, we performed an extensive experimental study. We compared the methods on several performance indicators, including Precision, Recall, K-fold Cross-validation, the AUC-ROC curve and execution time. We found that Oversampling followed by Undersampling methods perform well for ensemble classification models such as AdaBoost, XGBoost and Random Forest.
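To make the idea of oversampling followed by undersampling concrete, here is a minimal sketch in Python. The abstract does not say which specific resampling algorithms were used, so plain random resampling stands in for them here. The function name, the `over_ratio` parameter and the toy data are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def oversample_then_undersample(rows, labels, minority=1, over_ratio=0.5, seed=0):
    """Randomly oversample the minority class, then randomly undersample
    the majority class until both classes have the same size.

    Assumes a binary problem. `over_ratio` is a hypothetical knob: the
    minority class is first grown to over_ratio * (majority size).
    """
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    majority = [(r, y) for r, y in zip(rows, labels) if y != minority]
    if not minority_rows or not majority:
        raise ValueError("both classes must be present")

    # Step 1: oversample the minority class with replacement.
    target = max(len(minority_rows), int(over_ratio * len(majority)))
    extra = [rng.choice(minority_rows) for _ in range(target - len(minority_rows))]
    minority_rows = minority_rows + extra

    # Step 2: undersample the majority class without replacement.
    majority = rng.sample(majority, min(len(majority), len(minority_rows)))

    combined = [(r, minority) for r in minority_rows] + majority
    rng.shuffle(combined)
    new_rows = [r for r, _ in combined]
    new_labels = [y for _, y in combined]
    return new_rows, new_labels


# Toy data (made up): 1000 legitimate (0) and 5 fraudulent (1) transactions.
rows = [[float(i)] for i in range(1005)]
labels = [0] * 1000 + [1] * 5
new_rows, new_labels = oversample_then_undersample(rows, labels)
counts = Counter(new_labels)
print(counts)
```

In practice the resampling would be applied to the training folds only, so that duplicated minority samples do not leak into the data used for evaluation.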

Full Text