Improving Fraud Detection in An Imbalanced Class Distribution Using Different Oversampling Techniques

Raneem Qaddoura, Mariam M Biltawi

doi:10.1109/eiceeai56378.2022.10050500

Abstract

Credit card fraud detection is essential for financial institutions to avoid charging customers for items they did not purchase. Fraud detection can be implemented with machine learning (ML) by training a model on a dataset of transactions labeled as fraud or non-fraud. The datasets available for this task are usually highly imbalanced, with fraud as the minority class. The goal of this paper is therefore to conduct a comprehensive comparison of five oversampling techniques for addressing the class imbalance problem: the Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), borderline1 SMOTE, borderline2 SMOTE, and Support Vector Machine SMOTE (SVM SMOTE). The comparison is conducted by computing the geometric mean, recall, precision, and F1-score of six ML models with and without oversampling. The models are logistic regression, random forest, K-nearest neighbor, naive Bayes, support vector machine, and decision tree. Experimental results show that the oversampling techniques improve the models' performance.
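
The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `imbalance_metrics` and the confusion-matrix counts are hypothetical, and the geometric mean is assumed to be the common binary definition, the square root of recall (sensitivity) times specificity.

```python
from math import sqrt


def imbalance_metrics(tp, fp, tn, fn):
    """Precision, recall, F1, and geometric mean for the positive (fraud) class.

    Assumption: geometric mean = sqrt(recall * specificity), a common
    definition for binary imbalanced problems. Undefined ratios fall back to 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    g_mean = sqrt(recall * specificity)
    return {"precision": precision, "recall": recall, "f1": f1, "g_mean": g_mean}


# Hypothetical counts for 1000 transactions, 10 of them fraudulent.
# A model that labels everything as non-fraud is 99% accurate but finds no fraud.
baseline = imbalance_metrics(tp=0, fp=0, tn=990, fn=10)

# A hypothetical model that catches 8 of the 10 frauds with 2 false alarms.
example = imbalance_metrics(tp=8, fp=2, tn=988, fn=2)

print(baseline)
print(example)
```

Under these assumed counts, the all-negative baseline scores zero on recall, F1, and geometric mean despite its high accuracy, which is why such measures are preferred over accuracy on imbalanced data.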

Full Text