Abstract Fraud in credit card transactions causes substantial financial losses and erodes customers' trust. Fraud detection methods based on machine learning techniques can help prevent such losses. Despite the existing literature on fraud detection, algorithms that detect credit card fraud with acceptable performance are still lacking. This study therefore proposes a comprehensive ensemble-based method that uses an efficient weighted voting strategy for credit card fraud detection and addresses the weaknesses of previous algorithms. First, because the dataset is imbalanced, the proposed method balances it by stratifying it into three different proportions of normal and fraudulent transactions (1 to 1, 1 to 4 and 1 to 9 ratios). The features in each resulting dataset are ranked by four feature-ranking methods, and the Random Forest classifier is applied to each ranking to select the essential features. Next, 12 ensembles are built from seven base classifiers and the selected features. In addition, a weighted voting strategy is proposed: fraudulent transactions are detected through voting in which each base classifier and ensemble carries a weight calculated from its performance. The computational results indicate that the proposed Eclf10 is the best ensemble and that its Logistic Regression classifier performs best among the base classifiers. Eclf10 achieves 99.97% accuracy, 87.78% precision, 97.70% recall, 92.21% F1-score and 95.634% F2-score, outperforming previous ensemble-based methods (e.g., majority voting ensemble, stacking classifier, AdaBoost, Gradient Boosting).
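The weighted voting idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weighting formula, so the function name `weighted_vote`, the use of validation scores as weights, the 0.5 threshold, and all numbers below are assumptions chosen only for illustration.

```python
from typing import Dict, List


def weighted_vote(
    predictions: Dict[str, List[int]],
    weights: Dict[str, float],
    threshold: float = 0.5,
) -> List[int]:
    """Combine binary predictions (1 = fraud, 0 = normal) by weighted voting.

    Each classifier's vote counts in proportion to its weight, assumed here
    to be a performance score measured on held-out data.
    """
    total = sum(weights[name] for name in predictions)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_samples = len(next(iter(predictions.values())))
    combined = []
    for i in range(n_samples):
        # Weighted share of classifiers that flag transaction i as fraud.
        score = sum(weights[name] * preds[i] for name, preds in predictions.items()) / total
        combined.append(1 if score >= threshold else 0)
    return combined


# Hypothetical performance scores and predictions for four transactions.
weights = {"logreg": 0.95, "tree": 0.40, "knn": 0.40}
predictions = {
    "logreg": [1, 0, 1, 0],
    "tree": [0, 0, 1, 1],
    "knn": [0, 1, 1, 1],
}
result = weighted_vote(predictions, weights)
print(result)
```

With these made-up weights, the strongest classifier can outweigh the two weaker ones when they disagree with it, which is the behavior that separates weighted voting from plain majority voting.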