The Effect of Recursive Feature Elimination with Cross-Validation (RFECV) Feature Selection Algorithm toward Classifier Performance on Credit Card Fraud Detection

Adi Zaenul Mustaqim, Yoga Pristyanto, Yuli Astuti, Sumarni Adi

doi:10.1109/icaicst53116.2021.9497842

Abstract

Credit cards are among the most popular non-cash payment methods and have become part of the lifestyle of modern society. However, transactions made with credit cards are often fraudulent, which motivates fraud detection as a classification task on credit card transactions. Data mining addresses this by training a classifier on a dataset of transactions so that it can identify fraudulent ones. High-dimensional datasets, however, can cause problems with accuracy and training time during classification. One of several ways to overcome this is feature selection. This study applies feature selection to the dataset to choose the attributes that most influence the classification results, using the Recursive Feature Elimination with Cross-Validation (RFECV) algorithm. In experiments with three values of k, namely k=5, k=10, and k=15, RFECV reduced the accuracy of the Decision Tree (DT) classification algorithm. For the Naïve Bayes (NB) algorithm, in contrast, RFECV had no effect: the evaluation results were the same before and after feature selection. With RFECV, larger values of k tended to yield fewer selected attributes: k=5 and k=10 each selected 13 attributes, whereas k=15 selected 10. Applying RFECV also reduced computation time during classification with both DT and NB, because a larger value of k leaves fewer attributes for the classifier to process.
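The procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes that k denotes the number of cross-validation folds, uses scikit-learn, and substitutes a synthetic imbalanced dataset for the credit card data. It also assumes a Decision Tree as the ranking estimator inside RFECV, since Gaussian Naïve Bayes exposes neither coefficients nor feature importances. The abstract does not state which estimator the study used.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the fraud dataset (assumption): 20 features,
# roughly 10% of samples in the positive ("fraud") class.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

results = {}
for k in (5, 10, 15):
    # RFECV repeatedly drops the weakest feature and uses k-fold
    # cross-validation to decide how many features to keep.
    selector = RFECV(
        estimator=DecisionTreeClassifier(random_state=0),  # assumed ranker
        step=1,
        cv=StratifiedKFold(n_splits=k),
        scoring="accuracy",
    )
    selector.fit(X_train, y_train)
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    row = {"n_features": int(selector.n_features_)}
    classifiers = {
        "DT": lambda: DecisionTreeClassifier(random_state=0),
        "NB": lambda: GaussianNB(),
    }
    for name, make_clf in classifiers.items():
        # Accuracy with all features versus the RFECV-selected subset.
        before = make_clf().fit(X_train, y_train).predict(X_test)
        after = make_clf().fit(X_train_sel, y_train).predict(X_test_sel)
        row[name] = (
            accuracy_score(y_test, before),
            accuracy_score(y_test, after),
        )
    results[k] = row

for k, row in results.items():
    print(k, row)
```

Each entry of `results` holds the number of selected attributes and a (before, after) accuracy pair per classifier. On synthetic data the numbers will not match those reported in the study. Accuracy alone can also be misleading on imbalanced fraud data, so a fuller comparison would likely add precision, recall, or F1.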

Full Text