Abstract

This experiment compared the performance of four machine learning algorithms in detecting bank card fraud. The comparison took into account both the strong class imbalance in the training sample and the differences in transaction amounts, and assessed how well each method recognizes fraudulent behavior under these conditions. It was found that a method that scores well on standard classification metrics is not necessarily the best when judged by the magnitude of economic losses; logistic regression is a clear example of this. The results show that bank card fraud detection cannot be treated as a simple classification problem, and that AUC is not the most appropriate metric for fraud detection tasks. The final choice of model depends on the needs of the bank, that is, on which of the two types of error, false negatives (FN) or false positives (FP), leads to larger economic losses for the bank. If the bank considers the loss from classifying fraudulent transactions as regular ones to be the dominant cost, it should choose the algorithm with the lowest FN count, which in this experiment is AdaBoost. If the bank considers the negative impact of flagging regular transactions as fraudulent to be important as well, it should choose an algorithm with relatively low FN and FP counts; in this experiment, random forest performs better overall on this criterion. Furthermore, once the economic losses caused by false positives (identifying an ordinary transaction as fraudulent) have been estimated, a quantitative analysis of the economic losses caused by each algorithm can be used to select the optimal model.
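
The loss-based comparison described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a missed fraud (FN) costs the full transaction amount and that each false alarm (FP) costs a fixed amount `fp_cost`; the abstract does not state the paper's actual cost model. The function name `economic_loss`, the toy data, and both sets of predictions are hypothetical and are not results from the experiment.

```python
from typing import Sequence


def economic_loss(y_true: Sequence[int], y_pred: Sequence[int],
                  amounts: Sequence[float], fp_cost: float) -> dict:
    """Count FN/FP and estimate total loss under an assumed cost model.

    Assumption: a false negative loses the whole transaction amount,
    a false positive costs a fixed fp_cost (e.g. handling a blocked card).
    """
    fn = fp = 0
    fn_loss = 0.0
    for truth, pred, amount in zip(y_true, y_pred, amounts):
        if truth == 1 and pred == 0:      # fraud passed as regular
            fn += 1
            fn_loss += amount
        elif truth == 0 and pred == 1:    # regular flagged as fraud
            fp += 1
    return {"FN": fn, "FP": fp, "loss": fn_loss + fp * fp_cost}


# Toy data invented for illustration: 1 = fraud, 0 = regular.
y_true = [1, 0, 0, 1, 0, 0]
amounts = [500.0, 20.0, 35.0, 40.0, 15.0, 60.0]

# Hypothetical predictions from two models.
model_a = [1, 1, 1, 1, 0, 0]  # catches all fraud, raises two false alarms
model_b = [0, 0, 0, 1, 0, 0]  # no false alarms, misses the large fraud

loss_a = economic_loss(y_true, model_a, amounts, fp_cost=10.0)
loss_b = economic_loss(y_true, model_b, amounts, fp_cost=10.0)
print(loss_a)
print(loss_b)
```

In this toy case the second model makes fewer errors in total (one against two) yet incurs the larger loss, because its single miss is a high-value transaction. Raising `fp_cost` shifts the balance the other way, which is why the choice of model depends on how the bank prices each type of error.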
