Abstract

Credit cards are increasingly popular in today's society, especially given the prevalence of electronic payments. Electronic payments are available through bank apps or payment processors such as PayPal and Alipay, and people increasingly pay without cash both online and offline. Alongside these gains in efficiency and convenience, credit card fraud has emerged, causing substantial economic losses for cardholders and considerable trouble for banks. The primary goal of this work is to identify fraudulent transactions in an imbalanced dataset. The dataset comprises credit card transactions made in Europe over just two days in 2013. This study compares three setups: the original data, the original data with stratified k-fold cross-validation, and undersampled data. It finds that undersampling, although it reduces accuracy by a small amount, can greatly improve the detection of fraudulent transactions. The study also compares several models: Logistic Regression and a set of tree-based methods. It analyzes their confusion matrices, ROC curves, and Precision-Recall curves. The results show that, on the undersampled dataset, XGBoost reaches a recall, precision, F1 score, and accuracy of 93.2%, 97%, 95%, and 95%, respectively, and an AUC of 99% for both the ROC curve and the Precision-Recall curve, so this study concludes that XGBoost is the best performer. Effective detection algorithms can help reduce information leakage and financial loss in practice.
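The two ideas the abstract relies on, random undersampling and confusion-matrix metrics, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the study's implementation: the function names `undersample` and `confusion_metrics` are hypothetical, the 1:1 class ratio is an assumed choice, and the data is a synthetic stand-in (1,000 legitimate and 10 fraudulent labels) rather than the 2013 European dataset.

```python
import random


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Assumption: binary labels, 1 = fraud, 0 = legitimate, balanced to 1:1.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority sample.
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]


def confusion_metrics(y_true, y_pred):
    """Compute precision, recall, F1, and accuracy from binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Synthetic stand-in data: 1,000 legitimate and 10 fraudulent transactions.
labels = [0] * 1000 + [1] * 10
rows = list(range(len(labels)))  # placeholder feature rows

# A baseline that labels every transaction as legitimate.
baseline = confusion_metrics(labels, [0] * len(labels))
print(baseline["accuracy"], baseline["recall"])

# After undersampling, the two classes are the same size.
balanced_rows, balanced_labels = undersample(rows, labels)
print(len(balanced_labels), sum(balanced_labels))
```

On this toy data the always-legitimate baseline scores about 99% accuracy while catching no fraud at all (recall of 0), which illustrates why accuracy alone says little on heavily imbalanced data and why recall, precision, and F1 are reported alongside it.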

