Abstract

Machine learning is an innovative and efficient tool for preventing credit card fraud. However, given the variety of machine learning models, it is difficult to determine which model is most suitable for predicting fraudulent transactions. This research adopts a comprehensive evaluation method to compare the performance of different machine learning models. More precisely, it uses the Area Under the ROC Curve (AUC) metric to evaluate and compare four machine learning models on the same transaction dataset: K-Nearest Neighbors, Logistic Regression, Random Forest, and Support Vector Machine. A dataset containing over one million credit card transactions is preprocessed and divided into training data and testing data. The same training data are then used to fit the four models, and each model is tested against the same testing data. After hyperparameter tuning, the AUC score of each model is obtained and compared. The comparison indicates that Random Forest makes the most accurate and consistent predictions of fraudulent transactions in this dataset, and it can therefore be recommended as the primary machine learning algorithm for preventing credit card fraud.
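
As a rough illustration of the comparison step, the sketch below computes AUC directly from its rank interpretation: the probability that a randomly chosen fraudulent transaction receives a higher score than a randomly chosen legitimate one, with ties counted as one half. It is not the study's implementation. The function name `auc_score`, the model names `model_a` and `model_b`, and all labels and scores are hypothetical values chosen only to show how models scored on the same test set could be ranked.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0 (legitimate) or 1 (fraud).
    scores: predicted fraud scores, higher means more suspicious.
    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical test labels: fraud (1) is rare, as in typical transaction data.
labels = [0, 0, 0, 0, 0, 0, 1, 1]

# Hypothetical scores from two models evaluated on the same test set.
model_scores = {
    "model_a": [0.1, 0.2, 0.3, 0.1, 0.4, 0.2, 0.9, 0.8],
    "model_b": [0.1, 0.6, 0.3, 0.1, 0.4, 0.2, 0.5, 0.8],
}

# Score every model against the same labels, then rank by AUC (highest first).
results = {name: auc_score(labels, s) for name, s in model_scores.items()}
ranking = sorted(results, key=results.get, reverse=True)
best = ranking[0]

for name in ranking:
    print(f"{name}: AUC = {results[name]:.3f}")
```

The pairwise loop is quadratic in the number of transactions, so for a dataset of over one million rows a sort-based or library implementation would be the practical choice. The ranking logic stays the same.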
