Abstract

Over the last few decades, credit card fraud (CCF) has been a severe problem for both cardholders and card providers. As internet technology advances, credit card transactions are expanding rapidly and increasingly take place online. With this growth in credit card usage, rising fraud rates have become a problem for the economy. However, credit card datasets are highly imbalanced and skewed, so the many classification techniques used to separate fraud from non-fraud transactions may not produce the best results under such conditions. To overcome the class imbalance problem, different sampling techniques are applied to the credit card dataset: under-sampling, over-sampling, the Synthetic Minority Oversampling Technique (SMOTE), and Adaptive Synthetic sampling (ADASYN). The sampled datasets are then classified using different machine learning techniques: Decision Tree, Random Forest, K-Nearest Neighbors, Logistic Regression, and Naive Bayes. Recall, F1-score, accuracy, precision, and error rate are used to evaluate model performance. The Logistic Regression model achieved the highest result, 99.94%, after under-sampling, and the Random Forest model achieved the highest result, 99.964%, after over-sampling.
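
To illustrate why class imbalance matters for the evaluation described above, the following is a minimal sketch in plain Python, not the authors' implementation. It shows random under-sampling (only one of the sampling techniques listed) and the five evaluation metrics computed from confusion-matrix counts. The function names `random_undersample` and `evaluate`, the toy data, and the choice of label 1 for fraud are all assumptions made for illustration.

```python
import random


def random_undersample(rows, labels, seed=0):
    """Randomly keep, for every class, as many rows as the rarest class has.

    Hypothetical helper for illustration; the paper does not specify
    how its under-sampling was implemented.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, error rate, precision, recall and F1 for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (assumed): 5 fraud rows (label 1) among 1000 transactions.
labels = [1] * 5 + [0] * 995
rows = list(range(1000))

# A classifier that always predicts "non-fraud" looks accurate but finds no fraud.
always_legit = evaluate(labels, [0] * 1000)

# After under-sampling, both classes are equally represented.
balanced_rows, balanced_labels = random_undersample(rows, labels)
```

On the toy data, the always-non-fraud predictor scores high accuracy while its recall is zero, which is why recall, precision, and F1-score are reported alongside accuracy for imbalanced datasets.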
