Abstract

Financial fraud is a threat which is increasing on a greater pace and has a very bad impact over the economy, collaborative institutions and administration. Credit card transactions are increasing faster because of the advancement in internet technology which leads to high dependence over internet. With the up-gradation of technology and increase in usage of credit cards, fraud rates become challenge for economy. With inclusion of new security features in credit card transactions the fraudsters are also developing new patterns or loopholes to chase the transactions. As a result of which behavior of frauds and normal transactions change constantly. Also the problem with the credit card data is that it is highly skewed which leads to inefficient prediction of fraudulent transactions. In order to achieve the better result, imbalanced or skewed data is pre-processed with the re-sampling (over-sampling or under sampling) technique for better results. The three different proportions of datasets were used in this study and random under-sampling technique was used for skewed dataset. This work uses the three machine learning algorithms namely: logistic regression, Naive Bayes and K-nearest neighbour. The performance of these algorithms is recorded with their comparative analysis. The work is implemented in python and the performance of the algorithms is measured based on accuracy, sensitivity, specificity, precision, F-measure and area under curve. On the basis these measurements logistic regression based model for prediction of fraudulent was found to be a better in comparison to other prediction models developed from Naive Bayes and K-nearest neighbour. Better results are also seen by applying under sampling techniques over the data before developing the prediction model.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call