Using Outlier Modification Rule for Improvement of the Performance of Classification Algorithms in the Case of Financial Data

Md Rabiul Auwul, Fahmida Tasnim Dhonno, Nusrat Afrin Shilpa, Ashrafuzzaman Sohag, Md Ajijul Hakim, Mohammad Zoynul Abedin

doi:10.1007/978-3-031-18552-6_5

Abstract

This study aims to improve the performance of Data Analytics (DA) algorithms by mining outliers from credit card fraud detection datasets. To that end, we analyze the performance of four DA algorithms, Linear Discriminant Analysis (LDA), k-Nearest Neighbor (k-NN), Naïve Bayes (NB), and Support Vector Machine (SVM), comparing their results on the original and modified datasets, that is, in the presence and absence of outliers. To generate the modified dataset, this chapter proposes an outlier mining method based on the Median (MED) and the Median Absolute Deviation (MAD). The DA algorithms are evaluated using accuracy, sensitivity, specificity, detection rate, misclassification error rate, AUC, and pAUC. Empirical findings show that the DA algorithms perform better on the modified dataset than on the original data, for both the simulated dataset and the real-life credit card datasets. This study offers new insights to financial decision makers and stakeholders in the credit card industry.

Keywords: Financial data, Classification, Outlier detection, Modification
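The abstract names a MED/MAD-based outlier modification rule without stating it. As a rough illustration only, the following sketch assumes one common form of such a rule: a value is flagged when its absolute deviation from the median, divided by the scaled MAD, exceeds a cutoff, and flagged values are replaced by the median. The function name `modify_outliers`, the cutoff of 3, the scaling constant 1.4826, and the replace-by-median step are all assumptions made for this example and may differ from the rule the chapter actually proposes.

```python
from statistics import median


def modify_outliers(values, threshold=3.0):
    """Replace MED/MAD-flagged outliers in one feature with the median.

    Hypothetical helper: the cutoff, the 1.4826 scaling constant, and the
    replacement value are assumptions, not the chapter's stated rule.
    """
    values = list(values)
    med = median(values)
    # Median Absolute Deviation: median of absolute deviations from the median.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values equal the median; the rule cannot
        # scale deviations, so the data is returned unchanged.
        return values
    # 1.4826 makes the MAD comparable to a standard deviation
    # under normally distributed data.
    scaled_mad = 1.4826 * mad
    return [med if abs(v - med) / scaled_mad > threshold else v for v in values]


amounts = [10.0, 11.0, 9.0, 10.5, 9.5, 250.0]
print(modify_outliers(amounts))
```

In this toy input, only the value 250.0 lies far enough from the median to be flagged. Applied per feature, a step like this would produce the "modified" dataset on which classifiers such as LDA, k-NN, NB, and SVM could then be compared against the original data.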

Full Text