A Neural Network Ensemble With Feature Engineering for Improved Credit Card Fraud Detection

Ebenezer Esenogho, Theo G Swart, George Obaido, Kehinde Aruleba, Ibomoiye Domor Mienye

doi:10.1109/access.2022.3148298

Open Access

https://doi.org/10.1109/access.2022.3148298

Abstract

Recent advancements in electronic commerce and communication systems have significantly increased the use of credit cards for both online and regular transactions. However, there has been a steady rise in fraudulent credit card transactions, costing financial companies huge losses every year. The development of effective fraud detection algorithms is vital in minimizing these losses, but it is challenging because most credit card datasets are highly imbalanced. Also, using conventional machine learning algorithms for credit card fraud detection is inefficient due to their design, which involves a static mapping of the input vector to output vectors. Therefore, they cannot adapt to the dynamic shopping behavior of credit card clients. This paper proposes an efficient approach to detect credit card fraud using a neural network ensemble classifier and a hybrid data resampling method. The ensemble classifier is obtained using a long short-term memory (LSTM) neural network as the base learner in the adaptive boosting (AdaBoost) technique. Meanwhile, the hybrid resampling is achieved using the synthetic minority oversampling technique and edited nearest neighbor (SMOTE-ENN) method. The effectiveness of the proposed method is demonstrated using publicly available real-world credit card transaction datasets. The performance of the proposed approach is benchmarked against the following algorithms: support vector machine (SVM), multilayer perceptron (MLP), decision tree, traditional AdaBoost, and LSTM. The experimental results show that the classifiers performed better when trained with the resampled data, and the proposed LSTM ensemble outperformed the other algorithms by obtaining a sensitivity and specificity of 0.996 and 0.998, respectively.
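The two metrics quoted at the end of the abstract can be illustrated with a minimal sketch. It assumes binary labels with 1 for fraud and 0 for legitimate; the function name `sensitivity_specificity` and the toy labels are hypothetical and are not taken from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels where 1 = fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate flagged
    # Sensitivity: share of fraudulent transactions that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of legitimate transactions that were left alone.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels for illustration only (not the paper's data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Reporting both values matters on imbalanced data, because a model can reach high specificity while still missing most of the rare fraudulent cases.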

Highlights

In the past decade, there has been a rise in e-commerce, which has increased credit card utilization significantly
The proposed long short-term memory (LSTM) ensemble is benchmarked against some classifiers, including the support vector machine (SVM), multilayer perceptron (MLP), decision tree, LSTM, and the traditional AdaBoost
We performed experiments using the original and resampled datasets to demonstrate the impact of the synthetic minority oversampling technique (SMOTE)-edited nearest neighbor (ENN) resampling technique on the performance of the various classifiers
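
The idea behind SMOTE-ENN resampling can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the helper names (`k_nearest`, `smote`, `enn`), the choice of `k=3`, the majority-vote cleaning rule, and the toy points are all hypothetical. SMOTE creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours; ENN then removes samples whose label disagrees with most of their nearest neighbours.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Indices of the k candidates closest to point (Euclidean distance)."""
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(point, candidates[i]))
    return order[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points; needs at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        neighbour = others[rng.choice(k_nearest(base, others, k))]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def enn(X, y, k=3):
    """Keep only samples whose label matches most of their k nearest neighbours."""
    keep_X, keep_y = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        others_X = X[:i] + X[i + 1:]
        others_y = y[:i] + y[i + 1:]
        votes = [others_y[j] for j in k_nearest(x, others_X, k)]
        if votes.count(label) * 2 > len(votes):
            keep_X.append(x)
            keep_y.append(label)
    return keep_X, keep_y


# Toy 2-D data (hypothetical): label 0 = legitimate, label 1 = fraud.
# The last legitimate point sits inside the fraud cluster and acts as noise.
legit = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
         [0.1, 0.0], [0.0, 0.3], [2.1, 2.0]]
fraud = [[2.0, 2.0], [2.2, 2.1]]

new_fraud = smote(fraud, n_new=len(legit) - len(fraud))  # balance the classes
X = legit + fraud + new_fraud
y = [0] * len(legit) + [1] * (len(fraud) + len(new_fraud))
X_res, y_res = enn(X, y)  # cleaning step drops the noisy legitimate point
print(len(X), "->", len(X_res), "samples after ENN")
```

In practice a maintained library implementation would be preferable to this sketch; the code is only meant to show the oversample-then-clean order of the two steps.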

Summary

Introduction

In the past decade, there has been a rise in e-commerce, which has increased credit card utilization significantly. A recent report showed that about 27.85 billion dollars were lost to credit card fraud in 2018, a 16.2% increase compared to the 23.97 billion dollars lost in 2017, and the figure is estimated to reach 35 billion dollars by 2023 [2]. These losses can be reduced through efficient fraud monitoring and prevention.

The credit card dataset was prepared by the Université Libre de Bruxelles (ULB) Machine Learning Group on big data mining and fraud detection [9]. The dataset is imbalanced, with only 492 fraudulent transactions out of 284,807. The attribute “Class” is the dependent variable, and it has a value of 1 for fraudulent transactions and 0 for legitimate transactions.
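
The degree of imbalance follows directly from the two counts quoted above. The short calculation below uses only those counts; the variable names are illustrative.

```python
TOTAL_TRANSACTIONS = 284_807  # all transactions in the dataset
FRAUD_TRANSACTIONS = 492      # transactions with Class == 1

legit_transactions = TOTAL_TRANSACTIONS - FRAUD_TRANSACTIONS
fraud_share = FRAUD_TRANSACTIONS / TOTAL_TRANSACTIONS
imbalance_ratio = legit_transactions / FRAUD_TRANSACTIONS

# Accuracy of a trivial model that labels every transaction as legitimate.
trivial_accuracy = legit_transactions / TOTAL_TRANSACTIONS

print(f"fraud share: {fraud_share:.2%}")
print(f"legitimate transactions per fraud: {imbalance_ratio:.0f}")
print(f"accuracy of always predicting 'legitimate': {trivial_accuracy:.2%}")
```

Fraud makes up roughly 0.17% of the records, so a model that never flags anything would still score about 99.8% accuracy. This is why resampling and class-sensitive metrics such as sensitivity are relevant for this dataset.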

Methods

Results

Conclusion