Risk prediction model of bank telecommunication fraud based on XGBoost

Siyuan Wu, Derong Yang, Wenjun Ge, Baoqin Chen

doi:10.1117/12.2681646

Abstract

The digital economy is booming, but cybercrime and telecommunication fraud continue to emerge. Detecting fraudulent behaviour and preventing such crimes is a significant challenge. This paper conducts data mining and analysis on a bank card telecommunication fraud data set. First, data mining and feature engineering are performed on the given data set: data integrity is analyzed, overall statistics are computed, the data are standardized with the Z-score method, feature correlation is explored with the Pearson correlation coefficient, the data set is balanced with the SMOTE method, and finally the data are divided into a training set and a test set. Subsequently, four machine learning classification models (logistic regression, KNN, decision tree and XGBoost) are established to make a preliminary prediction and classification of fraudulent behaviour. To further mine the data set, the optimal configuration of each of the four models is obtained by grid-search tuning and cross-validation. In the experiments, the logistic regression, KNN, decision tree and XGBoost classification models reach prediction accuracies on the test set of 93.45%, 99.85%, 99.92% and 99.94%, respectively, which preliminarily suggests that the XGBoost and decision tree models have excellent classification capabilities. The four optimal models are then evaluated further on the test set using three performance indicators: prediction accuracy, recall and F1 value. Comparative analysis shows that the XGBoost classification model performs best. Owing to its classification ability, strong generalization ability and robustness, it is selected as the final bank card telecommunication fraud prediction model. In addition, the P-R and ROC curves of the classification results are drawn to present model performance intuitively, and this analysis further shows that XGBoost has better generalization ability.
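The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' code: the function names, the toy inputs and the default `k` are assumptions made for illustration, and `smote_like` is a simplified SMOTE-style interpolation, not a reference implementation. Precision is computed alongside accuracy, recall and F1 only because F1 is defined from it.

```python
import math
import random


def z_score(values):
    """Standardize one feature column to zero mean and unit (population) variance."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length, non-constant columns."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = fraud, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy values only; they are not taken from the paper's data set.
    print(z_score([1.0, 2.0, 3.0]))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(smote_like([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]], n_new=3))
    print(classification_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0]))
```

On a heavily imbalanced fraud data set, accuracy alone can look high even when most fraud cases are missed, which is one reason to report recall and F1 next to it.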


