Abstract
In credit loan fraud detection research, the isolation forest algorithm has attracted considerable attention because it processes large-scale data sets efficiently. On high-dimensional data, however, its performance degrades easily, which biases the detection results. To address this problem, this paper proposes an isolation forest anomaly detection algorithm based on multi-level subspace division. First, the random forest algorithm is used to evaluate the importance of each feature; the data are divided into subspaces according to feature importance, and each subspace is assigned a corresponding weight. Then, the isolation forest algorithm is applied in each subspace to obtain a per-subspace anomaly score. Finally, the anomaly scores and weights of the subspaces are combined to obtain the final anomaly detection score. To evaluate its effectiveness, the proposed algorithm is compared with four other algorithms on a credit loan fraud data set. The results show that the proposed algorithm achieves a higher AUC, accuracy, recall, and F1 score than the comparison algorithms.
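The three steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify how features are grouped, how subspace weights are computed, or how scores are combined. The sketch therefore assumes equal-sized groups of importance-ranked features, weights equal to each group's share of total importance, and a weighted sum of scores. The function name `subspace_isolation_forest_scores`, the parameter `n_subspaces`, and the synthetic data are hypothetical; scikit-learn and NumPy are assumed to be available.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest, RandomForestClassifier


def subspace_isolation_forest_scores(X, y, n_subspaces=3, random_state=0):
    """Return (scores, groups, weights); a higher score means more anomalous."""
    n_features = X.shape[1]
    n_subspaces = min(n_subspaces, n_features)  # avoid empty subspaces

    # Step 1: rank features by random forest importance (uses the labels y).
    rf = RandomForestClassifier(n_estimators=100, random_state=random_state)
    rf.fit(X, y)
    importances = rf.feature_importances_
    ranked = np.argsort(importances)[::-1]  # most important feature first

    # Assumption: split the ranked features into near-equal contiguous groups.
    groups = np.array_split(ranked, n_subspaces)

    # Assumption: a subspace's weight is its share of the total importance.
    weights = np.array([importances[g].sum() for g in groups])
    weights = weights / weights.sum()

    # Steps 2 and 3: score each subspace, then combine by weighted sum.
    scores = np.zeros(X.shape[0])
    for group, weight in zip(groups, weights):
        iso = IsolationForest(n_estimators=100, random_state=random_state)
        iso.fit(X[:, group])
        # score_samples is lower for anomalies, so negate it.
        scores += weight * (-iso.score_samples(X[:, group]))
    return scores, groups, weights


# Synthetic, imbalanced stand-in for a credit loan fraud data set.
X, y = make_classification(
    n_samples=500, n_features=12, n_informative=4,
    weights=[0.95, 0.05], random_state=0,
)
scores, groups, weights = subspace_isolation_forest_scores(X, y)
top_suspects = np.argsort(scores)[::-1][:10]  # indices of the 10 highest scores
```

Because the weights sum to one and each negated isolation forest score lies in (0, 1], the combined score stays on the same scale as a single isolation forest score, so a single threshold can be applied to it.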