Abstract

Digitalization has introduced many emerging risks in internet finance and related fields. Fraudulent behavior in credit data, for example, can be treated as outliers, so detecting outliers is of considerable practical importance. However, real data sets are high-dimensional and contain very few outliers, so most anomaly detection algorithms based directly on clustering perform poorly. A method is therefore needed that can detect anomalies effectively in imbalanced credit data sets in high-dimensional space. This paper detects outliers in credit data using undersampling based on sparse subspace clustering: sparse subspace clustering is applied to the high-dimensional, imbalanced credit data set, the clustering results are used as the undersampling criterion to construct a balanced data set, and a classifier is then trained on the balanced set to detect outliers. Comparative experiments verify the effectiveness of the proposed algorithm for outlier detection in credit data and indicate that it makes up for shortcomings of traditional clustering and high-dimensional outlier detection algorithms.
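The undersampling step described above can be illustrated with a minimal sketch. The abstract does not specify how the clustering results drive the sampling, so the rule used here (keep a share of each cluster proportional to its size, with at least one sample per cluster) is an assumption, as are the function name `undersample_by_cluster` and the toy labels. The sparse subspace clustering itself is not implemented; the cluster labels are taken as given.

```python
import random
from collections import defaultdict


def undersample_by_cluster(majority_ids, cluster_labels, n_target, seed=0):
    """Keep roughly n_target majority samples, spread across clusters.

    majority_ids   : identifiers of majority-class (normal) samples
    cluster_labels : cluster label for each entry of majority_ids, e.g. from
                     a sparse subspace clustering step (not implemented here)
    n_target       : desired number of majority samples to keep, typically
                     the number of minority (outlier) samples
    Hypothetical helper: the proportional-quota rule is an assumption,
    not the paper's stated procedure.
    """
    if len(majority_ids) != len(cluster_labels):
        raise ValueError("majority_ids and cluster_labels must align")
    rng = random.Random(seed)

    # Group sample ids by the cluster they were assigned to.
    clusters = defaultdict(list)
    for sample_id, label in zip(majority_ids, cluster_labels):
        clusters[label].append(sample_id)

    total = len(majority_ids)
    kept = []
    for label in sorted(clusters):
        members = clusters[label]
        # Quota proportional to cluster size, at least one per cluster.
        quota = max(1, round(n_target * len(members) / total))
        kept.extend(rng.sample(members, min(quota, len(members))))
    return kept


# Toy example: 100 normal samples in three clusters, 10 outliers.
majority_ids = list(range(100))
cluster_labels = [0] * 60 + [1] * 30 + [2] * 10
minority_ids = list(range(100, 110))

kept = undersample_by_cluster(majority_ids, cluster_labels,
                              n_target=len(minority_ids))
balanced_ids = kept + minority_ids  # input for the downstream classifier
```

Because each quota is rounded and every cluster keeps at least one sample, the number of retained majority samples can differ slightly from `n_target` when clusters are many or very uneven. Sampling within clusters, rather than uniformly over the whole majority class, is meant to preserve the structure of the normal data that the clustering uncovered.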
