An Efficient Method for Boundary Points Detection Based on Data Expression

Pinjie Li, Lingyun Yang, Tao Zhang, Huangang Wang

doi:10.23919/chicc.2019.8866056

Abstract

Clustering analysis is a core technique in data mining. On massive data sets, however, traditional clustering algorithms often struggle to produce good clusterings. A more effective approach at present is to first detect the cluster boundaries with a boundary point detection algorithm and then combine the result with a clustering algorithm. In this paper, we propose a cluster boundary point detection algorithm based on data representation. It requires no knowledge of the dimension or the meaning of the data. One only needs to construct the expression matrix W through a specific algorithm; the boundary points of the data set can then be recognized by counting the zero and negative components of W, which also helps capture the inner structure of the clustered data. Moreover, our algorithm overcomes the curse of dimensionality that affects traditional algorithms on high-dimensional data sets, and it achieves high recall and accuracy on both synthetic and real data sets.
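
The abstract does not say how the expression matrix W is constructed, so the following is only a minimal sketch of the counting idea under a stated assumption: each point is reconstructed from its k nearest neighbours with weights that sum to one (LLE-style weights, swapped in here for illustration and not necessarily the paper's construction). A point lying outside the convex hull of its neighbours cannot be reconstructed with all-positive weights, so counting nonpositive weights gives a rough boundary indicator. The function name `nonpositive_weight_counts` and the parameters `k`, `reg`, and `tol` are hypothetical.

```python
import numpy as np


def nonpositive_weight_counts(X, k=5, reg=1e-3, tol=1e-8):
    """For each row of X, count reconstruction weights that are zero or negative.

    Assumption (not from the paper): weights are LLE-style, i.e. each point is
    reconstructed from its k nearest neighbours with weights summing to one.
    Duplicate points are not handled in this sketch.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances; a point is never its own neighbour.
    diff = X[:, None, :] - X[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist2, np.inf)

    counts = np.zeros(n, dtype=int)
    for i in range(n):
        nbrs = np.argsort(dist2[i])[:k]       # indices of the k nearest neighbours
        Z = X[nbrs] - X[i]                    # neighbours centred on x_i
        C = Z @ Z.T                           # local Gram matrix, shape (k, k)
        C += reg * np.trace(C) * np.eye(k)    # ridge term keeps C invertible
        w = np.linalg.solve(C, np.ones(k))
        w /= w.sum()                          # enforce the sum-to-one constraint
        counts[i] = int(np.sum(w <= tol))     # zero or negative components
    return counts


# Five equally spaced points on a line: only the two end points lie outside
# the span of their two nearest neighbours.
X = np.arange(5.0).reshape(-1, 1)
print(nonpositive_weight_counts(X, k=2))
```

On this toy input the two end points each receive one negative weight and the three interior points receive none. How many nonpositive weights should mark a point as a boundary point is a threshold choice that this sketch leaves open.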

Full Text