Adaptive clustering algorithm based on kNN and density

Bing Shi, Lixin Han, Hong Yan

doi:10.1016/j.patrec.2018.01.020

Abstract

Although many clustering algorithms have been proposed, each has limitations. Existing clustering algorithms usually require the user to set appropriate threshold parameters, and those parameters are usually not adaptive. In this paper, a new clustering method is proposed. First, the k nearest neighbors (kNN) of every sample are calculated, and then a kNN-based density method is used to complete the clustering process. To this end, a statistic, alpha, is proposed for measuring how dense or sparse the overall sample distribution is, and a statistical model is proposed to calculate the local parameters beta and gamma. The category of each sample is thus determined by combining kNN with density. Specifically, the total number of samples determines an appropriate value of k, which is used to obtain the ordered kNN of every sample; the radius of the surrounding region (SR) is then calculated from the 5NN of all samples, and the density of every sample is calculated. Throughout the clustering process, the global threshold is determined by the density distribution of all samples, and the local threshold is self-adaptive. The sample densities are sorted so that clusters are searched for automatically, starting from the highest-density point of the sample distribution. The algorithm can not only discover clusters of arbitrary shape and automatically remove noise and outliers, but also find clusters with different densities and clusters with internal density variation. Experimental results on synthetic data and human face image data demonstrate that the algorithm is effective. The code is available at https://github.com/bing-shi/Acnd.
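The pipeline outlined in the abstract (kNN, surrounding-region radius, per-sample density, search from the densest sample) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which is in the repository linked above. The abstract does not define alpha, beta, or gamma, so the sketch substitutes simple stand-ins, all of which are assumptions: k is taken as sqrt(n), the SR radius is the mean distance to the 5th nearest neighbor, and a single global threshold of half the mean density replaces the adaptive thresholds. The function names `knn_density` and `cluster_by_density` are hypothetical.

```python
import numpy as np

def knn_density(points, k=None, sr_neighbor=5):
    """Return (kNN indices, per-sample density, SR radius) for an (n, d) array."""
    n = len(points)
    if k is None:
        # Assumption: k ~ sqrt(n). The abstract only says n determines k.
        k = max(sr_neighbor, int(np.sqrt(n)))
    k = min(k, n - 1)
    # Pairwise Euclidean distances.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Ordered neighbors; column 0 is the sample itself (no duplicate points assumed).
    order = np.argsort(dist, axis=1)
    neighbors = order[:, 1:k + 1]
    # Assumption: SR radius = mean distance to the 5th nearest neighbor.
    m = min(sr_neighbor, n - 1)
    radius = np.sort(dist, axis=1)[:, m].mean()
    # Density = number of other samples inside the surrounding region.
    density = (dist <= radius).sum(axis=1) - 1
    return neighbors, density, radius

def cluster_by_density(points, min_density=None):
    """Label samples with cluster ids; -1 marks noise or outliers."""
    points = np.asarray(points, dtype=float)
    neighbors, density, radius = knn_density(points)
    if min_density is None:
        # Assumption: fixed global threshold standing in for alpha, beta, gamma.
        min_density = 0.5 * density.mean()
    labels = np.full(len(points), -1)
    next_label = 0
    # Visit samples from the highest density downward.
    for seed in np.argsort(-density):
        if labels[seed] != -1 or density[seed] < min_density:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in neighbors[i]:
                close = np.linalg.norm(points[i] - points[j]) <= radius
                if labels[j] == -1 and density[j] >= min_density and close:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

# Two well-separated 5x5 grids plus one isolated sample.
grid = np.array([(x, y) for x in range(5) for y in range(5)], dtype=float)
points = np.vstack([grid, grid + 100.0, [[50.0, 50.0]]])
labels = cluster_by_density(points)
```

In this toy input the isolated sample has no neighbors inside the surrounding region, so it stays labeled -1, while the two grids can never share a label because expansion only follows neighbors closer than the SR radius. A fixed global threshold like the one used here would not handle clusters of differing density, which is the case the paper's adaptive local threshold is meant to address.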

Full Text