Abstract

Clustering is widely used in data mining and machine learning. The possibilistic c-means (PCM) clustering method relaxes the membership constraint of the fuzzy c-means (FCM) clustering method to address FCM's sensitivity to noise. However, this introduces a new problem: PCM tends to produce overlapping cluster centers and is not suitable for clustering non-cluster distribution data. In this paper, we propose a novel possibilistic c-means clustering method based on nearest-neighbour isolation similarity. The proposed approach first takes all samples as initial cluster centers and iteratively obtains k sub-clusters. The b samples farthest from the center of each sub-cluster are then chosen to represent that sub-cluster. Using these selected samples, the nearest-neighbour isolation similarity between sub-clusters is calculated, which maps the sub-clusters to a distinguishable space. Adjacent sub-clusters are then merged according to the presented connecting strategy, yielding the final C clusters. The method has been tested on 15 UCI benchmark datasets and a synthetic dataset. Experimental results show that it is suitable for clustering non-cluster distribution data, and that it produces better clustering results than the comparison methods while remaining robust.
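To make the FCM/PCM distinction concrete, the following is a minimal sketch of the standard membership formulas for a single sample. It is not the method proposed in the paper. The function names, the fuzzifier m = 2, and the scale parameters eta = 1 are illustrative assumptions. FCM forces a sample's memberships to sum to 1 across clusters, so a distant outlier still receives substantial membership; PCM drops that constraint, so the same outlier receives typicality values near zero.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm_memberships(sq_dists, m=2.0):
    """Standard FCM memberships of one sample; they always sum to 1."""
    exponent = 1.0 / (m - 1.0)
    # A sample that coincides with one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(sq_dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(sq_dists))]
    inv = [(1.0 / d) ** exponent for d in sq_dists]
    total = sum(inv)
    return [v / total for v in inv]


def pcm_typicalities(sq_dists, etas, m=2.0):
    """Standard PCM typicalities; each value depends only on its own cluster."""
    exponent = 1.0 / (m - 1.0)
    return [1.0 / (1.0 + (d / eta) ** exponent)
            for d, eta in zip(sq_dists, etas)]


# Hypothetical setup: two centers and one far-away noise point.
centers = [(0.0, 0.0), (10.0, 0.0)]
etas = [1.0, 1.0]  # assumed scale parameters, chosen only for illustration
outlier = (5.0, 100.0)

outlier_sq = [sq_dist(outlier, c) for c in centers]
outlier_fcm = fcm_memberships(outlier_sq)
outlier_pcm = pcm_typicalities(outlier_sq, etas)

print("FCM memberships of outlier:", outlier_fcm)
print("PCM typicalities of outlier:", outlier_pcm)
```

Because each PCM typicality is computed independently per cluster, nothing in the update pushes centers apart, which is one way to see why overlapping centers can arise.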
