Abstract

Clustering algorithms are among the most widely used methods in unsupervised learning. They group samples into classes, or clusters, according to distances computed from the sample features, so the choice of distance measure is central to any clustering algorithm. Traditional clustering algorithms generally rely on the Mahalanobis distance or the Minkowski distance, which have difficulty handling set-based data and uncertain nonlinear data. To address this problem, we propose the granular vectors relative distance and the granular vectors absolute distance, both based on the neighborhood granule operation. Building on these distances, we further propose the neighborhood granular meanshift clustering algorithm. Finally, we evaluate the effectiveness of neighborhood granular meanshift clustering on multiple datasets from the UC Irvine Machine Learning Repository (UCI), using external metrics (Accuracy and Fowlkes–Mallows Index) and an internal metric (Silhouette Coefficient). We find that the granular meanshift clustering algorithm achieves better clustering results than traditional clustering algorithms such as K-means and Gaussian Mixture.
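As a point of reference for the traditional setting described above, the following is a minimal sketch in plain Python of the Minkowski distance between two feature vectors and of the Fowlkes–Mallows Index, which compares predicted cluster assignments against ground-truth labels. It does not implement the granular distances or the granular meanshift algorithm proposed in the paper; the function names `minkowski_distance` and `fowlkes_mallows` are illustrative assumptions, and the pair-counting form of the index is the standard textbook definition rather than anything taken from the authors' code.

```python
from itertools import combinations
from math import sqrt


def minkowski_distance(x, y, p=2):
    """Minkowski distance of order p between two equal-length feature vectors.

    p=1 gives the Manhattan distance and p=2 the Euclidean distance.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def fowlkes_mallows(labels_true, labels_pred):
    """Fowlkes-Mallows Index computed by counting pairs of samples.

    TP: pairs placed together in both labelings.
    FP: pairs together in the prediction only.
    FN: pairs together in the ground truth only.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
    if tp == 0:
        return 0.0
    return tp / sqrt((tp + fp) * (tp + fn))


if __name__ == "__main__":
    # Hypothetical toy data, used only to exercise the two functions.
    print(minkowski_distance([0, 0], [3, 4], p=2))
    print(fowlkes_mallows([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because the index counts pairs of samples, it is unaffected by how cluster labels are numbered: a prediction that recovers the true grouping scores 1 even when the label values differ.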

