Abstract

A clustering validity function is an index used to judge the accuracy of clustering results. At present, most studies on clustering validity rely on a single clustering validity function. However, research shows that no single validity function can handle every data set or consistently outperform the other indexes. Therefore, a hybrid weighted combination evaluation method based on fuzzy C-means (FCM) clustering validity functions is proposed. The weighting method combines expert weighting with information entropy weighting: this reduces the influence of subjective factors in expert weighting and compensates for the inability of information entropy weighting to judge the value of each clustering validity function. Four methods of combining clustering validity functions are studied: linear, exponential, logarithmic and proportional. Finally, the proposed fuzzy clustering validity evaluation method is verified by experiments on artificial data sets and UCI data sets. The experimental results show that the proposed method overcomes the shortcomings of a single clustering validity function and finds the optimal number of clusters more accurately across different data sets.

Highlights

  • Clustering analysis, as an unsupervised learning method, partitions data into categories without prior knowledge, so that samples in the same category are as similar as possible and samples in different categories are as different as possible [1].

  • Building on the weighted sum validity function (WSVF), FWSVF, the weighted sum type of cluster validity index (WSCVI) and the dynamic weighted sum validity function (DWSVF), this paper proposes a hybrid weighted combination evaluation method for fuzzy C-means clustering validity functions (HWCVF).

  • The experimental results show that the proposed fuzzy clustering validity evaluation method HWCVF overcomes the shortcomings of a single clustering validity function and finds the optimal number of clusters more accurately across different data sets, which provides a new approach to the study of fuzzy clustering validity.

Summary

INTRODUCTION

Clustering analysis, as an unsupervised learning method, partitions data into categories without prior knowledge, so that samples in the same category are as similar as possible and samples in different categories are as different as possible [1]. Most real-world data is uncertain, so Ruspini introduced concepts from fuzzy theory [3] into hard clustering to propose fuzzy clustering, of which the FCM clustering algorithm [4] is an example. Building on WSVF, FWSVF, WSCVI and DWSVF, this paper proposes a hybrid weighted combination evaluation method for fuzzy C-means clustering validity functions (HWCVF). The experimental results show that the proposed fuzzy clustering validity evaluation method HWCVF overcomes the shortcomings of a single clustering validity function and finds the optimal number of clusters more accurately across different data sets, which provides a new approach to the study of fuzzy clustering validity. Clustering validity functions based only on membership degree lack any connection to the geometric structure of the data set, so their results are often one-sided and their accuracy needs to be improved.

$\frac{1}{c}\sum_{i=1}^{c}\lVert v_i - \bar{v}\rVert^2$, and $\operatorname{median}\lVert v_i - v_k\rVert^2$ represents the median of the squared distance between two cluster centers.
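The two center-based quantities in the closing fragment above can be illustrated with a short sketch. It reads the fragment as the mean squared distance of the cluster centers $v_i$ from their mean $\bar{v}$, and the median squared distance over all pairs of centers. Both readings are assumptions, since the bar and the norm signs are not legible in the extracted text. The function names `center_spread` and `median_center_separation` are hypothetical and do not come from the paper.

```python
import numpy as np

def center_spread(centers):
    """Mean squared distance of the centers from their mean (assumed reading)."""
    centers = np.asarray(centers, dtype=float)
    v_bar = centers.mean(axis=0)  # mean of the c cluster centers
    return float(np.mean(np.sum((centers - v_bar) ** 2, axis=1)))

def median_center_separation(centers):
    """Median squared distance over all pairs of distinct centers (assumed reading)."""
    centers = np.asarray(centers, dtype=float)
    i, k = np.triu_indices(len(centers), k=1)  # index pairs with i < k
    squared = np.sum((centers[i] - centers[k]) ** 2, axis=1)
    return float(np.median(squared))

# Three illustrative 2-D centers (made-up values, not data from the paper).
example_centers = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
spread = center_spread(example_centers)
separation = median_center_separation(example_centers)
```

Terms like these depend only on the geometry of the centers, which is the kind of structural information the text says membership-only validity functions lack.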

VALIDITY COMBINATION EVALUATION METHOD
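
The hybrid weighting described in the abstract, expert weighting combined with information entropy weighting and followed by a linear combination of the validity functions, might be sketched as follows. This is not the paper's formulation: it assumes the standard entropy weight method, a convex mix of the two weight vectors controlled by a hypothetical parameter `alpha`, and index values already normalized so that larger means better. All function names and numbers are illustrative.

```python
import numpy as np

def entropy_weights(scores):
    """Entropy weight method (assumed variant).

    scores: array of shape (n_candidates, n_indices), positive column sums,
    one row per candidate cluster number, one column per validity index.
    """
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    p = scores / scores.sum(axis=0)           # column-wise proportions
    safe_p = np.where(p > 0, p, 1.0)          # log(1) = 0, so 0*log(0) counts as 0
    entropy = -(p * np.log(safe_p)).sum(axis=0) / np.log(n)
    divergence = 1.0 - entropy                # more variation gives a larger weight
    return divergence / divergence.sum()

def hybrid_weights(expert, entropy, alpha=0.5):
    """Convex mix of expert and entropy weights (alpha is an assumption)."""
    expert = np.asarray(expert, dtype=float)
    expert = expert / expert.sum()
    mixed = alpha * expert + (1.0 - alpha) * np.asarray(entropy, dtype=float)
    return mixed / mixed.sum()

def linear_combination(scores, weights):
    """Weighted sum of the normalized index values for each candidate."""
    return np.asarray(scores, dtype=float) @ np.asarray(weights, dtype=float)

# Made-up normalized values of three validity indexes for c = 2, 3, 4.
candidates = [2, 3, 4]
scores = np.array([[0.2, 0.3, 0.1],
                   [0.9, 0.8, 1.0],
                   [0.5, 0.6, 0.4]])
weights = hybrid_weights([0.5, 0.3, 0.2], entropy_weights(scores))
combined = linear_combination(scores, weights)
best_c = candidates[int(np.argmax(combined))]
```

The exponential, logarithmic and proportional combinations named in the abstract would replace `linear_combination`; their exact forms are not given in this summary, so they are not sketched here.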
SIMULATION EXPERIMENTS AND RESULT ANALYSIS
SELECTION OF EXPERIMENTS DATA
CONCLUSION