Abstract

Without sufficient prior knowledge, identifying the optimal number of clusters is a difficult problem in unsupervised clustering. Since fuzzy entropy is essential for measuring the information of fuzzy sets, a combined fuzzy entropy index (CFE) is developed to search for the best number of clusters kb. The CFE accounts for the compactness and the separation of clusters in both the data space and the membership space. The partition of the fuzzy membership sets is evaluated by the ratio of the symmetric fuzzy cross entropy of membership subset pairs to the average of the fuzzy entropies of the clusters. The most appropriate number of clusters for a specific data set is the one that maximizes the CFE index. To verify the effectiveness of the CFE in the search for kb, six artificial data sets and eight real data sets were clustered with the fuzzy c-means algorithm. The results show that the CFE index estimates the best partition of clusters more accurately than the indices PC, PE, MPC, XB, FS, Kwon, FHV and PBMF, especially for high-dimensional data sets. Moreover, the CFE index correctly finds kb for data sets with overlapping clusters, subclusters, many clusters, or clusters of varying density.
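The membership-space ratio described above can be illustrated with a minimal sketch. This is not the CFE index itself, whose full definition (including the data-space terms) is not given in the abstract. The sketch assumes the De Luca-Termini form of fuzzy entropy and the Shang-Jiang form of fuzzy cross entropy, averages the symmetric cross entropy over all cluster pairs, and uses hypothetical names (`fuzzy_entropy`, `cross_entropy`, `membership_ratio`, `EPS`) chosen for illustration only.

```python
import math

EPS = 1e-12  # guards against log(0); an implementation choice, not from the paper


def _clip(m):
    """Keep a membership value strictly inside (0, 1)."""
    return min(max(m, EPS), 1.0 - EPS)


def fuzzy_entropy(mu):
    """Fuzzy entropy of one membership vector (De Luca-Termini form, assumed)."""
    total = 0.0
    for m in mu:
        m = _clip(m)
        total -= m * math.log(m) + (1.0 - m) * math.log(1.0 - m)
    return total


def cross_entropy(a, b):
    """Fuzzy cross entropy D(A, B) (Shang-Jiang form, assumed)."""
    total = 0.0
    for x, y in zip(a, b):
        x, y = _clip(x), _clip(y)
        mid = 0.5 * (x + y)
        total += x * math.log(x / mid)
        total += (1.0 - x) * math.log((1.0 - x) / (1.0 - mid))
    return total


def symmetric_cross_entropy(a, b):
    """Symmetrized divergence D(A, B) + D(B, A)."""
    return cross_entropy(a, b) + cross_entropy(b, a)


def membership_ratio(U):
    """Ratio of mean pairwise symmetric cross entropy to mean fuzzy entropy.

    U[k][i] is the membership of point i in cluster k. Averaging over
    pairs is an assumption made for this sketch.
    """
    k = len(U)
    if k < 2:
        raise ValueError("need at least two clusters")
    pairs = [(p, q) for p in range(k) for q in range(p + 1, k)]
    separation = sum(symmetric_cross_entropy(U[p], U[q]) for p, q in pairs) / len(pairs)
    mean_entropy = sum(fuzzy_entropy(row) for row in U) / k
    return separation / max(mean_entropy, EPS)


# Two toy partitions of four points into two clusters (made-up values).
U_crisp = [[0.99, 0.99, 0.01, 0.01], [0.01, 0.01, 0.99, 0.99]]
U_fuzzy = [[0.60, 0.55, 0.45, 0.40], [0.40, 0.45, 0.55, 0.60]]
```

Under these assumptions, a nearly crisp partition such as `U_crisp` yields a larger ratio than a blurred one such as `U_fuzzy`, which matches the abstract's rule of preferring the number of clusters that maximizes the index. In practice the membership matrix would come from a fuzzy c-means run for each candidate number of clusters.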
