Abstract

Finding the appropriate number of clusters without prior information is a hard and sensitive problem in clustering and data analysis. In this paper, we present a new cluster validity index (CVI), called HF, that finds the optimal number of clusters present in a given data set. The HF index is based on the membership partition and can be seen as a generalisation of the Wu-and-Li (WL) and Tang (T) indices. It has two distinguishing features: it integrates a generalised ad hoc punishing term, and it computes the separation from the median distance between centroids multiplied by the average number of data points per cluster. These contributions avoid the monotonic behaviour from which most CVIs suffer and give a more precise evaluation. The optimal number of clusters, C_op, corresponds to the minimum of the HF index. To make the choice of the optimal number of clusters reliable, we propose an algorithm based on the HF and WL indices. The performance of the proposed index and algorithm is demonstrated through several experiments on image clustering with the Fuzzy C-Means (FCM) algorithm. The ability of the HF index to determine the number of clusters is compared with that of the WL, T and Xie-Beni (XB) indices under different initialisations.
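The selection rule described above (run FCM for each candidate number of clusters and keep the one that minimises a validity index) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the HF formula, so the sketch plugs in the standard Xie-Beni index as a stand-in, and the function names `fuzzy_c_means`, `xie_beni` and `select_num_clusters`, the fuzzifier `m = 2`, and the candidate range are all hypothetical choices, not the authors' implementation.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain FCM. Returns centroids V (c, d) and memberships U (c, n)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial membership partition; each column sums to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)
        # Squared Euclidean distances between centroids and points, shape (c, n).
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        d2 = np.fmax(d2, 1e-12)  # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Recompute centroids so that V matches the final memberships.
    Um = U ** m
    V = (Um @ X) / Um.sum(axis=1, keepdims=True)
    return V, U

def xie_beni(X, V, U, m=2.0):
    """Xie-Beni index (stand-in for HF): compactness / (n * min centroid separation)."""
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    sep = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)  # ignore each centroid's distance to itself
    return compactness / (X.shape[0] * sep.min())

def select_num_clusters(X, c_min=2, c_max=8, index=xie_beni, m=2.0, seed=0):
    """Evaluate the index for each candidate c and return the minimiser."""
    scores = {}
    for c in range(c_min, c_max + 1):
        V, U = fuzzy_c_means(X, c, m=m, seed=seed)
        scores[c] = index(X, V, U, m)
    best = min(scores, key=scores.get)
    return best, scores
```

For image clustering, `X` would hold one row per pixel (for example grey levels or colour triplets). Any other index with the same `(X, V, U, m)` signature, including an implementation of HF taken from the full paper, could be passed through the `index` argument.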
