Abstract

Fuzzy c-means (FCM) is a well-known and widely applied fuzzy clustering method. Although there have been considerable studies which focused on the selection of better fuzzifier values in FCM, there is still not one widely accepted criterion. Also, in practical applications, the distributions of many data sets are not uniform. Hence, it is necessary to understand the impact of cluster size distribution on the selection of fuzzifier value. In this paper, the coefficient of variation (CV) is used to measure the variation of cluster sizes in a data set, and the difference of coefficient of variation (DCV) is the change of variation in cluster sizes after FCM clustering. Then, considering that the fuzzifier value with which FCM clustering produces minor change in cluster variation is better, a criterion for fuzzifier selection in FCM is presented from cluster size distribution perspective, followed by a fuzzifier selection algorithm called CSD-m (cluster size distribution for fuzzifier selection) algorithm. Also, we developed an indicator called Influence Coefficient of Fuzzifier ($\mathit{ICF}$) to measure the influence of fuzzifier values on FCM clustering results. Finally, experimental results on 8 synthetic data sets and 4 real-world data sets illustrate the effectiveness of the proposed criterion and CSD-m algorithm. The results also demonstrate that the widely used fuzzifier value $m=2$ is not optimal for many data sets with large variation in cluster sizes. Based on the relationship between ${\mathit{CV}_{0}}$ and $\mathit{ICF}$, we further found that there is a linear correlation between the extent of fuzzifier value influence and the original cluster size distributions.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.