Abstract

Conventional fuzzy C-means clustering algorithms treat every data point and every feature equally, so their clustering performance is sensitive to noise points. Among existing weighted clustering algorithms, few studies have focused on data weighting and feature weighting simultaneously, and the same data point is treated equally across different clusters. To address these issues, we present a new robust fuzzy C-means clustering framework that takes into account both cluster-specific data weights and cluster-specific feature weights. We propose, for the first time, the idea that the same data point should have different importance in different clusters, and that different data points within a cluster should likewise have different importance. With this new data weighting method, the proposed algorithm weakens the influence of noise points on the formation of each cluster center, which can enhance the robustness of clustering. To encourage more data points and more features to take part in the clustering process and to avoid overfitting, we add l2-norm regularization of the data weights and of the feature weights to the objective function. Based on this objective function, we derive the update rules applied in each iteration for the cluster-specific data weights, the cluster-specific feature weights, the membership degrees, and the cluster centers. To assess the performance of the new framework, we conduct experiments on a synthetic dataset and on real-world datasets; the results show that the new algorithm achieves better clustering performance than other related clustering methods.
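For orientation, the conventional (unweighted) fuzzy C-means baseline that the abstract contrasts against can be sketched as below. This is a minimal illustration of the textbook algorithm only, not the proposed framework: the abstract does not state the weighted objective or its update rules, so none are reproduced here. The function name `fuzzy_c_means` and its parameters (`m`, `n_iter`, `tol`, `seed`) are assumptions chosen for this sketch.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy C-means: every point and every feature counts equally.

    Hypothetical helper for illustration; it is the unweighted baseline,
    not the weighted method described in the abstract.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(n_clusters)]
        total = sum(row)
        u.append([r / total for r in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of all points, weighted by membership ** m.
        centers = []
        for c in range(n_clusters):
            w = [u[k][c] ** m for k in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[k] * points[k][j] for k in range(n)) / w_sum
                 for j in range(dim)]
            )

        # Membership update: proportional to squared distance ** (-1 / (m - 1)).
        shift = 0.0
        for k in range(n):
            d2 = [
                max(sum((points[k][j] - centers[c][j]) ** 2
                        for j in range(dim)), 1e-12)  # guard against zero distance
                for c in range(n_clusters)
            ]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            inv_sum = sum(inv)
            new_row = [v / inv_sum for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[k])))
            u[k] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break

    return centers, u


# Usage: two well-separated groups of 2-D points.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers, memberships = fuzzy_c_means(data, n_clusters=2)
```

Because every point contributes to every center through its membership alone, a distant noise point still pulls on each center in this baseline. That is the sensitivity the cluster-specific data weights in the abstract are meant to reduce.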
