Fuzzy K-Means with Variable Weighting in High Dimensional Data Analysis

Qiang Wang, Joshua Zhexue Huang, Yunming Ye

doi:10.1109/waim.2008.50

Abstract

This paper presents a comparative study of the fuzzy k-means algorithm and a new variant with variable weighting for clustering high-dimensional data. The fuzzy k-means algorithm is effective at discovering clusters with overlapping boundaries, but its effectiveness can degrade on high-dimensional data. The recently developed k-means algorithm with automated variable weighting offers a technique for handling the high-dimensional data that arise in many new applications, such as text mining and bioinformatics. In this paper, the variable weighting mechanism is incorporated into the fuzzy k-means algorithm to cluster high-dimensional data with overlapping clusters. Experiments on real data sets show that the variable-weighting fuzzy k-means produces better clustering results than fuzzy k-means without variable weighting.
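To make the idea of combining fuzzy memberships with variable weights concrete, the following is a minimal sketch in Python. It is not the authors' implementation. The abstract does not give the update rules, so the sketch assumes an objective of the form sum_i sum_c u_ic^m sum_j w_j^beta (x_ij - z_cj)^2, with memberships summing to one per object and weights summing to one over variables. The function name `weighted_fuzzy_kmeans`, the parameters `m`, `beta`, `init` and `eps`, and the toy data are all hypothetical choices made for illustration.

```python
import random


def weighted_fuzzy_kmeans(X, k, m=2.0, beta=2.0, n_iter=50, init=None,
                          seed=0, eps=1e-12):
    """Sketch of fuzzy k-means with one global weight per variable.

    Assumed objective (an illustration, not taken from the paper):
        sum_i sum_c U[i][c]**m * sum_j w[j]**beta * (X[i][j] - Z[c][j])**2
    with sum_c U[i][c] == 1, sum_j w[j] == 1, m > 1 and beta > 1.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Initial centers: caller-supplied, otherwise k randomly chosen objects.
    Z = [list(z) for z in init] if init is not None else \
        [list(x) for x in rng.sample(X, k)]
    w = [1.0 / d] * d  # start with equal variable weights
    U = []
    for _ in range(n_iter):
        # 1) Memberships from weighted squared distances to each center.
        #    eps (an assumption) guards against division by zero.
        U = []
        for x in X:
            dist = [sum(w[j] ** beta * (x[j] - z[j]) ** 2 for j in range(d)) + eps
                    for z in Z]
            inv = [dc ** (-1.0 / (m - 1.0)) for dc in dist]
            total = sum(inv)
            U.append([v / total for v in inv])
        # 2) Centers as membership-weighted means of the objects.
        for c in range(k):
            um = [U[i][c] ** m for i in range(n)]
            s = sum(um)
            Z[c] = [sum(um[i] * X[i][j] for i in range(n)) / s for j in range(d)]
        # 3) Variable weights: a smaller within-cluster dispersion D[j]
        #    yields a larger weight for variable j.
        D = [sum(U[i][c] ** m * (X[i][j] - Z[c][j]) ** 2
                 for i in range(n) for c in range(k)) + eps
             for j in range(d)]
        inv = [Dj ** (-1.0 / (beta - 1.0)) for Dj in D]
        total = sum(inv)
        w = [v / total for v in inv]
    return U, Z, w


# Toy data (hypothetical): variables 0 and 1 separate two groups,
# variable 2 is uniform noise unrelated to the groups.
rng = random.Random(1)
X = ([[rng.gauss(0, 0.3), rng.gauss(0, 0.3), rng.uniform(-1, 1)] for _ in range(30)] +
     [[rng.gauss(3, 0.3), rng.gauss(3, 0.3), rng.uniform(-1, 1)] for _ in range(30)])

# One initial center is taken from each group to keep the run deterministic.
U, Z, w = weighted_fuzzy_kmeans(X, k=2, init=[X[0], X[-1]])
labels = [max(range(2), key=lambda c: U[i][c]) for i in range(len(X))]
print("variable weights:", [round(v, 3) for v in w])
```

In this sketch the noise variable ends up with the smallest weight, so it contributes little to the distances that drive the memberships. Setting all weights equal and skipping step 3 reduces the procedure to ordinary fuzzy k-means. Weighting schemes of this kind can be sensitive to the initial centers, which is why the usage example fixes them explicitly.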

Full Text