K-means clustering, which aims to distribute data points into K different clusters, is an effective technique for data mining and machine learning. However, several factors degrade clustering performance in real-world applications. High-dimensional data containing redundant features is one of the key issues in the clustering process. Many traditional hard or fuzzy versions of K-means clustering fail to handle high-dimensional data effectively. In this paper, we present a novel model referred to as discriminative fuzzy K-means clustering, which incorporates discriminative projection and p-Laplacian graph regularization into a joint framework. Specifically, the proposed model can reduce the negative impact of redundant features and enhance the discriminative power when projecting the original data into a lower-dimensional space. It simultaneously preserves the structural locality properties of the data by performing p-Laplacian graph regularization on the membership matrix. Consequently, clustering performance is significantly enhanced for high-dimensional data. Finally, an effective algorithm is presented to solve the formulated optimization problem. Experimental results obtained on benchmark datasets are promising and demonstrate the superiority of the proposed method.