Clustering is used in many fields, including image processing, data mining, pattern recognition, and statistical analysis. Clustering algorithms generally treat all variables as equally relevant or as uncorrelated. However, the pattern of data samples in a multidimensional space can be geometrically complicated; for example, clusters may exist in different subsets of features. To address this, new soft subspace clustering algorithms are proposed that take the correlation and relevance of variables into account to improve clustering performance. Because regularization-based methods are robust to initialization, the proposed approaches introduce an entropy regularization term that controls the membership degrees of the objects. Such regularization is popular because of its high performance in large-scale data clustering and its low computational complexity. By minimizing a suitable objective function, these three-step iterative algorithms provide a fuzzy partition, a representative for each cluster, and either the relevance weights of the variables or their correlation. Several experiments on synthetic and real datasets, including an application to the segmentation of noisy texture images, demonstrate the usefulness of the proposed clustering methods.
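The three-step iteration described above can be illustrated with a minimal sketch. This is not the authors' exact algorithm, which the abstract does not spell out. It assumes one common formulation: squared Euclidean distances weighted per cluster and per variable, an entropy term with a hypothetical temperature parameter `T`, and weights constrained so that their product equals one within each cluster. The function name `entropy_fuzzy_subspace` and all parameter names are illustrative, and the variant that models correlation between variables is not covered.

```python
import numpy as np

def entropy_fuzzy_subspace(X, n_clusters, T=1.0, n_iter=50, seed=0, eps=1e-10):
    """Sketch of entropy-regularized fuzzy clustering with variable weights.

    Assumed objective (an illustration, not taken from the paper):
        J = sum_i sum_k u_ik * sum_j w_kj (x_ij - g_kj)^2
            + T * sum_i sum_k u_ik * ln(u_ik)
    subject to sum_k u_ik = 1 and prod_j w_kj = 1.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Initialize representatives from randomly chosen objects, weights at 1.
    G = X[rng.choice(n, n_clusters, replace=False)].copy()
    W = np.ones((n_clusters, p))

    for _ in range(n_iter):
        # Step 1: fuzzy partition. Memberships are a softmax of the
        # negative weighted distances, scaled by the temperature T.
        sq = (X[:, None, :] - G[None, :, :]) ** 2       # shape (n, c, p)
        D = (sq * W[None, :, :]).sum(axis=2)            # shape (n, c)
        logits = -D / T
        logits -= logits.max(axis=1, keepdims=True)     # numerical stability
        U = np.exp(logits)
        U /= U.sum(axis=1, keepdims=True)

        # Step 2: representatives. Membership-weighted means of the objects.
        G = (U.T @ X) / (U.sum(axis=0)[:, None] + eps)

        # Step 3: relevance weights. A variable with small within-cluster
        # dispersion gets a large weight; the product of weights stays 1.
        sq = (X[:, None, :] - G[None, :, :]) ** 2
        disp = (U[:, :, None] * sq).sum(axis=0) + eps   # shape (c, p)
        log_disp = np.log(disp)
        W = np.exp(log_disp.mean(axis=1, keepdims=True) - log_disp)

    return U, G, W
```

Under these assumptions, a larger `T` gives a fuzzier partition and a smaller `T` approaches a hard assignment. The product-to-one constraint is one of several possible choices; a sum-to-one constraint on the weights would lead to a different update in step 3.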