Clustering has received considerable research attention in fields such as machine learning and data mining, and clustering methods based on non-negative matrix factorization (NMF) are widely used. However, several problems remain. First, clustering methods based on traditional NMF do not handle noise and outliers well; measuring the reconstruction error with the l2,1-norm instead of the Frobenius norm improves robustness, but the improvement is limited. Second, graph-regularized NMF mostly relies on an initial similarity graph: the graph is constructed in a fixed way and cannot be updated adaptively. Third, sparsity constraints are imposed on neither the basis matrix nor the coefficient matrix, which may cause important information to be ignored during clustering. To address these problems, we propose a joint doubly stochastic matrix regularization and dual sparse coding framework (DSNMF). Specifically, we use correntropy instead of the Euclidean distance to reduce the influence of non-Gaussian noise and outliers, making NMF more robust. In addition, adaptive graph learning produces a high-quality graph through a doubly stochastic matrix, which preserves local smoothness. Furthermore, we perform sparse coding on both the basis matrix and the coefficient matrix to exploit the sparsity of both factors. Finally, experimental results on eleven datasets show that our method outperforms the compared methods in most cases and that DSNMF is robust to noise and outliers.
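To illustrate why replacing the Euclidean distance with correntropy can limit the influence of outliers, the following minimal sketch compares the two losses on a reconstruction with one corrupted entry. It is not the DSNMF objective, which the abstract does not specify: the Gaussian kernel, the bandwidth `sigma`, and the function names `squared_error` and `correntropy_loss` are assumptions chosen for illustration.

```python
import numpy as np


def squared_error(X, X_hat):
    """Squared Frobenius-norm reconstruction error (Euclidean loss)."""
    return float(np.sum((X - X_hat) ** 2))


def correntropy_loss(X, X_hat, sigma=1.0):
    """Correntropy-induced loss with a Gaussian kernel (assumed form).

    Each entry contributes 1 - exp(-e^2 / (2 * sigma^2)), which lies in
    [0, 1), so a single entry can add at most 1 to the total.
    """
    E = X - X_hat
    return float(np.sum(1.0 - np.exp(-(E ** 2) / (2.0 * sigma ** 2))))


# Synthetic non-negative low-rank data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
W = rng.random((20, 3))
H = rng.random((3, 30))
X_clean = W @ H

# Corrupt one entry with a gross outlier.
X_noisy = X_clean.copy()
X_noisy[0, 0] += 100.0

# Increase in each loss caused by the single outlier, holding the
# reconstruction fixed at the clean factorization.
se_increase = squared_error(X_noisy, X_clean) - squared_error(X_clean, X_clean)
ce_increase = correntropy_loss(X_noisy, X_clean) - correntropy_loss(X_clean, X_clean)

print(f"squared-error increase: {se_increase:.3f}")
print(f"correntropy-loss increase: {ce_increase:.3f}")
```

Under these assumptions the outlier's contribution to the squared error grows quadratically with its magnitude, while its contribution to the correntropy-induced loss is bounded by 1, so one corrupted entry cannot dominate the objective.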