Abstract
Non-negative matrix factorization (NMF), known for learning parts-based representations, has become a data analysis tool for clustering tasks. It provides an alternative learning paradigm for clustering non-negative data. Within this paradigm, concept factorization (CF) and symmetric non-negative matrix factorization (SymNMF) are two representative models. They behave differently: CF models each cluster as a linear combination of samples and each sample as a linear combination of clusters, i.e., sample reconstruction, whereas SymNMF, built on a pair-wise sample similarity measure, preserves the similarity of samples in a low-dimensional subspace, i.e., similarity reconstruction. In this paper, we propose similarity-based concept factorization (SCF) as a synthesis of the two behaviors. The design can be stated as follows: the similarity of the samples reconstructed by CF should be close to that of the original samples. To optimize it, we develop an algorithm that leverages the alternating direction method of multipliers (ADMM) to solve each sub-problem of SCF. We further consider the robustness of similarity reconstruction and explore a robust SCF model (RSCF), which penalizes the hardest pair-wise similarity reconstruction via the $l_\infty$ norm. Thus, RSCF enjoys similarity preservation, robustness to similarity perturbation, and the ability to reconstruct samples. Extensive experiments validate these properties and show that the proposed SCF and RSCF achieve large performance gains over their counterparts.
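The design stated in the abstract (the similarity of CF-reconstructed samples should match that of the original samples) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: samples are stored as columns of X, similarity is the plain inner product, the CF reconstruction is written as X W V^T, and the mismatch is measured with a squared Frobenius norm. The function names are hypothetical, and the paper's actual objective and similarity measure may differ.

```python
import numpy as np

def linear_similarity(X):
    """Inner-product similarity between samples (columns of X). Assumed measure."""
    return X.T @ X

def cf_reconstruct(X, W, V):
    """Concept-factorization style reconstruction: X is approximated by X W V^T."""
    return X @ W @ V.T

def similarity_reconstruction_error(X, W, V):
    """Squared Frobenius gap between original and reconstructed similarities."""
    S = linear_similarity(X)
    S_hat = linear_similarity(cf_reconstruct(X, W, V))
    return np.linalg.norm(S - S_hat, "fro") ** 2

rng = np.random.default_rng(0)
X = rng.random((5, 8))   # 5 features, 8 samples
W = rng.random((8, 3))   # non-negative sample-to-cluster weights, 3 clusters
V = rng.random((8, 3))   # non-negative cluster coefficients per sample
err = similarity_reconstruction_error(X, W, V)
```

When W V^T equals the identity, the reconstruction is exact and the error is zero; an optimizer would search over non-negative W and V of low rank instead.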
Highlights
Non-negative matrix factorization (NMF, [1], [2]) approximates a non-negative data matrix by a product of two low-rank non-negative matrices, namely, the basis matrix and the coefficient matrix.
A representative method is concept factorization (CF, [10]), which assumes that each cluster is modeled as a linear combination of samples, and each sample is modeled as a linear combination of clusters.
The results shown in the figures and tables yield the following observations: 1) Our robust SCF model (RSCF) achieves satisfactory performance compared with other robust methods.
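As a concrete illustration of the basic NMF model described in the highlights, the following sketch factorizes a non-negative matrix into a basis matrix U and a coefficient matrix V. It uses the classical multiplicative update rules for the Frobenius objective, which are a standard choice and not necessarily the solver used in this work; the function name `nmf`, the iteration count, and the small constant `eps` are assumptions made for the example.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate non-negative X (m x n) by U @ V with U (m x rank), V (rank x n)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, rank)) + eps   # basis matrix, kept non-negative
    V = rng.random((rank, n)) + eps   # coefficient matrix, kept non-negative
    for _ in range(n_iter):
        # Multiplicative updates preserve non-negativity because every
        # factor in the ratio is non-negative.
        V *= (U.T @ X) / (U.T @ U @ V + eps)
        U *= (X @ V.T) / (U @ V @ V.T + eps)
    return U, V

# Usage: factorize a random non-negative matrix of rank 3.
rng = np.random.default_rng(1)
X = rng.random((6, 3)) @ rng.random((3, 10))
U, V = nmf(X, rank=3)
residual = np.linalg.norm(X - U @ V)
```

For clustering, each sample (column of X) is commonly assigned to the cluster with the largest entry in its column of V.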
Summary
Non-negative matrix factorization (NMF, [1], [2]) approximates a non-negative data matrix by a product of two low-rank non-negative matrices, namely, the basis matrix and the coefficient matrix. The coefficients learned in the low-dimensional subspace have the same similarity relationships as those among the original samples. Owing to this property, SymNMF has achieved sound clustering performance. Kang et al. [13] proposed a self-tuning similarity learning method relying on kernel-based embedding, called SLKE. They argued that similarity preservation helps the new representation rely on the overall relations among the data. To make the similarity measurement robust, we introduce the l∞ norm instead of the L2 norm to measure the distortion level of similarity reconstruction errors, which considers similarity reconstruction under a worst-case assumption. This approach complements existing robust NMF methods. This study makes the following contributions: 1) A concept factorization method based on similarity reconstruction is proposed, called SCF. More details can be found in [20].
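The worst-case view of the l∞ norm can be made concrete with a small numerical sketch. It assumes an entry-wise reading of both norms on the similarity residual (largest absolute pairwise error versus Frobenius norm); the paper's exact definition may differ, and the helper names are hypothetical. Two reconstructions with the same Frobenius error are compared: one concentrates the error on a single sample pair, the other spreads it evenly over all pairs.

```python
import numpy as np

def frobenius_error(S, S_hat):
    """L2-type (Frobenius) size of the similarity residual."""
    return np.linalg.norm(S - S_hat, "fro")

def linf_error(S, S_hat):
    """Entry-wise l_inf: the single worst pair-wise similarity error."""
    return np.max(np.abs(S - S_hat))

n = 4
S = np.eye(n)  # toy ground-truth similarity matrix

# Case A: one sample pair is reconstructed badly.
S_conc = S.copy()
S_conc[0, 1] = S_conc[1, 0] = 0.5

# Case B: the same Frobenius energy spread over all off-diagonal pairs.
e = np.sqrt(0.5 / (n * n - n))
S_spread = S + e * (np.ones((n, n)) - np.eye(n))

fro_a, fro_b = frobenius_error(S, S_conc), frobenius_error(S, S_spread)
inf_a, inf_b = linf_error(S, S_conc), linf_error(S, S_spread)
```

The Frobenius error cannot tell the two cases apart, while the l∞ error is larger for the concentrated case, which is the pair a worst-case penalty would target.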