Abstract

Symmetric nonnegative matrix factorization (symNMF) is a variant of nonnegative matrix factorization (NMF) that allows handling symmetric input matrices and has been shown to be particularly well suited for clustering tasks. In this paper, we present a new model, dubbed off-diagonal symNMF (ODsymNMF), that does not take into account the diagonal entries of the input matrix in the objective function. ODsymNMF has three key advantages compared to symNMF. First, ODsymNMF is theoretically much more sound as there always exists an exact factorization of size at most n(n − 1)/2 where n is the dimension of the input matrix. Second, it makes more sense in practice as diagonal entries of the input matrix typically correspond to the similarity between an item and itself, not bringing much information. Third, it makes the optimization problem much easier to solve. In particular, it will allow us to design an algorithm based on coordinate descent that minimizes the component-wise ℓ1 norm between the input matrix and its approximation. We prove that this norm is much better suited for binary input matrices often encountered in practice. We also derive a coordinate descent method for the component-wise ℓ2 norm, and compare the two approaches with symNMF on synthetic and document datasets.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call