Abstract

The separable nonnegative matrix factorization (NMF) model has been adopted in many applications (e.g., topic modeling and community detection), as it ensures identifiability of the latent factors with polynomial-time algorithms under reasonable assumptions. Separable NMF is often formulated as a self-dictionary (SD) sparse regression problem, and algorithms under this formulation can be categorized into two classes, namely, greedy methods and all-at-once convex optimization. The latter has been shown to be more resilient to noise, but has serious scalability issues. In particular, existing convex optimization algorithms for the SD sparse regression problem incur a memory cost that grows quadratically with the data size, making them hardly usable in practice. This work puts forth a Frank-Wolfe algorithm to tackle this challenge. The proposed method has a memory cost that grows linearly in the data size, with provable guarantees. To the best of our knowledge, this is the first linear-memory algorithm for this long-standing problem. Experiments on synthetic and real data showcase the memory efficiency of the proposed algorithm.
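
The abstract itself gives no pseudocode, so the following is only a minimal sketch of why a Frank-Wolfe approach can keep memory linear in the data size, not the paper's actual algorithm. It applies plain Frank-Wolfe to one column of a generic self-dictionary least-squares problem over the probability simplex; the function name, initialization, step-size rule, and iteration count are all illustrative assumptions. The key point is that each iteration adds at most one atom to the support, so the iterate can be stored sparsely instead of as a dense n-vector.

```python
import numpy as np

def frank_wolfe_sd_column(X, j, n_iters=50):
    """Illustrative sketch (not the paper's method): solve
        min_{c in simplex}  0.5 * ||X[:, j] - X @ c||_2^2
    with classic Frank-Wolfe. After t steps the iterate has at
    most t + 1 nonzeros, which is the source of the memory savings."""
    m, n = X.shape
    x = X[:, j]
    c = {0: 1.0}  # start at an arbitrary simplex vertex, stored sparsely
    for t in range(n_iters):
        # Residual X c - x, formed from the sparse support only.
        r = sum(w * X[:, i] for i, w in c.items()) - x
        g = X.T @ r                # gradient of the least-squares objective
        s = int(np.argmin(g))      # linear minimization oracle on the simplex
        gamma = 2.0 / (t + 2.0)    # standard Frank-Wolfe step size
        c = {i: (1 - gamma) * w for i, w in c.items()}
        c[s] = c.get(s, 0.0) + gamma
    return c
```

Running this over all n columns keeps only the sparse weight dictionaries (O(n * n_iters) entries in the worst case) rather than a dense n-by-n coefficient matrix, which is the quadratic-memory bottleneck the abstract refers to.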
