Abstract

Non-negative matrix factorization (NMF) is a powerful tool for dimensionality reduction and clustering. However, clustering results from NMF are difficult to interpret, especially for high-dimensional biological data without effective feature selection. To address this problem, we introduce a row-sparse NMF with an $\ell_{2,0}$-norm constraint (NMF$\_\ell_{20}$), in which the basis matrix $\bm{W}$ is subject to the $\ell_{2,0}$-norm constraint so that $\bm{W}$ has a row-sparsity pattern that performs feature selection. The model is challenging to solve, however, because the $\ell_{2,0}$-norm constraint is non-convex and non-smooth. Fortunately, we prove that the $\ell_{2,0}$-norm constraint satisfies the Kurdyka-Łojasiewicz property. Based on this finding, we present a proximal alternating linearized minimization algorithm and its monotone accelerated version to solve the NMF$\_\ell_{20}$ model. In addition, we present an orthogonal NMF with an $\ell_{2,0}$-norm constraint (ONMF$\_\ell_{20}$) that enhances clustering performance by using a non-negative orthogonal constraint. The ONMF$\_\ell_{20}$ model is solved by transforming it into a series of constrained and penalized matrix factorization problems. We prove convergence guarantees for the proposed algorithms and evaluate their computational complexity. Results on numerical and scRNA-seq datasets demonstrate the efficiency of our methods in comparison with existing methods.
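
To make the row-sparse idea concrete, the following is a minimal sketch of a proximal alternating linearized minimization (PALM) loop for a row-sparse NMF. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the objective is taken to be $\frac{1}{2}\|X - WH\|_F^2$, the $\ell_{2,0}$-norm constraint is taken to mean that $W$ has at most `k` non-zero rows, and the step sizes use a safety factor `gamma` slightly above 1. The function names `project_row_sparse_nonneg` and `palm_nmf_l20` are hypothetical. The monotone accelerated variant and the orthogonal model (ONMF$\_\ell_{20}$) are not covered.

```python
import numpy as np


def project_row_sparse_nonneg(W, k):
    """Project W onto {W >= 0, at most k non-zero rows} (assumed constraint set).

    Negative entries are clipped to zero, then only the k rows with the
    largest l2 norm are kept; all other rows are set to zero.
    """
    W_pos = np.maximum(W, 0.0)
    row_norms = np.linalg.norm(W_pos, axis=1)
    keep = np.argsort(-row_norms)[:k]  # indices of the k largest row norms
    out = np.zeros_like(W_pos)
    out[keep] = W_pos[keep]
    return out


def palm_nmf_l20(X, r, k, n_iter=200, gamma=1.1, seed=0):
    """Plain (non-accelerated) PALM sketch for 0.5 * ||X - W H||_F^2.

    W (m x r) is kept non-negative with at most k non-zero rows,
    H (r x n) is kept non-negative. Returns W, H and the objective history.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = project_row_sparse_nonneg(rng.random((m, r)), k)
    H = rng.random((r, n))
    eps = 1e-10  # guards against a zero Lipschitz constant
    history = [0.5 * np.linalg.norm(X - W @ H, "fro") ** 2]

    for _ in range(n_iter):
        # W-step: the gradient in W is (W H - X) H^T, with Lipschitz
        # constant equal to the spectral norm of H H^T.
        L_W = gamma * np.linalg.norm(H @ H.T, 2) + eps
        W = project_row_sparse_nonneg(W - ((W @ H - X) @ H.T) / L_W, k)

        # H-step: the gradient in H is W^T (W H - X), with Lipschitz
        # constant equal to the spectral norm of W^T W.
        L_H = gamma * np.linalg.norm(W.T @ W, 2) + eps
        H = np.maximum(H - (W.T @ (W @ H - X)) / L_H, 0.0)

        history.append(0.5 * np.linalg.norm(X - W @ H, "fro") ** 2)

    return W, H, history
```

Under these assumptions, the rows of `W` that remain non-zero after the run index the selected features (for example, genes in an scRNA-seq matrix with features as rows), which is the interpretability benefit the abstract attributes to row sparsity.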
