A Framework for Regularized Non-Negative Matrix Factorization, with Application to the Analysis of Gene Expression Data

Leo Taslaman, Björn Nilsson

doi:10.1371/journal.pone.0046331

Open Access

Journal: PLoS ONE	Publication Date: Nov 2, 2012
Citations: 75	License type: CC BY 4.0

Affiliation: Lund University, Broad Institute

Abstract

Non-negative matrix factorization (NMF) condenses high-dimensional data into lower-dimensional models subject to the requirement that data can only be added, never subtracted. However, the NMF problem does not have a unique solution, creating a need for additional constraints (regularization constraints) to promote informative solutions. Regularized NMF problems are more complicated than conventional NMF problems, creating a need for computational methods that incorporate the extra constraints in a reliable way. We developed novel methods for regularized NMF based on block-coordinate descent with proximal point modification and a fast optimization procedure over the alpha simplex. Our framework has important advantages in that it (a) accommodates a wide range of regularization terms, including sparsity-inducing terms like the L1 penalty, (b) guarantees that the solutions satisfy necessary conditions for optimality, ensuring that the results have well-defined numerical meaning, (c) allows the scale of the solution to be controlled exactly, and (d) is computationally efficient. We illustrate the use of our approach in the context of gene expression microarray data analysis. The improvements described remedy key limitations of previous proposals, strengthen the theoretical basis of regularized NMF, and facilitate the use of regularized NMF in applications.

Highlights

Given a data matrix $A$ of size $m \times n$, the aim of non-negative matrix factorization (NMF) is to find a factorization $A \approx WH^T$, where $W$ is a non-negative matrix of size $m \times k$, $H$ is a non-negative matrix of size $n \times k$, and $k$ is the number of components in the model
While prior knowledge can be expressed in different ways, the extra constraints often take the form of regularization constraints that promote qualities like sparseness, smoothness, or specific relationships between components [13]
We consider a general formulation of regularized NMF where one factor is regularized, the scale of the solution is controlled exactly, and the choice of regularization term remains open

Summary

Introduction

Given a data matrix $A$ of size $m \times n$, the aim of NMF is to find a factorization $A \approx WH^T$, where $W$ is a non-negative matrix of size $m \times k$ (the component matrix), $H$ is a non-negative matrix of size $n \times k$ (the mixing matrix), and $k$ is the number of components in the model. Because exact factorizations do not always exist, common practice is to compute an approximate factorization by minimizing a relevant loss function, typically

$$\min_{W,H} \; \|A - WH^T\|_F^2 \quad \text{subject to } W, H \ge 0, \tag{1}$$

where $\|\cdot\|_F$ is the Frobenius norm. An important property is that the factorization is not unique: every invertible matrix $S$ satisfying $WS \ge 0$ and $S^{-1}H^T \ge 0$ yields another non-negative factorization $(WS)(S^{-1}H^T)$ of the same matrix as $WH^T$ (simple examples of such $S$ include diagonal re-scaling matrices) [14]. To reduce the problem of non-uniqueness, additional constraints can be included to find solutions that are likely to be informative or relevant with respect to problem-specific prior knowledge. While prior knowledge can be expressed in different ways, the extra constraints often take the form of regularization constraints (regularization terms) that promote qualities like sparseness, smoothness, or specific relationships between components [13]. These constraints make the computational problem more complicated, creating a need for computational methods that can handle the regularization constraints in a robust and reliable way.
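The non-uniqueness described above can be checked numerically. The following is a minimal NumPy sketch, not the authors' method: the matrix sizes, the random seed, the diagonal matrix S, and the helper name frobenius_loss are all illustrative assumptions. It builds an exactly factorizable matrix, rescales the factors with a diagonal S, and confirms that the product and the loss are unchanged while the factors themselves differ.

```python
import numpy as np

def frobenius_loss(A, W, H):
    """Squared Frobenius norm of the residual A - W H^T (the NMF loss)."""
    return np.linalg.norm(A - W @ H.T, ord="fro") ** 2

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
m, n, k = 6, 5, 2                # illustrative sizes (assumption)

W = rng.random((m, k))           # component matrix, m x k, entries in [0, 1)
H = rng.random((n, k))           # mixing matrix, n x k, entries in [0, 1)
A = W @ H.T                      # data matrix with an exact factorization

# A diagonal re-scaling matrix with positive entries (assumed values).
S = np.diag([2.0, 0.5])
W2 = W @ S                       # new component matrix W S
H2 = H @ np.linalg.inv(S).T      # chosen so that H2^T = S^{-1} H^T

same_product = np.allclose(W @ H.T, W2 @ H2.T)
still_nonnegative = bool((W2 >= 0).all() and (H2 >= 0).all())
factors_differ = not np.allclose(W, W2)

print(same_product, still_nonnegative, factors_differ)
print(frobenius_loss(A, W, H), frobenius_loss(A, W2, H2))
```

Both factor pairs reproduce A and are non-negative, so the loss alone cannot distinguish between them; this is the ambiguity that regularization and scale control are meant to reduce.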

Methods

Results

Conclusion
