Abstract
We study the problem of recovering matrices that are simultaneously low rank and row and/or column sparse. Such matrices appear in recent applications in cognitive neuroscience, imaging, computer vision, macroeconomics, and genetics. We propose a GDT (Gradient Descent with hard Thresholding) algorithm to efficiently recover matrices with such structure by minimizing a bi-convex function over a nonconvex set of constraints. We show linear convergence of the iterates obtained by GDT to a region within statistical error of an optimal solution. As an application of our method, we consider multi-task learning problems and show that the statistical error rate obtained by GDT nearly matches the minimax rate. Experiments on both simulations and real data sets demonstrate competitive performance and much faster running speed compared to existing methods.
Highlights
Many problems in machine learning, statistics, and signal processing can be formulated as optimization problems with a smooth objective and nonconvex constraints.
We show that the statistical error nearly matches the optimal minimax rate, while the algorithm achieves the best performance in terms of estimation and prediction error in simulations.
We propose a new Gradient Descent with hard Thresholding (GDT) algorithm to efficiently solve optimization problems with simultaneous low-rank and row- and/or column-sparsity structure on the coefficient matrix.
Summary
Many problems in machine learning, statistics, and signal processing can be formulated as optimization problems with a smooth objective and nonconvex constraints. Compared to existing work on optimization over low-rank matrices with (alternating) gradient descent, we need to study a projection onto a nonconvex set in each iteration, which in our case is a hard-thresholding operation that requires delicate analysis and novel theory. Our algorithm does not require a new independent sample in each iteration and allows for non-Gaussian errors, while at the same time achieving a nearly optimal error rate compared to the information-theoretic minimax lower bound for the problem. Our proposed algorithm can be applied to the regression step of any MTRL algorithm (we chose Fitted Q-iteration (FQI) for presentation purposes) to solve for the optimal policies for MDPs. Compared to [26], which uses convex relaxation, our algorithm is much more efficient in high dimensions.
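To make the iteration concrete, the following is a minimal sketch of the GDT idea for a multi-task regression objective 0.5·||Y − XUVᵀ||²_F, where the coefficient matrix Θ = UVᵀ is low rank and row/column sparse (row sparsity of Θ corresponds to row sparsity of U, column sparsity to row sparsity of V). The function names, the random initialization, and the fixed step size are illustrative assumptions, not the paper's exact procedure; in particular, the paper uses a careful initialization scheme rather than random factors.

```python
import numpy as np

def hard_threshold_rows(M, s):
    """Projection onto row-sparse matrices: keep the s rows of M
    with largest l2 norm and zero out all other rows."""
    norms = np.linalg.norm(M, axis=1)
    keep = np.argsort(norms)[-s:]
    out = np.zeros_like(M)
    out[keep] = M[keep]
    return out

def gdt(X, Y, r, s1, s2, eta=None, n_iter=200, seed=0):
    """Illustrative GDT loop for 0.5 * ||Y - X U V^T||_F^2 with
    U (p x r) constrained to s1 nonzero rows and V (q x r) to s2
    nonzero rows. Random init and fixed step size are assumptions."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    q = Y.shape[1]
    if eta is None:
        # Conservative step size based on the spectral norm of X.
        eta = 1.0 / (np.linalg.norm(X, 2) ** 2)
    U = hard_threshold_rows(rng.standard_normal((p, r)) / np.sqrt(p), s1)
    V = hard_threshold_rows(rng.standard_normal((q, r)) / np.sqrt(q), s2)
    for _ in range(n_iter):
        R = X @ U @ V.T - Y        # residual
        gU = X.T @ R @ V           # gradient w.r.t. U
        gV = R.T @ X @ U           # gradient w.r.t. V
        # Gradient step followed by hard thresholding (the nonconvex
        # projection analyzed in the paper).
        U = hard_threshold_rows(U - eta * gU, s1)
        V = hard_threshold_rows(V - eta * gV, s2)
    return U, V
```

Each iteration is a plain gradient step on the two factors followed by the hard-thresholding projection, which is what distinguishes GDT from unconstrained alternating gradient descent over low-rank factors.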