Abstract
We study the problem of estimating high-dimensional regression models regularized by a structured sparsity-inducing penalty that encodes prior structural information on either the input or output variables. We consider two widely adopted types of penalties of this kind as motivating examples: (1) the general overlapping-group-lasso penalty, generalized from the group-lasso penalty; and (2) the graph-guided-fused-lasso penalty, generalized from the fused-lasso penalty. For both types of penalties, their nonseparability and nonsmoothness make developing an efficient optimization method a challenging problem. In this paper we propose a general optimization approach, the smoothing proximal gradient (SPG) method, which can solve structured sparse regression problems with any smooth convex loss under a wide spectrum of structured sparsity-inducing penalties. Our approach combines a smoothing technique with an effective proximal gradient method. It achieves a convergence rate significantly faster than standard first-order methods such as subgradient methods, and it is much more scalable than the widely used interior-point methods. The efficiency and scalability of our method are demonstrated on both simulation experiments and real genetic data sets.
Highlights
The problem of high-dimensional sparse feature learning arises in many areas of science and engineering.
We call our approach a “smoothing” proximal gradient method because, instead of optimizing the original objective function directly as in other proximal gradient methods, we introduce a smooth approximation to the structured sparsity-inducing penalty using the technique from Nesterov (2005).
Let X ∈ RN×J denote the matrix of inputs of the N samples, where each sample lies in a J-dimensional space, and let y ∈ RN×1 denote the vector of univariate outputs of the N samples.
Summary
The problem of high-dimensional sparse feature learning arises in many areas of science and engineering. The structure over the outputs is available as prior knowledge, and outputs that are closely related according to this structure are encouraged to share a similar set of relevant inputs. These advances notwithstanding, the development of efficient optimization methods for solving the estimation problems resulting from structured sparsity-inducing penalty functions remains a challenge, for reasons we discuss below. In this paper we propose a generic optimization approach, the smoothing proximal gradient (SPG) method, for dealing with a broad family of sparsity-inducing penalties with complex structures. Throughout the paper, we discuss the overlapping-group-lasso and graph-guided-fused-lasso penalties in parallel to illustrate how SPG can be used to solve the corresponding optimization problems generically.
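The SPG idea described above can be sketched in a few lines of code. The sketch below is a minimal, hypothetical implementation (not the authors' reference code) for a graph-guided-fused-lasso-style objective 0.5·‖y − Xβ‖² + λ‖β‖₁ + γ‖Cβ‖₁, where C encodes graph-edge differences: the nonseparable term γ‖Cβ‖₁ is smoothed via Nesterov's technique (its dual variable is clipped to [−1, 1]), and FISTA-style proximal gradient steps handle the remaining ℓ₁ penalty. The function name, parameter names, and smoothing parameter `mu` are illustrative assumptions.

```python
import numpy as np

def spg_fused_lasso(X, y, C, lam=0.1, gamma=0.1, mu=1e-3, n_iter=500):
    """Smoothing proximal gradient (SPG) sketch, assuming the objective
    0.5*||y - X b||^2 + lam*||b||_1 + gamma*||C b||_1, with the
    nonseparable term gamma*||C b||_1 smoothed a la Nesterov (2005)."""
    N, J = X.shape
    b = np.zeros(J)
    # Lipschitz constant of the smoothed part: ||X||_2^2 + gamma^2 ||C||_2^2 / mu
    L = np.linalg.norm(X, 2) ** 2 + (gamma ** 2) * np.linalg.norm(C, 2) ** 2 / mu
    w, t = b.copy(), 1.0
    for _ in range(n_iter):
        # Optimal dual variable of the smoothed penalty (closed form: clipping)
        alpha = np.clip(gamma * (C @ w) / mu, -1.0, 1.0)
        # Gradient of the smooth approximation (loss + smoothed penalty)
        grad = X.T @ (X @ w - y) + gamma * (C.T @ alpha)
        z = w - grad / L
        # Proximal step for the separable l1 penalty: soft-thresholding
        b_new = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
        # FISTA momentum update
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        w = b_new + ((t - 1.0) / t_new) * (b_new - b)
        b, t = b_new, t_new
    return b
```

A smaller `mu` tightens the smooth approximation at the cost of a larger Lipschitz constant (hence smaller step size), which is the accuracy/speed trade-off the smoothing technique introduces.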