Abstract
We analyse a linear regression problem with a nonconvex regularization called smoothly clipped absolute deviation (SCAD) under an overcomplete Gaussian basis for Gaussian random data. We propose an approximate message passing (AMP) algorithm that accounts for the nonconvex regularization, namely SCAD-AMP, and analytically show that its stability condition corresponds to the de Almeida–Thouless condition in the spin glass literature. Through asymptotic analysis, we show the correspondence between the density evolution of SCAD-AMP and the replica symmetric (RS) solution. Numerical experiments confirm that for a sufficiently large system size, SCAD-AMP achieves the optimal performance predicted by the replica method. Through replica analysis, a phase transition between the replica symmetric and replica symmetry breaking (RSB) regions is found in the parameter space of SCAD. The appearance of an RS region for a nonconvex penalty is a significant advantage, as it indicates a region where the landscape of the optimization problem is smooth. Furthermore, we analytically show that the statistical representation performance of the SCAD penalty is better than that of ℓ1-based methods, and the minimum representation error under the RS assumption is obtained at the edge of the RS/RSB phase boundary. The correspondence between the convergence of the existing coordinate descent algorithm and the RS/RSB transition is also indicated.
Highlights
Variable selection is a basic and important problem in statistics, the objective of which is to find parameters that are significant for the description of given data as well as for the prediction of unknown data
The development of sparse estimation has been accelerated since the proposal of the least absolute shrinkage and selection operator (LASSO) [2], where variable selection is formulated as a convex problem of minimizing the loss function associated with ℓ1 regularization
When we restrict the smoothly clipped absolute deviation (SCAD) parameters to be within the replica symmetric (RS) region, the smallest representation error is obtained at the RS/replica symmetry breaking (RSB) boundary
Summary
Variable selection is a basic and important problem in statistics, the objective of which is to find parameters that are significant for the description of given data as well as for the prediction of unknown data. The sparse estimation approach for variable selection in high-dimensional statistical modelling has the advantages of high computational efficiency, stability, and the ability to draw sampling properties compared with traditional approaches that follow stepwise and subset selection procedures [1]. It has been studied intensively in recent decades. The development of sparse estimation has been accelerated since the proposal of the least absolute shrinkage and selection operator (LASSO) [2], where variable selection is formulated as a convex problem of minimizing the loss function associated with ℓ1 regularization. Although the LASSO has many attractive properties, the shrinkage introduced by ℓ1 regularization results in a significant bias toward 0 for large regression coefficients. To resolve this problem, nonconvex penalties have been proposed, such as the smoothly clipped absolute deviation (SCAD) penalty [3] and the minimax concave penalty (MCP) [4]. SCAD regularization reduces to ℓ1 regularization in the limit a → ∞, where a is the SCAD shape parameter.
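The behaviour described above can be illustrated concretely. Below is a minimal sketch, assuming the standard SCAD definition of Fan and Li [3] with regularization strength λ and shape parameter a > 2: the penalty function together with its one-dimensional thresholding operator, which coincides with soft thresholding (the LASSO rule) for small inputs but leaves large coefficients unshrunk, removing the ℓ1 bias. The function names here are illustrative, not from the paper.

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty of Fan & Li (a > 2; a = 3.7 is the conventional default).
    Linear (l1-like) for |t| <= lam, quadratic blend up to a*lam, then constant."""
    t = np.abs(t)
    return np.where(
        t <= lam,
        lam * t,
        np.where(
            t <= a * lam,
            (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
            lam**2 * (a + 1) / 2,
        ),
    )

def scad_threshold(z, lam, a=3.7):
    """One-dimensional SCAD thresholding operator:
    argmin_t 0.5*(z - t)**2 + scad_penalty(t, lam, a)."""
    az = np.abs(z)
    soft = np.sign(z) * np.maximum(az - lam, 0.0)          # soft-thresholding region
    mid = ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)   # interpolating region
    return np.where(az <= 2 * lam, soft, np.where(az <= a * lam, mid, z))
```

Two properties mentioned in the text are easy to check numerically: for |z| > aλ the operator returns z unchanged (no shrinkage bias, unlike the LASSO), and as a → ∞ the penalty approaches λ|t| pointwise, recovering ℓ1 regularization.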
Journal of Statistical Mechanics: Theory and Experiment