Abstract
The high-dimensional linear regression model is the most popular statistical model for high-dimensional data, but obtaining a sparse set of regression coefficients remains a challenging task. In this paper, we propose a simple heuristic algorithm for constructing sparse high-dimensional linear regression models, adapted from the shortest-solution guided decimation (SSD) algorithm and referred to as ASSD. The algorithm builds the support of the regression coefficients under the guidance of the shortest least-squares solutions of recursively decimated linear models, and it applies an early-stopping criterion and a second-stage thresholding procedure to refine this support. Our extensive numerical results demonstrate that ASSD outperforms LASSO, adaptive LASSO, vector approximate message passing, and two other representative greedy algorithms in solution accuracy and robustness. ASSD is especially suitable for linear regression problems with the highly correlated measurement matrices encountered in real-world applications.
Highlights
The high-dimensional linear regression model is the most popular statistical model for high-dimensional data, but it is quite challenging to achieve a sparse set of regression coefficients
We propose the adaptive shortest-solution guided decimation (ASSD) algorithm to estimate high-dimensional sparse linear regression models
Compared to the original SSD algorithm, which was developed for linear regression models without noise [32], the adaptive shortest-solution guided decimation (ASSD) algorithm takes the effect of measurement noise into account and adopts an early-stopping strategy and a second-stage thresholding procedure, resulting in significantly better performance in variable selection and coefficient estimation (a sketch of the decimation loop is given below)
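To make the decimation loop concrete, the following Python sketch illustrates the ASSD idea as described above. It is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function name `assd_sketch`, the tolerance `tol`, and the relative-residual stopping rule are choices made for this example.

```python
import numpy as np

def assd_sketch(X, y, max_support, tol=1e-3):
    """Illustrative sketch of shortest-solution guided decimation;
    not the authors' reference implementation."""
    n, p = X.shape
    support = []
    residual = y.copy()
    for _ in range(max_support):
        rest = [j for j in range(p) if j not in support]
        # shortest (minimum L2-norm) least-squares solution of the decimated model
        b = np.linalg.pinv(X[:, rest]) @ residual
        # decimate: move the variable with the largest |coefficient| into the support
        support.append(rest[int(np.argmax(np.abs(b)))])
        # re-fit the coefficients on the current support by ordinary least squares
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
        # early stopping: quit once the relative residual is small enough
        if np.linalg.norm(residual) <= tol * np.linalg.norm(y):
            break
    beta = np.zeros(p)
    beta[support] = coef
    # second-stage thresholding: discard negligibly small coefficients
    beta[np.abs(beta) < tol * np.abs(beta).max()] = 0.0
    return beta

# Toy usage on synthetic data (assumed setup, for illustration only)
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 200))
beta_true = np.zeros(200)
beta_true[:5] = 3.0
y = X @ beta_true + 0.01 * rng.standard_normal(50)
print(np.nonzero(assd_sketch(X, y, max_support=10))[0])
```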
Summary
The high-dimensional linear regression model is the most popular statistical model for high-dimensional data, but it is quite challenging to achieve a sparse set of regression coefficients. Motivated by empirical findings in genomics and other fields, high-dimensional regression models are usually assumed to be sparse, in the sense that only a relatively small number of predictors are important for explaining the observed data [1]. Associated with this sparsity assumption are two highly nontrivial issues in high-dimensional linear regression: (1) variable selection, namely to identify the most relevant predictors; and (2) parameter or coefficient estimation, namely to estimate the values of the coefficients of the selected predictors. Beyond the widely used LASSO, alternative methods have been proposed, including multi-stage methods such as the adaptive LASSO [10] and the three-stage method [11], and non-convex penalties such as the smoothly clipped absolute deviation (SCAD) penalty [12] and the minimax concave penalty (MCP) [13].
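As a concrete example of these multi-stage methods, the adaptive LASSO reweights each variable's L1 penalty by a pilot estimate of its coefficient. The sketch below uses the common column-rescaling reduction to an ordinary LASSO; it is not code from the paper, and the ridge pilot estimate and the `alpha` and `gamma` values are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

def adaptive_lasso(X, y, alpha=0.1, gamma=1.0):
    """Adaptive-LASSO sketch via the standard column-rescaling trick:
    penalizing |b_j| / w_j is equivalent to an ordinary LASSO on X with
    column j scaled by w_j. Parameter choices are illustrative."""
    pilot = Ridge(alpha=1.0).fit(X, y).coef_   # first-stage pilot estimate
    w = np.abs(pilot) ** gamma + 1e-8          # per-variable penalty weights
    fit = Lasso(alpha=alpha, max_iter=10000).fit(X * w, y)
    return fit.coef_ * w                       # map back to the original scale
```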