Abstract
Basis pursuit (BP), basis pursuit denoising (BPDN), and the least absolute shrinkage and selection operator (LASSO) are popular methods for identifying important predictors in the high-dimensional linear regression model $Y = X\beta^0 + \varepsilon$. By definition, when $\varepsilon = 0$, BP uniquely recovers $\beta^0$ when $X\beta = X\beta^0$ and $\beta \neq \beta^0$ imply $\|\beta\|_1 > \|\beta^0\|_1$ (identifiability condition). Furthermore, LASSO can recover the sign of $\beta^0$ only under a much stronger irrepresentability condition. Meanwhile, it is known that the model selection properties of LASSO can be improved by hard thresholding its estimates. This article supports these findings by proving that thresholded LASSO, thresholded BPDN, and thresholded BP recover the sign of $\beta^0$ in both the noisy and noiseless cases if and only if $\beta^0$ is identifiable and large enough. In particular, if X has iid Gaussian entries and the number of predictors grows linearly with the sample size, then these thresholded estimators can recover the sign of $\beta^0$ when the signal sparsity is asymptotically below the Donoho–Tanner transition curve. This is in contrast to the regular LASSO, which asymptotically recovers the sign of $\beta^0$ only when the signal sparsity tends to 0. Numerical experiments show that the identifiability condition, unlike the irrepresentability condition, does not seem to be affected by the structure of the correlations in the X matrix.
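The following is a minimal sketch of the thresholding idea described above, not the paper's procedure: it fits a LASSO with scikit-learn, hard-thresholds the estimate, and checks whether the sign pattern of $\beta^0$ is recovered. The design, the penalty level alpha, and the threshold tau are illustrative assumptions, not values prescribed by the article.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Illustrative setup: n samples, p predictors, k-sparse signal beta0.
rng = np.random.default_rng(0)
n, p, k = 100, 200, 5
X = rng.standard_normal((n, p)) / np.sqrt(n)    # iid Gaussian design
beta0 = np.zeros(p)
beta0[:k] = 10.0                                # "large enough" nonzero coefficients
y = X @ beta0 + 0.1 * rng.standard_normal(n)    # noisy observations

# LASSO fit; alpha is an arbitrary illustrative choice.
beta_hat = Lasso(alpha=0.05, fit_intercept=False).fit(X, y).coef_

# Hard-threshold the LASSO estimate and compare sign patterns.
tau = 0.5                                       # illustrative threshold
beta_thr = np.where(np.abs(beta_hat) > tau, beta_hat, 0.0)
print("sign recovered:", np.array_equal(np.sign(beta_thr), np.sign(beta0)))
```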