Abstract

In a linear regression model of fixed dimension $p \leq n$, we construct confidence regions for the unknown parameter vector based on the Lasso estimator that uniformly and exactly attain the prescribed coverage probability in finite samples as well as in an asymptotic setup. We thereby quantify both the estimation uncertainty and the "post-model selection error" of this estimator. More concretely, in finite samples with Gaussian errors, and asymptotically in the case where the Lasso estimator is tuned to perform conservative model selection, we derive exact formulas for computing the minimal coverage probability over the entire parameter space for a large class of shapes of the confidence sets, thus enabling the construction of valid confidence regions based on the Lasso estimator in these settings. The choice of shape for the confidence sets and a comparison with the confidence ellipse based on the least-squares estimator are also discussed. Moreover, in the case where the Lasso estimator is tuned to enable consistent model selection, we give a simple confidence region with minimal coverage probability converging to one. Finally, we also treat the case of unknown error variance and present some ideas for extensions.

Highlights

  • The Lasso estimator introduced in Tibshirani (1996), as well as its many variants, has gained strong interest in the statistics community and in applied areas over the past two decades

  • We provide an example for p = 2 illustrating the difference between the confidence ellipse based on the LS estimator and the one based on the Lasso, as well as how to choose a better shape in terms of volume for the confidence set based on the Lasso estimator

  • We provide exact formulas for the minimal coverage probability of these regions in finite samples and asymptotically in a low-dimensional framework when the estimator is tuned to perform conservative model selection


Summary

Introduction

The Lasso estimator introduced in Tibshirani (1996), as well as its many variants, has gained strong interest in the statistics community and in applied areas over the past two decades. Pötscher and Leeb (2009) give a detailed analysis in the framework of a linear regression model with orthogonal design and derive the distribution of the Lasso estimator in finite samples as well as in the two asymptotic regimes of consistent and conservative tuning. Implications of these results for confidence intervals are analyzed in Pötscher and Schneider (2010), and generalizations to a moderate-dimensional setting, where p ≤ n but p diverges with n, are contained in Pötscher and Schneider (2011) and Schneider (2016).
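In the orthogonal-design framework analyzed in Pötscher and Leeb (2009), the Lasso reduces to componentwise soft-thresholding of the least-squares estimator, which makes the post-model-selection behavior explicit: components whose LS estimates fall below the tuning parameter in absolute value are set exactly to zero. The following minimal sketch illustrates this; the scaling X'X = nI, the parameter values, and the tuning level `lam` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def soft_threshold(z, lam):
    # Soft-thresholding operator: the Lasso solution when X'X = n*I,
    # applied componentwise to the least-squares estimator.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Simulate y = X beta + eps with orthogonal design (X'X = n*I).
rng = np.random.default_rng(0)
n, p = 100, 2
Q, _ = np.linalg.qr(rng.normal(size=(n, p)))
X = np.sqrt(n) * Q                 # columns scaled so that X'X = n*I
beta = np.array([0.5, 0.0])        # second component is truly zero
y = X @ beta + rng.normal(size=n)

beta_ls = X.T @ y / n              # least-squares estimator
lam = 0.1                          # illustrative tuning parameter
beta_lasso = soft_threshold(beta_ls, lam)
```

Small LS components are mapped exactly to zero, which is the model-selection effect whose impact on coverage probabilities the paper quantifies; conservative versus consistent tuning corresponds to different rates at which `lam` shrinks with n.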

Setting and assumptions
Finite-sample results
Constructing the confidence set
Extensions and further considerations
Unknown error variance
Coverage probabilities over the parameter space
Inference on single components
Asymptotic framework
Conservative tuning
Consistent tuning
Summary and conclusion
Proofs for Section 3
Proofs for Section 4
Proofs for Section 5
Proofs for Section 6

