Abstract

Abstract A generic out-of-sample error estimate is proposed for $M$-estimators regularized with a convex penalty in high-dimensional linear regression where $(\boldsymbol{X},\boldsymbol{y})$ is observed and the dimension $p$ and sample size $n$ are of the same order. The out-of-sample error estimate enjoys a relative error of order $n^{-1/2}$ in a linear model with Gaussian covariates and independent noise, either non-asymptotically when $p/n\le \gamma $ or asymptotically in the high-dimensional asymptotic regime $p/n\to \gamma ^{\prime}\in (0,\infty )$. General differentiable loss functions $\rho $ are allowed provided that the derivative of the loss is 1-Lipschitz; this includes the least-squares loss as well as robust losses such as the Huber loss and its smoothed versions. The validity of the out-of-sample error estimate holds either under a strong convexity assumption, or for the L1-penalized Huber M-estimator and the Lasso under a sparsity assumption and a bound on the number of contaminated observations. For the square loss and in the absence of corruption in the response, the results additionally yield $n^{-1/2}$-consistent estimates of the noise variance and of the generalization error. This generalizes, to arbitrary convex penalty and arbitrary covariance, estimates that were previously known for the Lasso.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call