Understanding efficiency in high-dimensional linear models is a longstanding problem of interest. Classical work on low-dimensional problems, dating back to Huber and Bickel, has illustrated the clear benefits of efficient loss functions. When the number of parameters $p$ is of the same order as the sample size $n$, $p\approx n$, an efficiency pattern different from Huber's was recently established. In this work, we study the relative efficiency of sparse linear models with $p\gg n$. With the goal of deriving the asymptotic mean squared error for $l_{1}$-regularized M-estimators, we propose a novel robust and sparse approximate message passing algorithm (RAMP) that is adaptive to the error distribution. Our algorithm accommodates many non-quadratic and non-differentiable loss functions. We derive its asymptotic mean squared error and show its convergence while allowing $p,n,s\to \infty$, with $n/p\in (0,1)$ and $n/s\in (1,\infty)$. We identify new patterns of relative efficiency for $l_{1}$-penalized M-estimators. We show that the classical information bound is no longer attainable, even for light-tailed error distributions. Moreover, we identify new breakdown points of the asymptotic mean squared error. For light-tailed error distributions, the asymptotic mean squared error of the $l_{1}$-penalized least absolute deviation estimator (P-LAD) breaks down at a critical ratio of the number of observations to the number of sparse parameters; for heavy-tailed distributions, it breaks down at a critical ratio of the optimal tuning parameter of P-LAD to the optimal tuning parameter of the $l_{1}$-penalized least squares estimator.
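For orientation only, the following is a minimal sketch of the generic approximate message passing (AMP) recursion for the $l_{1}$-penalized least-squares problem, on which RAMP builds: a soft-thresholding denoiser plus an Onsager correction to the residual. The function names, the fixed threshold `theta`, the iteration count, and the assumption of a design matrix with i.i.d. $N(0,1/n)$ entries are illustrative assumptions; the paper's RAMP additionally replaces the quadratic loss with a robust, possibly non-differentiable loss adapted to the error distribution, which is not shown here.

```python
import numpy as np

def soft_threshold(v, t):
    # Soft-thresholding: the proximal operator of the l1 penalty.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def amp_l1_least_squares(X, y, theta, n_iter=50):
    """Sketch of the generic AMP recursion for l1-penalized least squares.

    Assumes X has i.i.d. N(0, 1/n) entries (the standard AMP setting).
    `theta` is kept fixed across iterations for simplicity; in practice it
    is tuned per iteration.
    """
    n, p = X.shape
    delta = n / p                      # undersampling ratio n/p in (0, 1)
    beta = np.zeros(p)
    z = y.copy()                       # residual track
    for _ in range(n_iter):
        pseudo = beta + X.T @ z        # effective noisy observation of beta
        beta_new = soft_threshold(pseudo, theta)
        # Onsager correction: average derivative of the soft-threshold
        # denoiser, i.e. the fraction of non-zero coordinates, divided by delta.
        onsager = np.mean(np.abs(beta_new) > 0) / delta
        z = y - X @ beta_new + onsager * z
        beta = beta_new
    return beta
```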