Abstract

We consider high-dimensional sparse regression problems in which we observe $\mathbf{y}=\mathbf{X}\boldsymbol{\beta} +\mathbf{z}$, where $\mathbf{X}$ is an $n\times p$ design matrix and $\mathbf{z}$ is an $n$-dimensional vector of independent Gaussian errors, each with variance $\sigma^{2}$. Our focus is on the recently introduced SLOPE estimator [Ann. Appl. Stat. 9 (2015) 1103–1140], which regularizes the least-squares estimates with the rank-dependent penalty $\sum_{1\le i\le p}\lambda_{i}|\widehat{\beta} |_{(i)}$, where $|\widehat{\beta} |_{(i)}$ is the $i$th largest magnitude of the fitted coefficients. Under Gaussian designs, where the entries of $\mathbf{X}$ are i.i.d. $\mathcal{N}(0,1/n)$, we show that SLOPE, with weights $\lambda_{i}$ just about equal to $\sigma\cdot\Phi^{-1}(1-iq/(2p))$ [$\Phi^{-1}(\alpha)$ is the $\alpha$th quantile of a standard normal distribution and $q$ is a fixed number in $(0,1)$], achieves a squared error of estimation obeying \[\sup_{\|\boldsymbol{\beta} \|_{0}\le k}\mathbb{P} (\|\widehat{\boldsymbol {\beta}}_{\mathrm{SLOPE}}-\boldsymbol{\beta} \|^{2}>(1+\varepsilon) 2\sigma^{2}k\log(p/k))\longrightarrow 0\] as the dimension $p$ increases to $\infty$, where $\varepsilon >0$ is an arbitrarily small constant. This holds under weak assumptions on the $\ell_{0}$-sparsity level, namely, $k/p\rightarrow 0$ and $(k\log p)/n\rightarrow 0$, and is sharp in the sense that this is the best possible error any estimator can achieve. A remarkable feature is that SLOPE does not require any knowledge of the degree of sparsity, and yet automatically adapts to yield optimal total squared errors over a wide range of $\ell_{0}$-sparsity classes. We are not aware of any other estimator with this property.
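As an illustration of the quantities defined above, the following is a minimal sketch (not the authors' code; function names are ours) that computes the BH-style weights $\lambda_{i}=\sigma\cdot\Phi^{-1}(1-iq/(2p))$ and evaluates the sorted-$\ell_1$ penalty $\sum_{i}\lambda_{i}|\widehat{\beta}|_{(i)}$ for a given coefficient vector.

```python
import numpy as np
from scipy.stats import norm

def slope_weights(p, q, sigma=1.0):
    """Decreasing weights lambda_i = sigma * Phi^{-1}(1 - i*q/(2p)), i = 1..p."""
    i = np.arange(1, p + 1)
    return sigma * norm.ppf(1 - i * q / (2 * p))

def slope_penalty(beta, lam):
    """Sorted-l1 penalty: sum_i lam_i times the i-th largest |beta|."""
    mags = np.sort(np.abs(beta))[::-1]  # magnitudes in decreasing order
    return np.dot(lam, mags)

# Hypothetical example: p = 1000 coefficients, q = 0.1, noise level sigma = 1
p, q = 1000, 0.1
lam = slope_weights(p, q)
beta_hat = np.zeros(p)
beta_hat[:5] = [3.0, -2.5, 2.0, 1.5, -1.0]  # a sparse vector of fitted coefficients
print(slope_penalty(beta_hat, lam))
```

Because the weights decrease with rank, larger coefficients are penalized more heavily than smaller ones, which is what allows SLOPE to adapt to the unknown sparsity level without tuning to $k$.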
