Abstract

We consider the problem of aggregating a general collection of affine estimators for fixed design regression. Relevant examples include commonly used statistical estimators such as least squares, ridge and robust least squares estimators. Dalalyan and Salmon [DS12] have established that, for this problem, exponentially weighted (EW) model selection aggregation leads to sharp oracle inequalities in expectation, but similar bounds in deviation were not previously known. While the results of [DRZ12] indicate that the same aggregation scheme may not satisfy sharp oracle inequalities with high probability, we prove that a weaker notion of oracle inequality for EW holds with high probability. Moreover, using a generalization of the newly introduced $Q$-aggregation scheme, we also prove sharp oracle inequalities that hold with high probability. Finally, we apply our results to universal aggregation and show that our proposed estimator leads simultaneously to all the best known bounds for aggregation, including $\ell_{q}$-aggregation, $q\in(0,1)$, with high probability.

Highlights

  • In the Gaussian Mean Model (GMM), we observe a Gaussian random vector $Y \in \mathbb{R}^n$ such that $Y \sim \mathcal{N}(\mu, \sigma^2 I_n)$, where the mean $\mu \in \mathbb{R}^n$ is unknown and the variance parameter $\sigma^2$ is known.

  • Among the many methods and results dedicated to the GMM, Nemirovski [JN00, Nem00] introduced aggregation theory as a versatile tool for adaptation in nonparametric estimation [Lec07, RT07, Yan04] and, more recently, in high-dimensional regression [LB06, RT11, DS12].

  • This is the framework of pure aggregation, under which most of the developments have been made, starting from the seminal works on aggregation [JN00, Nem00, Tsy03].
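The Gaussian Mean Model described in the highlights can be illustrated with a minimal simulation sketch. The function names `sample_gmm` and `squared_error`, the sparse mean vector, and the shrinkage factor 0.5 are all hypothetical choices made for illustration; none of them come from the source.

```python
import random


def sample_gmm(mu, sigma, rng):
    """Draw Y ~ N(mu, sigma^2 I_n): independent Gaussian noise on each coordinate."""
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mu]


def squared_error(estimate, mu):
    """Squared Euclidean distance between an estimate and the true mean."""
    return sum((e - m) ** 2 for e, m in zip(estimate, mu))


# Hypothetical setup: n = 50, a mean with 5 nonzero entries, known sigma = 1.
rng = random.Random(0)
mu = [1.0] * 5 + [0.0] * 45
sigma = 1.0
y = sample_gmm(mu, sigma, rng)

# Two simple affine estimators of the form A y + b, with b = 0:
identity_fit = list(y)             # A = I (no smoothing)
shrunk_fit = [0.5 * v for v in y]  # A = 0.5 I (uniform shrinkage, an assumption)

loss_identity = squared_error(identity_fit, mu)
loss_shrunk = squared_error(shrunk_fit, mu)
```

Comparing `loss_identity` and `loss_shrunk` over repeated draws gives a feel for why no single affine estimator is best for every unknown mean, which is the situation aggregation is meant to address.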

Summary

Introduction

In the Gaussian Mean Model (GMM), we observe a Gaussian random vector $Y \in \mathbb{R}^n$ such that $Y \sim \mathcal{N}(\mu, \sigma^2 I_n)$, where the mean $\mu \in \mathbb{R}^n$ is unknown and the variance parameter $\sigma^2$ is known. For $Q$-aggregation, we establish a sharp oracle inequality that holds both in expectation and with high probability; in its statement, $\mathrm{Tr}(A_j)$ denotes the trace of $A_j$. We continue by proving in Section 2.2 that for any $\varepsilon > 0$, there exists a choice of the temperature parameter for which the better-known aggregate $\hat{\mu}^{\mathrm{EW}}$ based on exponential weights satisfies a weak oracle inequality that holds with high probability, of the form $\|\hat{\mu}^{\mathrm{EW}} - \mu\|^2 \le \min_{j \in [M]} \{\cdots\}$. Such an inequality complements the sharp oracle inequality of [DS12], which holds in expectation.
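To make the exponential-weights aggregate concrete, here is a minimal sketch under stated assumptions. Each estimator is taken to be affine, written as $A_j Y + b_j$, and the weights are formed from the standard unbiased risk estimate $\|Y - A_j Y - b_j\|^2 + 2\sigma^2 \mathrm{Tr}(A_j) - n\sigma^2$ with a uniform prior. The excerpt above does not spell out the exact weights, so this construction, the function names, and the temperature value `beta = 4 * sigma**2` used in the example are illustrative assumptions rather than the paper's definition.

```python
import math


def matvec(a, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def affine_fit(a, b, y):
    """Evaluate the affine estimator A y + b."""
    return [p + q for p, q in zip(matvec(a, y), b)]


def unbiased_risk(a, b, y, sigma):
    """Unbiased estimate of E||A Y + b - mu||^2 in the Gaussian Mean Model."""
    n = len(y)
    resid = sum((yi - fi) ** 2 for yi, fi in zip(y, affine_fit(a, b, y)))
    trace = sum(a[i][i] for i in range(n))
    return resid + 2.0 * sigma ** 2 * trace - n * sigma ** 2


def ew_weights(risks, beta):
    """Exponential weights with a uniform prior (assumption) and temperature beta."""
    r_min = min(risks)  # subtract the minimum for numerical stability
    raw = [math.exp(-(r - r_min) / beta) for r in risks]
    total = sum(raw)
    return [w / total for w in raw]


def ew_aggregate(estimators, y, sigma, beta):
    """Convex combination of the affine estimators using exponential weights."""
    risks = [unbiased_risk(a, b, y, sigma) for a, b in estimators]
    weights = ew_weights(risks, beta)
    fits = [affine_fit(a, b, y) for a, b in estimators]
    agg = [sum(w * f[i] for w, f in zip(weights, fits)) for i in range(len(y))]
    return agg, weights


# Hypothetical example with n = 2: the identity estimator and the zero estimator.
y = [3.0, 0.0]
sigma = 1.0
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
zero = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
agg, weights = ew_aggregate([identity, zero], y, sigma, beta=4.0 * sigma ** 2)
```

In this example the identity estimator has the smaller estimated risk, so it receives the larger weight, and the aggregate lies between the two fits. A smaller `beta` concentrates the weights on the estimator with the lowest estimated risk, while a larger `beta` moves them toward the uniform prior.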

Aggregation of affine estimators
Sharp oracle inequalities using Q-aggregation
Weak oracle inequality using exponential weights
Sparsity pattern aggregation
Universal aggregation
Proof of Theorem 1
Proof of Theorem 2
Proof of Theorem 3
Proof of Theorem 5
Decay of coefficients on $\ell_q$-balls
Deviations of a $\chi^2$ distribution
