Abstract

We consider truncated SVD (or spectral cut-off, projection) estimators for a prototypical statistical inverse problem in dimension $D$. Since calculating the singular value decomposition (SVD) only for the largest singular values is much less costly than the full SVD, our aim is to select a data-driven truncation level $\widehat{m}\in \{1,\ldots ,D\}$ based only on knowledge of the first $\widehat{m}$ singular values and vectors. We analyse in detail whether sequential early stopping rules of this type can preserve statistical optimality. Information-constrained lower bounds and matching upper bounds for a residual-based stopping rule are provided, giving a clear picture of the situations in which optimal sequential adaptation is feasible. Finally, a hybrid two-step approach is proposed which allows for classical oracle inequalities while considerably reducing numerical complexity.
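
To make the sequential idea concrete, here is a minimal Python sketch of a residual-based (discrepancy-type) stopping rule for truncated SVD, assuming a discretised operator `A`, observation vector `Y` and noise level `delta`; the block size, the repeated calls to `scipy.sparse.linalg.svds` and the threshold `kappa * delta**2 * D` are illustrative choices, not the paper's calibrated rule.

```python
import numpy as np
from scipy.sparse.linalg import svds  # partial SVD: only the leading singular triplets

def early_stopped_tsvd(A, Y, delta, kappa=1.0, block=5):
    """Sequential truncated SVD with a residual-based stopping rule (illustrative sketch).

    Singular triplets are added one at a time; the threshold kappa * delta**2 * D is an
    assumed discrepancy-type calibration, not the paper's exact choice.
    """
    D = A.shape[0]
    mu_hat = np.zeros(A.shape[1])
    m = 0
    while True:
        k = min(m + block, min(A.shape) - 1)   # svds requires k < min(A.shape)
        if k <= m:                             # cannot extend the partial SVD further
            break
        U, s, Vt = svds(A, k=k)                # leading k singular triplets
        order = np.argsort(s)[::-1]            # sort singular values in decreasing order
        U, s, Vt = U[:, order], s[order], Vt[order]
        for j in range(m, k):
            mu_hat += (U[:, j] @ Y / s[j]) * Vt[j]        # add the j-th SVD component
            m = j + 1
            residual = np.linalg.norm(Y - A @ mu_hat) ** 2
            if residual <= kappa * delta**2 * D:          # stop once the residual is noise-sized
                return mu_hat, m
    return mu_hat, m
```

The point of the sketch is only that, at the stopping time, no singular triplet beyond the first $\widehat{m}$ (plus one block) has been touched; a production implementation would of course reuse, rather than recompute, the partial decomposition.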

Highlights

  • A classical model for statistical inverse problems is the observation of $Y = A\mu + \delta W$ (1.1), where $A : H_1 \to H_2$ is a linear, bounded operator between real Hilbert spaces $H_1, H_2$, $\mu \in H_1$ is the signal of interest, $\delta > 0$ is the noise level and $W$ is a Gaussian white noise in $H_2$; see e.g. Bissantz et al. [1], Cavalier [5] and the references therein.

  • We investigate the possibility of an approach which is both statistically efficient and sequential along the singular value decomposition (SVD) in the following sense: we aim at early stopping methods in which the truncated SVD estimators $\widehat{\mu}^{(m)}$ for $m = 0, 1, \ldots$ are computed iteratively, a stopping rule decides to stop at some step $\widehat{m}$, and $\widehat{\mu}^{(\widehat{m})}$ is used as the estimator (see the simulation sketch after this list).

  • In order to have clearer oracle inequalities, we work with continuous oracle-type truncation indices in $[0, D]$.
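
The following small simulation, referenced in the highlights above, spells out the sequence-space form of model (1.1) and the family of truncated SVD estimators together with an oracle truncation index; the dimension `D` and the decay rates of the singular values and signal coefficients are assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sequence-space form of model (1.1): projecting Y onto the SVD basis of A gives
# observations Y_i = lambda_i * mu_i + delta * eps_i, i = 1, ..., D.
# The decay rates (lambda_i ~ i^{-1}, mu_i ~ i^{-3/2}) and D are illustrative choices.
D = 1000
i = np.arange(1, D + 1)
lam = i ** -1.0                          # singular values of A (mildly ill-posed)
mu = i ** -1.5                           # coefficients of the true signal
delta = 0.01
Y = lam * mu + delta * rng.standard_normal(D)

# Truncated SVD estimator: mu^{(m)}_i = Y_i / lambda_i for i <= m and 0 otherwise.
# Its strong-norm risk is sum_{i<=m} delta^2 / lambda_i^2 + sum_{i>m} mu_i^2.
variance = np.cumsum(delta**2 / lam**2)
bias2 = np.sum(mu**2) - np.cumsum(mu**2)          # squared bias for m = 1, ..., D
risk = bias2 + variance
m_oracle = int(np.argmin(risk)) + 1               # oracle truncation index
mu_hat = np.where(i <= m_oracle, Y / lam, 0.0)    # oracle-truncated SVD estimator
print("oracle truncation index:", m_oracle)
```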


Summary

Introduction and overview of results

Typical methods use (generalized) cross-validation, see e.g. Wahba [19], unbiased risk estimation, see e.g. Cavalier et al. [6], penalized empirical risk minimisation, see e.g. Cavalier and Golubev [7], or Lepski's balancing principle for inverse problems, see e.g. Mathé and Pereverzev [15]. They all share the drawback that the estimators $\widehat{\mu}^{(m)}$ first have to be computed for all values of $0 \le m \le D$ and then be compared to each other in some way. Minimax optimal solutions along the iteration path have been identified in different settings, see e.g. Yao et al. [20] for gradient descent learning, Blanchard and Mathé [3] for conjugate gradients, Raskutti, Wainwright and Yu [16] for (reproducing) kernel learning, and Bühlmann and Hothorn [4] for the application to $L^2$-boosting. All these methods stop at a fixed iteration step, depending on prior knowledge of the smoothness of the unknown solution. A hybrid two-step procedure enjoys full adaptivity for the truncated SVD method.
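
For contrast, the sketch below gives a standard unbiased-risk-estimation criterion for the spectral cut-off in the sequence-model notation of the earlier simulation; the specific Mallows-type form, written up to an $m$-independent constant, is a textbook choice assumed here, and it illustrates why such rules need all singular values before a truncation level can be selected.

```python
import numpy as np

def ure_truncation(Y, lam, delta):
    """Unbiased risk estimation for spectral cut-off in the sequence model
    Y_i = lambda_i * mu_i + delta * eps_i (a classical, non-sequential selection rule).

    Minimises over m an unbiased estimate of the strong-norm risk of mu^{(m)},
    up to an m-independent constant.  Note that the criterion touches every
    pair (Y_i, lambda_i), i.e. it needs the full SVD before selecting m.
    """
    crit = np.concatenate(([0.0], np.cumsum((2 * delta**2 - Y**2) / lam**2)))
    return int(np.argmin(crit))   # selected truncation level in {0, ..., D}

# Example with the simulated (Y, lam, delta) from the sketch above:
# m_hat = ure_truncation(Y, lam, delta)
```

A natural way to combine this with a sequential rule (an assumption here, not a quotation of the paper's construction) is to apply such a criterion only to the indices up to a sequentially chosen stopping point, so that only the leading singular triplets ever have to be computed.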

Non-asymptotic oracle approach
Setting for asymptotic considerations
Overview of results
The frequency filtration
Residual filtration
Upper bounds
Upper bounds in weak norm
Upper bounds in strong norm
Construction and results
Numerical illustration
Findings
A total variation bound for non-central $\chi^2$-laws
