Abstract

Regularization is an essential element of virtually all kernel methods for nonparametric regression problems. A critical factor in the effectiveness of a given kernel method is the type of regularization that is employed. This article compares and contrasts members from a general class of regularization techniques, which notably includes ridge regression and principal component regression. We derive an explicit finite-sample risk bound for regularization-based estimators that simultaneously accounts for (i) the structure of the ambient function space, (ii) the regularity of the true regression function, and (iii) the adaptability (or qualification) of the regularization. A simple consequence of this upper bound is that the risk of the regularization-based estimators matches the minimax rate in a variety of settings. The general bound also illustrates how some regularization techniques are more adaptable than others to favorable regularity properties that the true regression function may possess. This, in particular, demonstrates a striking difference between kernel ridge regression and kernel principal component regression. Our theoretical results are supported by numerical experiments.
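
To make the class of methods concrete, the Python sketch below (purely illustrative, not the authors' implementation) fits kernel ridge regression and kernel principal component regression as two spectral filters applied to the eigendecomposition of the empirical kernel matrix; the Gaussian kernel, the function names, and the exact filter parameterization are assumptions of the sketch.

    import numpy as np

    def gaussian_kernel(A, B, bandwidth=1.0):
        # Gaussian (RBF) kernel matrix between the rows of A and the rows of B.
        sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
        return np.exp(-sq / (2.0 * bandwidth**2))

    def spectral_fit(K, y, lam, method="ridge"):
        # Dual coefficients alpha such that f_hat(x) = sum_i alpha_i k(x, x_i).
        # Both estimators apply a filter g_lam to the eigenvalues of K / n:
        #   ridge (Tikhonov):          g_lam(t) = 1 / (t + lam)
        #   pcr   (spectral cut-off):  g_lam(t) = 1 / t if t >= lam, else 0
        n = len(y)
        evals, evecs = np.linalg.eigh(K / n)
        if method == "ridge":
            g = 1.0 / (evals + lam)
        elif method == "pcr":
            g = np.where(evals >= lam, 1.0 / np.maximum(evals, 1e-12), 0.0)
        else:
            raise ValueError(method)
        return evecs @ (g * (evecs.T @ y)) / n

    # Predictions at new points: gaussian_kernel(X_new, X_train) @ alpha

Only the filter changes between the two estimators, which is what makes a unified risk analysis over the whole class possible.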

Highlights

  • Suppose that the observed data consist of zi = (xi, yi), i = 1, . . . , n, where yi ∈ Y ⊆ R and xi ∈ X ⊆ Rd

  • The risk of an estimator f̂ is Rρ(f̂) = E ∫X (f†(x) − f̂(x))² dρX(x) = E ‖f† − f̂‖²ρX, where the expectation is computed over z1, . . . , zn, and ‖·‖ρX denotes the norm on L2(ρX); we seek estimators f̂ which minimize Rρ(f̂). This is a version of the random design nonparametric regression problem (a small simulation sketch follows this list)

  • We focus on regularization and kernel methods for estimating f†
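
The following Python sketch makes this setting concrete with hypothetical choices (not from the paper's experiments): d = 1, ρX uniform on [0, 1], and a Monte Carlo approximation of the L2(ρX) error for one draw of the training data; f†, the noise level, and the sample sizes are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative choices: d = 1, rho_X = Uniform[0, 1], Gaussian noise.
    f_dagger = lambda x: np.sin(2.0 * np.pi * x[:, 0])   # stand-in for the true regression function
    n, sigma = 200, 0.3

    # Observed data z_i = (x_i, y_i), i = 1, ..., n, with y_i = f_dagger(x_i) + noise.
    X = rng.uniform(0.0, 1.0, size=(n, 1))
    y = f_dagger(X) + sigma * rng.standard_normal(n)

    def squared_l2_error(f_hat, n_test=100_000):
        # Monte Carlo approximation of ||f_dagger - f_hat||^2 in L2(rho_X) for one
        # draw of the training data; the risk R_rho also averages over z_1, ..., z_n.
        X_test = rng.uniform(0.0, 1.0, size=(n_test, 1))
        return np.mean((f_dagger(X_test) - f_hat(X_test)) ** 2)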


Summary

Introduction

One consequence of the theorem is that the regularization methods studied in this paper (including KRR and KPCR) achieve the minimax rate for estimating f† in a variety of settings. A second consequence is that certain regularization methods (including KPCR, but not KRR) may adapt to favorable regularity of f† to attain even faster convergence rates, while others (notably KRR) are limited in this regard due to a well-known saturation effect (Neubauer, 1997; Mathé, 2005; Bauer et al., 2007). This illustrates a striking advantage that KPCR may have over KRR in these settings.
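
The saturation effect can be made tangible with a short numerical check (again illustrative, not taken from the paper): using the standard definition that a filter g_λ has qualification q when sup_t t^q (1 − t g_λ(t)) = O(λ^q) as λ → 0, the snippet below shows the ridge (Tikhonov) filter stalling at q = 1 while the spectral cut-off filter underlying KPCR keeps shrinking like λ^q for larger q.

    import numpy as np

    # Residual of a spectral filter: r_lam(t) = 1 - t * g_lam(t).
    # A filter has qualification q if sup_t t^q * r_lam(t) = O(lam^q) as lam -> 0.
    t = np.linspace(1e-6, 1.0, 200_000)

    for lam in (1e-2, 1e-3, 1e-4):
        r_ridge = lam / (t + lam)             # ridge / Tikhonov: 1 - t / (t + lam)
        r_cutoff = (t < lam).astype(float)    # spectral cut-off (KPCR): 1 - 1[t >= lam]
        for q in (1.0, 2.0):
            print(f"lam={lam:.0e}, q={q}: "
                  f"ridge sup = {np.max(t**q * r_ridge):.2e}, "
                  f"cut-off sup = {np.max(t**q * r_cutoff):.2e}")

    # For q = 2 the ridge supremum stays of order lam (saturation at q = 1),
    # while the cut-off supremum is of order lam^2.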

Related work
Statistical setting and assumptions
Regularization
Finite-rank operators of interest
Basic definitions
Estimators
General bound on the risk
Implications for kernels characterized by their eigenvalues’ rate of decay
Parametric rates for finite-dimensional kernels and subspaces
Simulated data
Real data
Discussion
Bias-variance decomposition
Translation to vector and matrix notation
Bias bound
Variance bound
Finishing the proof of Theorem 1
Sums of random operators
Differences of powers of bounded operators