Lower bounds for bandwidth selection in density estimation

Peter Hall, J. S. Marron

doi:10.1007/bf01192160

Abstract

This paper establishes asymptotic lower bounds which specify, in a variety of contexts, how well (in terms of relative rate of convergence) one may select the bandwidth of a kernel density estimator. These results provide important new insights concerning how the bandwidth selection problem should be considered. In particular, it is shown that if the error criterion is Integrated Squared Error (ISE) then, even under very strong assumptions on the underlying density, relative error of the selected bandwidth cannot be reduced below order $n^{-1/10}$ (as the sample size grows). This very large error indicates that any technique which aims specifically to minimize ISE will be subject to serious practical difficulties arising from sampling fluctuations. Cross-validation exhibits this very slow convergence rate, and does suffer from unacceptably large sampling variation. On the other hand, if the error criterion is Mean Integrated Squared Error (MISE) then relative error of bandwidth selection can be reduced to order $n^{-1/2}$, when enough smoothness is assumed. Therefore bandwidth selection techniques which aim to minimize MISE can be much more stable, and less sensitive to small sampling fluctuations, than those which try to minimize ISE. We feel this indicates that performance in minimizing MISE, rather than ISE, should become the benchmark for measuring performance of bandwidth selection methods.
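
The distinction between the two criteria can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a standard normal target density, a Gaussian kernel, a fixed bandwidth grid, Riemann-sum integration, and a Monte Carlo average of ISE curves as a stand-in for MISE. All function and variable names are hypothetical. It shows only that the ISE-minimizing bandwidth is a random quantity that changes from sample to sample, whereas the MISE-minimizing bandwidth is a single fixed target; it does not verify the stated rates.

```python
import numpy as np

def kde(x_grid, sample, h):
    """Gaussian kernel density estimate evaluated at each point of x_grid."""
    u = (x_grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

def ise(sample, h, x_grid, true_pdf):
    """Integrated squared error, approximated by a Riemann sum on x_grid."""
    dx = x_grid[1] - x_grid[0]
    diff = kde(x_grid, sample, h) - true_pdf
    return float(np.sum(diff**2) * dx)

# Illustrative settings (assumptions, not values from the paper).
rng = np.random.default_rng(0)
n, n_rep = 200, 40
x_grid = np.linspace(-5.0, 5.0, 201)
true_pdf = np.exp(-0.5 * x_grid**2) / np.sqrt(2 * np.pi)
bandwidths = np.linspace(0.1, 1.0, 31)

# One ISE curve (ISE as a function of h) per simulated sample.
ise_curves = np.array([
    [ise(sample, h, x_grid, true_pdf) for h in bandwidths]
    for sample in (rng.standard_normal(n) for _ in range(n_rep))
])

# ISE-optimal bandwidth: a different minimizer for every sample.
h_ise = bandwidths[np.argmin(ise_curves, axis=1)]

# MISE-optimal bandwidth: minimizer of the averaged ISE curve
# (Monte Carlo approximation of MISE).
h_mise = float(bandwidths[np.argmin(ise_curves.mean(axis=0))])

# Spread of the ISE-optimal bandwidths relative to the MISE-optimal one.
rel_spread = float(np.std(h_ise) / h_mise)

print(f"MISE-optimal bandwidth (Monte Carlo): {h_mise:.3f}")
print(f"Relative spread of ISE-optimal bandwidths: {rel_spread:.3f}")
```

Repeating the run for several values of n gives a rough feel for how slowly the relative spread of the ISE-optimal bandwidth shrinks, though a grid this coarse is too crude to estimate a convergence rate.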
