Abstract

An exact and easily computable expression for the mean integrated squared error (MISE) of the kernel estimator of a general normal mixture density is given for Gaussian kernels of arbitrary order. This provides a powerful new way of understanding density estimation which complements the usual tools of simulation and asymptotic analysis. The family of normal mixture densities is very flexible, and the formulae derived allow simple exact analysis for a wide variety of density shapes. A number of applications of this method, giving important new insights into kernel density estimation, are presented. Among these are the discovery that the usual asymptotic approximations to the MISE can be quite inaccurate, especially when the underlying density contains substantial fine structure, and strong evidence that the practical importance of higher order kernels is surprisingly small for moderate sample sizes.

1. Introduction. Substantial research has been devoted to kernel density estimation. This is because it provides a simple, yet appealing, context in which to study problems and issues that arise in all types of nonparametric curve estimation. This includes regression, spectral density and hazard estimation, and also a variety of other estimators, including histograms, splines and orthogonal series.

Three important and useful tools for understanding the behavior of nonparametric curve estimators are asymptotic analysis, simulation and numerical calculation of error criteria. Each of these methods provides many useful insights into the complicated structure present in the study of curve estimation. However, each has its limitations as well.

The strength of asymptotic analysis is that it frequently allows simultaneous study of many different specific examples, through general results applying to entire classes of settings. The weakness of asymptotics is that they only describe behavior in the limit. This is still very useful in many situations because the asymptotics describe the actual situation quite well. However, it is less useful when the asymptotics have not yet taken effect (that is, in situations where the asymptotically dominant effect has not yet taken over). Perhaps the biggest drawback of asymptotics is that it is very difficult to determine, in a given situation, which of these possibilities is occurring.
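To make the idea of an exactly computable MISE concrete, the following is a minimal sketch for the simplest case only: the ordinary (second-order) Gaussian kernel. The closed form used here is not quoted from the text above. It is an assumption of this sketch, obtained from the standard identity that the integral of a product of two normal densities is again a normal density evaluated at the difference of the means. The function names `exact_mise` and `normal_pdf`, and the example mixture, are illustrative and hypothetical.

```python
import math


def normal_pdf(x, sigma):
    """Density of N(0, sigma^2) evaluated at x."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def exact_mise(h, n, weights, means, sds):
    """Exact MISE of a Gaussian-kernel density estimate (second-order kernel).

    Assumes the true density is the normal mixture
    sum_i weights[i] * N(means[i], sds[i]^2), with bandwidth h and sample size n.
    Closed form assumed in this sketch:
        MISE(h) = 1 / (2 sqrt(pi) n h)
                  + (1 - 1/n) * W(2) - 2 * W(1) + W(0),
    where W(a) = sum_{i,j} w_i w_j * phi_s(mu_i - mu_j)
    and   s = sqrt(a h^2 + sigma_i^2 + sigma_j^2).
    """
    def w_term(a):
        total = 0.0
        for wi, mi, si in zip(weights, means, sds):
            for wj, mj, sj in zip(weights, means, sds):
                s = math.sqrt(a * h * h + si * si + sj * sj)
                total += wi * wj * normal_pdf(mi - mj, s)
        return total

    variance_lead = 1.0 / (2.0 * math.sqrt(math.pi) * n * h)
    return variance_lead + (1.0 - 1.0 / n) * w_term(2) - 2.0 * w_term(1) + w_term(0)


# Hypothetical example: a bimodal mixture, scanned over a small bandwidth grid.
example_weights = [0.5, 0.5]
example_means = [-1.0, 1.0]
example_sds = [2.0 / 3.0, 2.0 / 3.0]
bandwidths = [0.05 * k for k in range(1, 41)]
mise_values = [
    exact_mise(h, 100, example_weights, example_means, example_sds)
    for h in bandwidths
]
best_h = bandwidths[mise_values.index(min(mise_values))]
```

Because each evaluation is a finite double sum, the MISE curve can be scanned over bandwidths and sample sizes without simulation, and the result can be set beside an asymptotic approximation to see where the two disagree. Higher-order Gaussian kernels, which the abstract also covers, are not handled by this sketch.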
