Abstract

Here K is a kernel function (usually a symmetric probability density function) and h is a smoothing parameter or bandwidth, the value of which determines how much the data are smoothed to produce the density estimate. The first published paper specifically on density estimation was Rosenblatt (1956); however, density estimates were suggested several years earlier by Fix and Hodges (1951) in a technical report. It is regrettable that this report was never published, since it contains a great deal of interesting discussion and insight, much of which is still pertinent today. In these days of computer graphics and 'exploratory data analysis', it is often supposed that the primary purpose for which density estimation was ordained was as a method for producing pretty (perhaps too pretty) pictures from data. However, a glance at Fix and Hodges (1951) shows that this is not at all the case. Their interest in density estimation stemmed from the discrimination or classification problem of allocating an observation Z to one of two populations A and B. If the distributions underlying the populations A and B are not known but have to be estimated from data, and if the statistician is unwilling to make parametric assumptions of the kind tacitly present in Fisher's linear discriminant method, then a natural approach is to construct estimates of the densities f_A and f_B, and then to base a discriminant rule on the ratio of these two density estimates. Since these early papers, there has been a large amount of research into density estimation and related subjects. The main emphasis of much of this work has been to investigate the theoretical properties, and particularly the asymptotic properties, of various methods of density estimation. In certain quarters,
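
The kernel estimate and the density-ratio discriminant rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of Fix and Hodges: it assumes the usual form of the estimator, (1 / (n h)) * sum of K((x - X_i) / h) over the sample, a Gaussian kernel, a common bandwidth for both populations, and a ratio threshold of 1 (equal prior probabilities and misclassification costs). The function names and the toy samples are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density: one common symmetric choice for K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, sample, h, kernel=gaussian_kernel):
    """Kernel density estimate at x: (1 / (n h)) * sum_i K((x - X_i) / h)."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)


def classify(z, sample_a, sample_b, h):
    """Allocate z to 'A' if the estimated f_A(z) exceeds the estimated f_B(z).

    Comparing the two estimates directly is equivalent to thresholding their
    ratio at 1; ties are allocated to 'B'.
    """
    f_a = kde(z, sample_a, h)
    f_b = kde(z, sample_b, h)
    return "A" if f_a > f_b else "B"


# Hypothetical toy samples from the two populations.
sample_a = [-1.2, -0.8, -1.0, -0.5, -1.5]
sample_b = [1.1, 0.9, 1.4, 0.7, 1.3]

print(classify(-0.9, sample_a, sample_b, h=0.5))
print(classify(1.0, sample_a, sample_b, h=0.5))
```

A larger h spreads each kernel more widely and gives a smoother estimate; a smaller h follows the individual observations more closely. The choice of h = 0.5 here is arbitrary.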
