Abstract

In this chapter, we discuss univariate density estimation. We are well aware that most interesting applications involve multivariate data, but we must first discuss univariate density estimation so that we can generalize the univariate methods to the multivariate setting. That being said, there are many interesting applications of univariate densities in economics as well as in other fields. The most common application of univariate density estimation in the context of economic growth is the examination of output-per-worker distributions, most commonly attributed to Quah (1993a,b). We start by considering nonparametric density estimation in the crudest possible way: a histogram. Histograms are nonparametric in nature and can reveal information (e.g., multimodality) that may be hidden when a pre-specified parametric density (say, Gaussian) is assumed. However, it is well known that the shape of the histogram is determined by the pre-specified binwidth, and hence we spend a large amount of time discussing this “smoothing” parameter. Another problem with this approach is that the estimated density is discontinuous, and hence we cannot examine gradients and other interesting pieces of information. Although the histogram is simplistic, discussing it leads quite naturally to our preferred choice of a smooth, continuous estimator: the kernel density estimator. Although we focus solely on kernel methods in this book, this does not imply that they are necessarily the best methods for any particular problem or data set. However, these methods are arguably the most popular in the literature, and as such they are the most developed. Here we discuss how to construct such estimators and examine their theoretical properties. For example, we typically give the bias, variance, and mean square error of each estimator. Although this material can be found in several existing books, we pay special attention to the choice of kernel function, the role and selection of the smoothing (bandwidth) parameter, and the sample size. Given our specific focus on the application of these methods, we attempt to clarify what is necessary for theoretical purposes and what is necessary for application.
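To make the contrast between the two estimators concrete, the following is a minimal sketch, not the chapter's own code. It assumes a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, a histogram origin of zero, and a simulated bimodal sample; the function names `histogram_density` and `kernel_density` are hypothetical and chosen only for illustration.

```python
import numpy as np


def histogram_density(data, x, binwidth, origin=0.0):
    """Histogram density estimate at points x, with bins [origin + k*h, origin + (k+1)*h)."""
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)
    # Integer index of the bin that holds each evaluation point and each observation.
    bin_x = np.floor((x - origin) / binwidth)
    bin_data = np.floor((data - origin) / binwidth)
    # Count the observations sharing a bin with each evaluation point.
    counts = (bin_x[:, None] == bin_data[None, :]).sum(axis=1)
    return counts / (data.size * binwidth)


def kernel_density(data, x, bandwidth):
    """Kernel density estimate at points x using a Gaussian kernel (an assumed choice)."""
    data = np.asarray(data, dtype=float)
    x = np.asarray(x, dtype=float)
    u = (x[:, None] - data[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (data.size * bandwidth)


rng = np.random.default_rng(0)
# Simulated bimodal sample (illustrative only, not data from the chapter).
sample = np.concatenate([rng.normal(-2.0, 0.7, 200), rng.normal(2.0, 0.7, 200)])

grid = np.linspace(-6.0, 6.0, 601)
step = grid[1] - grid[0]

# Normal-reference rule-of-thumb bandwidth (an assumption made for this sketch).
h = 1.06 * sample.std(ddof=1) * sample.size ** (-1 / 5)

f_hist = histogram_density(sample, grid, binwidth=0.5)
f_kde = kernel_density(sample, grid, bandwidth=h)
```

Re-running the sketch with a different `binwidth` or `origin` changes the step function `f_hist`, which is the sensitivity noted above, while `f_kde` is smooth in `x` for any positive bandwidth. The rule-of-thumb bandwidth is calibrated to Gaussian data, so it may oversmooth a bimodal sample such as this one.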
