Abstract
We describe the R np package via a series of applications that may be of interest to applied econometricians. The np package implements a variety of nonparametric and semiparametric kernel-based estimators that are popular among econometricians. There are also procedures for nonparametric tests of significance and consistent model specification tests for parametric mean regression models and parametric quantile regression models, among others. The np package focuses on kernel methods appropriate for the mix of continuous, discrete, and categorical data often found in applied settings. Data-driven methods of bandwidth selection are emphasized throughout, though we caution the user that data-driven bandwidth selection methods can be computationally demanding.
Highlights
Devotees of R (R Development Core Team 2008) are likely to be aware of a number of nonparametric kernel smoothing methods that exist in R base and in certain R packages.
One approach towards handling the presence of both continuous and categorical data is called a ‘frequency’ approach, whereby data is broken up into subsets (‘cells’) corresponding to the values assumed by the categorical variables, and only then do you apply, say, density or locpoly to the continuous data remaining in each cell.
The np package offers users of R a variety of nonparametric and semiparametric kernel-based methods that are capable of handling the mix of categorical and continuous data typically encountered by applied researchers.
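The ‘frequency’ approach mentioned above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data frame `df`, its columns `g` and `x`, and the simulated values are hypothetical and do not come from the paper.

```r
# Illustrative sketch of the 'frequency' approach (hypothetical simulated data).
set.seed(1)
df <- data.frame(
  g = factor(sample(c("a", "b"), 100, replace = TRUE)),  # categorical variable
  x = rnorm(100)                                         # continuous variable
)

# Break the continuous data into cells defined by the categorical variable.
cells <- split(df$x, df$g)

# Apply base R's density() separately to the data remaining in each cell.
dens_by_cell <- lapply(cells, density)

# Each cell's estimate uses only that cell's observations.
cell_sizes <- sapply(cells, length)
print(cell_sizes)
```

Because each estimate uses only the observations in its own cell, the effective sample size shrinks as the number of cells grows.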
Summary
Devotees of R (R Development Core Team 2008) are likely to be aware of a number of nonparametric kernel smoothing methods that exist in R base (e.g., density) and in certain R packages (e.g., locpoly in the KernSmooth package; Wand and Ripley 2008). In applied settings, however, we often encounter a combination of categorical and continuous datatypes, and those familiar with traditional nonparametric kernel smoothing methods will appreciate that these methods presume the underlying data to be continuous. Recent theoretical developments offer practitioners a variety of kernel-based methods for categorical data only (i.e., unordered and ordered factors), or for a mix of continuous and categorical data. These methods have the potential to recapture the efficiency losses associated with nonparametric frequency approaches because they do not rely on sample splitting; rather, they smooth the categorical variables in an appropriate manner. See Li and Racine (2007a) and the references therein, as well as the references listed in the bibliography, for an in-depth treatment of these methods. The np package implements recently developed kernel methods that seamlessly handle the mix of continuous, unordered, and ordered factor datatypes often found in applied settings.
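To make the contrast with sample splitting concrete, the following base R sketch smooths an unordered categorical variable with an Aitchison-Aitken-type kernel. It is a sketch under stated assumptions, not the np package's implementation: the helper `mixed_density`, the simulated data, and the hand-picked values of `h` and `lambda` are all hypothetical, whereas the np package selects bandwidths by data-driven methods.

```r
# Hypothetical helper (not part of np): joint density estimate at (x0, z0) for
# a continuous x and an unordered factor z, using a product kernel.
mixed_density <- function(x0, z0, x, z, h, lambda) {
  n_cat <- nlevels(z)
  # Aitchison-Aitken-type weight: 1 - lambda when the category matches,
  # lambda / (c - 1) otherwise, with lambda in [0, (c - 1) / c].
  w_cat <- ifelse(z == z0, 1 - lambda, lambda / (n_cat - 1))
  # Gaussian kernel weight for the continuous component.
  w_con <- dnorm((x0 - x) / h) / h
  mean(w_con * w_cat)
}

set.seed(42)
z <- factor(sample(c("a", "b", "c"), 200, replace = TRUE))
x <- rnorm(200, mean = as.integer(z))

# lambda = 0 uses only observations in cell "a" (the frequency estimator).
f_freq <- mixed_density(1.5, "a", x, z, h = 0.4, lambda = 0)

# lambda > 0 also borrows information from the other cells.
f_smooth <- mixed_density(1.5, "a", x, z, h = 0.4, lambda = 0.1)
```

Setting `lambda = 0` reproduces the frequency estimator, so sample splitting appears here as a special case of smoothing the categorical variable.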