The Utility of Nonparametric Transformations for Imputation of Survey Data

Michael W. Robbins

doi:10.2478/jos-2014-0043

Open Access

https://doi.org/10.2478/jos-2014-0043

Journal: Journal of Official Statistics
Publication Date: Dec 1, 2014
Citations: 2
License type: CC BY-NC-ND 3.0

Affiliation: RAND Corporation

Abstract

Missing values present a prevalent problem in the analysis of establishment survey data. Multivariate imputation algorithms (which are used to fill in missing observations) tend to have the common limitation that imputations for continuous variables are sampled from Gaussian distributions. This limitation is addressed here through the use of robust marginal transformations. Specifically, kernel-density and empirical distribution-type transformations are discussed and are shown to have favorable properties when used for imputation of complex survey data. Although such techniques have wide applicability (i.e., they may be easily applied in conjunction with a wide array of imputation techniques), the proposed methodology is applied here with an algorithm for imputation in the USDA’s Agricultural Resource Management Survey. Data analysis and simulation results are used to illustrate the specific advantages of the robust methods when compared to the fully parametric techniques and to other relevant techniques such as predictive mean matching. To summarize, transformations based upon parametric densities are shown to distort several data characteristics in circumstances where the parametric model is ill fit; however, no circumstances are found in which the transformations based upon parametric models outperform the nonparametric transformations. As a result, the transformation based upon the empirical distribution (which is the most computationally efficient) is recommended over the other transformation procedures in practice.
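
The general idea of an empirical distribution-type marginal transformation can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names `to_normal_scores` and `from_normal_score` are hypothetical, the `rank / (n + 1)` plotting position is an assumed convention, and tied values are simply given consecutive ranks. Observed values are mapped to normal scores so that a Gaussian-based imputation model could be fit on the transformed scale, and a Gaussian draw is then mapped back to the observed scale through the empirical quantiles.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def to_normal_scores(values):
    """Map observed values to normal scores via the empirical distribution."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Assumed convention: rank / (n + 1) stays strictly inside (0, 1).
        scores[i] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def from_normal_score(z, observed):
    """Map a Gaussian draw back to the observed scale (empirical quantile)."""
    ordered = sorted(observed)
    n = len(ordered)
    p = _STD_NORMAL.cdf(z)
    # Invert rank / (n + 1): nearest order statistic, clamped to [1, n].
    k = min(max(round(p * (n + 1)), 1), n)
    return ordered[k - 1]


# Illustrative, made-up skewed data (not from the survey).
observed = [1.0, 2.0, 50.0, 3.0, 400.0]
scores = to_normal_scores(observed)
imputed = from_normal_score(0.8, observed)  # 0.8 stands in for a model draw
```

Under this sketch every back-transformed value is one of the observed values, so an imputation cannot fall outside the observed range; whether that is desirable depends on the application.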

Full Text