Abstract

Hot-deck imputation is an intuitively simple and popular method of accommodating incomplete data. Users of the method will often use the usual multiple imputation variance estimator which is not appropriate in this case. However, no variance expression has yet been derived for this easily implemented method applied to missing covariates in regression models. The simple hot-deck method is in fact asymptotically equivalent to the mean-score method for the estimation of a regression model parameter, so that hot-deck can be understood in the context of likelihood methods. Both of these methods accommodate data where missingness may depend on the observed variables but not on the unobserved value of the incomplete covariate, that is, missing at random (MAR). The asymptotic properties of hot-deck are derived here for the case where the fully observed variables are categorical, though the incomplete covariate(s) may be continuous. Simulation studies indicate that the two methods compare well in small samples and for small numbers of imputations. Current users of hot-deck may now conduct their analysis using mean-score, which is a weighted likelihood method and can thus be implemented by a single pass through the data using any standard package which accommodates weighted regression models. Valid inference is now straightforward using the variance expression provided here. The equivalence of mean-score and hot-deck is illustrated using three clinical data sets where an important covariate is missing for a large number of study subjects.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call