Abstract

Estimating the parameters of a regression model of interest is complicated by missing data on the variables in that model. Multiple imputation is commonly used to handle these missing data. Joint model multiple imputation and full-conditional specification multiple imputation are known to yield imputed data with the same asymptotic distribution when the conditional models of full-conditional specification are compatible with that joint model. We show that this asymptotic equivalence of imputation distributions does not imply that joint model multiple imputation and full-conditional specification multiple imputation will also yield asymptotically equally efficient inference about the parameters of the model of interest, nor that they will be equally robust to misspecification of the joint model. When the conditional models used by full-conditional specification multiple imputation are linear, logistic and multinomial regressions, these are compatible with a restricted general location joint model. We show that multiple imputation using the restricted general location joint model can be substantially more asymptotically efficient than full-conditional specification multiple imputation, but this typically requires very strong associations between variables. When associations are weaker, the efficiency gain is small. Moreover, when there is substantial missingness in the outcome variable, full-conditional specification multiple imputation is shown to be potentially much more robust to misspecification of the restricted general location model than joint model multiple imputation using that model.

Highlights

  • Estimating the parameters of a regression model of interest is often complicated in practice by missing data on the variables in that model

  • Our goals are (i) to demonstrate that, when the restricted general location (RGL) model is correctly specified, asymptotic equivalence of imputation distributions does not imply that the resulting estimators of b are asymptotically equally efficient; (ii) to investigate the magnitude of this difference in efficiency and how it depends on the strength of associations between outcome and covariates in the analysis model; and (iii) to demonstrate that, when the joint distribution of the covariates implied by the RGL model is misspecified, full-conditional specification (FCS) multiple imputation (MI) can be less biased than joint model MI

  • FCS and joint model MI yield imputed data with the same asymptotic distribution when the conditional models used by FCS MI are compatible with the joint model


Summary

Introduction

Estimating the parameters of a regression model of interest (the ‘analysis model’) is often complicated in practice by missing data on the variables in that model. Multiple imputation (MI) handles these missing data by replacing each missing value with an imputed value. The result is an imputed dataset, in which there are no missing data. This imputation is done multiple (say, M) times and the analysis model is fitted separately to each of the resulting M imputed datasets to produce M estimates of the parameters b of this model. These M estimates are averaged to give an overall estimate of b, known as the ‘Rubin’s Rules (point) estimate’.
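The pooling step described above can be sketched in a few lines of Python. This is a minimal illustration for a single scalar parameter, not code from the paper: the function name `rubin_pool` and the numeric inputs are hypothetical. The point estimate is the plain average of the M estimates, as stated in the text. The total variance uses the standard Rubin's Rules formula (average within-imputation variance plus (1 + 1/M) times the between-imputation variance), which the excerpt itself does not state and is included here as an assumption.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool M per-imputation results for one scalar parameter.

    estimates: the M point estimates, one per imputed dataset.
    variances: the M squared standard errors of those estimates.
    Returns (pooled point estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need M >= 2 estimates, each with a variance")
    point = mean(estimates)        # Rubin's Rules point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return point, total


# Hypothetical results from M = 5 imputed datasets (illustrative only).
b_hat = [0.9, 1.1, 1.0, 1.2, 0.8]
var_hat = [0.04, 0.05, 0.04, 0.05, 0.04]
point, total = rubin_pool(b_hat, var_hat)
```

The same pooling applies whether the imputations come from joint model MI or from FCS MI; the two approaches differ only in how the imputed values are generated.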
