Abstract

Multiple imputation has become one of the most popular approaches for handling missing data in statistical analyses. Part of this success is due to Rubin’s simple combination rules. These give frequentist-valid inferences when the imputation and analysis procedures are so-called congenial and the embedding model is correctly specified, but may not otherwise. Roughly speaking, congeniality concerns whether the imputation and analysis models make compatible assumptions about the data; they are uncongenial when their assumptions differ. In practice, imputation models and analysis procedures are often not congenial, so tests may not have the correct size and confidence interval coverage may deviate from the advertised level. We examine a number of recent proposals which combine bootstrapping with multiple imputation and determine which are valid under uncongeniality and model misspecification. Imputation followed by bootstrapping generally does not result in valid variance estimates under uncongeniality or misspecification, whereas certain bootstrap-followed-by-imputation methods do. We recommend a particular computationally efficient variant of bootstrapping followed by imputation.

Highlights

  • Multiple imputation (MI) has proven to be an extremely versatile and popular tool for handling missing data in statistical analyses

  • We investigate the properties of the different combinations of MI and bootstrap which have been recommended by these previous papers, giving particular emphasis to their validity under uncongeniality or model misspecification

  • We have reviewed a number of proposals for combining MI with bootstrapping, in particular with regard to their statistical validity when imputation and analysis procedures are uncongenial or misspecified

Summary

Introduction

Multiple imputation (MI) has proven to be an extremely versatile and popular tool for handling missing data in statistical analyses. Rubin’s variance estimator combines the average within-imputation variance with the between-imputation variance in estimates. This requires an estimator of the complete data variance, which for most estimators is available analytically. Previous work compared four variants of combining MI with bootstrapping; on the basis of theoretical and empirical investigation, the authors recommended three of the four variants for use. They did not explicitly seek to investigate performance under uncongeniality or model misspecification. We investigate the properties of the different combinations of MI and bootstrap which have been recommended by these previous papers, giving particular emphasis to their validity under uncongeniality or model misspecification.
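As a rough illustration of how Rubin's variance estimator combines the two variance components, the following minimal Python sketch pools M point estimates and their complete-data variance estimates. It uses the standard form of Rubin's rules, with total variance equal to the within-imputation variance plus (1 + 1/M) times the between-imputation variance. The function name `rubin_combine` and the example numbers are assumptions made for illustration and do not come from the paper.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Pool M imputation-specific results with Rubin's rules (illustrative sketch).

    estimates: point estimates, one per imputed dataset
    variances: complete-data variance estimates, one per imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)        # average of the M point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance of estimates (divisor M - 1)

    # Rubin's total variance: within + (1 + 1/M) * between
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical numbers, purely for illustration
est, total_var = rubin_combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(est, total_var)
```

The `variances` argument is where the analytic complete-data variance estimator mentioned above enters. The bootstrap-based approaches reviewed in this paper estimate variance by other means, so this sketch covers only the standard Rubin's rules calculation.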

Rubin’s rules
Congeniality
Imputation followed by bootstrapping
Bootstrap followed by MI
Boot MI von Hippel
Regression models under uncongeniality or misspecification
Reference-based imputation in clinical trials
Discussion
Declaration of conflicting interests