Abstract
Multiple imputation has become one of the most popular approaches for handling missing data in statistical analyses. Part of this success is due to Rubin’s simple combination rules. These give frequentist valid inferences when the imputation and analysis procedures are so-called congenial and the embedding model is correctly specified, but otherwise may not. Roughly speaking, congeniality means that the imputation and analysis models do not make different assumptions about the data. In practice, imputation models and analysis procedures are often not congenial, such that tests may not have the correct size and confidence interval coverage may deviate from the advertised level. We examine a number of recent proposals which combine bootstrapping with multiple imputation and determine which are valid under uncongeniality and model misspecification. Imputation followed by bootstrapping generally does not result in valid variance estimates under uncongeniality or misspecification, whereas certain methods that bootstrap first and then impute do. We recommend a particular computationally efficient variant of bootstrapping followed by imputation.
Highlights
Multiple imputation (MI) has proven to be an extremely versatile and popular tool for handling missing data in statistical analyses.
We investigate the properties of the different combinations of MI and bootstrap which have been recommended by previous papers, giving particular emphasis to their validity under uncongeniality or model misspecification.
We have reviewed a number of proposals for combining MI with bootstrapping, in particular with regard to their statistical validity when imputation and analysis procedures are uncongenial or misspecified.
Summary
Multiple imputation (MI) has proven to be an extremely versatile and popular tool for handling missing data in statistical analyses. Rubin’s variance estimator combines the average within-imputation variance with the between-imputation variance of the estimates. This requires an estimator of the complete data variance, which for most estimators is available analytically. On the basis of theoretical and empirical investigation, they recommended three of the four variants for use. They did not explicitly seek to investigate performance under uncongeniality or model misspecification. We investigate the properties of the different combinations of MI and bootstrap which have been recommended by these previous papers, giving particular emphasis to their validity under uncongeniality or model misspecification.
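To make the description of Rubin’s variance estimator concrete, the following is a minimal sketch of the standard combination rules for a scalar estimand. The function name `rubin_pool` and its argument names are illustrative assumptions, not taken from the paper; the inputs are the M point estimates and their M complete-data variance estimates, one pair per imputed dataset.

```python
from statistics import mean, variance


def rubin_pool(estimates, within_variances):
    """Pool M imputation-specific results with Rubin's rules (scalar case).

    estimates        -- point estimate from each imputed dataset
    within_variances -- complete-data variance estimate from each imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(within_variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)            # average of the M point estimates
    within = mean(within_variances)     # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (divisor M - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical numbers, for illustration only.
pooled, total = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The `within` term is where the analytic complete-data variance estimator enters. As the abstract notes, the resulting total variance is only guaranteed to give valid frequentist inference under congeniality and correct specification, which is the motivation for the bootstrap-based alternatives examined here.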