Abstract

Missing covariate data commonly occur in epidemiological and clinical research, and are often dealt with using multiple imputation. Imputation of partially observed covariates is complicated if the substantive model is non-linear (e.g. Cox proportional hazards model), or contains non-linear (e.g. squared) or interaction terms, and standard software implementations of multiple imputation may impute covariates from models that are incompatible with such substantive models. We show how imputation by fully conditional specification, a popular approach for performing multiple imputation, can be modified so that covariates are imputed from models which are compatible with the substantive model. We investigate through simulation the performance of this proposal, and compare it with existing approaches. Simulation results suggest our proposal gives consistent estimates for a range of common substantive models, including models which contain non-linear covariate effects or interactions, provided data are missing at random and the assumed imputation models are correctly specified and mutually compatible. Stata software implementing the approach is freely available.

Highlights

  • Missing data are a pervasive problem in both experimental and observational medical research, causing a loss of information and potentially biasing inferences

  • In accordance with the results of White and Royston,[16] fully conditional specification (FCS) resulted in somewhat biased estimates, with larger bias for the coefficient of the continuous covariate; confidence interval (CI) coverage for both β1 and β2 was approximately 95%

  • When the substantive model contains non-linearities or interactions, existing imputation approaches using the FCS algorithm may give biased estimates because the imputation models are incompatible with the substantive model

Introduction

Missing data are a pervasive problem in both experimental and observational medical research, causing a loss of information and potentially biasing inferences. A popular alternative to joint model multiple imputation (MI) is the fully conditional specification (FCS) approach.[4,5] FCS MI involves specifying a series of univariate models for the conditional distribution of each partially observed variable given the other variables. This permits a great deal of flexibility, since an appropriate regression model can be selected for each variable (e.g. linear regression for continuous variables, logistic regression for binary variables). FCS MI is appealing in settings in which a number of variables have missing data, some of which are continuous and some of which are discrete.
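
To make the cycling over univariate models concrete, the following is a minimal Python sketch of standard FCS with one continuous covariate (imputed by linear regression) and one binary covariate (imputed by logistic regression), each conditional on the other covariate and a fully observed outcome. It is an illustration under stated assumptions, not the software described in the abstract: the function names (fit_linear, fit_logistic, fcs_impute) are hypothetical, and for brevity the regression parameters are fixed at their estimates rather than drawn from their posterior, so the imputations are "improper". The outcome enters each imputation model linearly, which is the kind of default specification that may be incompatible with a non-linear substantive model.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit; returns coefficients and residual standard deviation."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma = np.sqrt(resid @ resid / max(len(y) - X.shape[1], 1))
    return beta, sigma

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson, with a tiny ridge term for stability."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None]) + 1e-6 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    return beta

def fcs_impute(y, x1, x2, n_cycles=10, seed=0):
    """Return one completed copy of (x1, x2); NaN marks a missing value.

    x1 is continuous, x2 is binary (0/1), y is fully observed.
    Simplification (assumption): parameters are not redrawn from a posterior.
    """
    rng = np.random.default_rng(seed)
    x1 = np.array(x1, dtype=float)  # copies, so the inputs are not modified
    x2 = np.array(x2, dtype=float)
    miss1, miss2 = np.isnan(x1), np.isnan(x2)

    # Starting values: random draws from the observed values of each variable.
    x1[miss1] = rng.choice(x1[~miss1], size=miss1.sum())
    x2[miss2] = rng.choice(x2[~miss2], size=miss2.sum())

    ones = np.ones(len(y))
    for _ in range(n_cycles):
        # Continuous x1 given x2 and y: linear regression plus normal noise.
        Z = np.column_stack([ones, x2, y])
        beta, sigma = fit_linear(Z[~miss1], x1[~miss1])
        x1[miss1] = Z[miss1] @ beta + sigma * rng.standard_normal(miss1.sum())

        # Binary x2 given x1 and y: logistic regression plus Bernoulli draw.
        Z = np.column_stack([ones, x1, y])
        beta = fit_logistic(Z[~miss2], x2[~miss2])
        p = 1.0 / (1.0 + np.exp(-(Z[miss2] @ beta)))
        x2[miss2] = rng.binomial(1, p)
    return x1, x2
```

Calling fcs_impute several times with different seeds would give multiple completed data sets, to each of which the substantive model is fitted before the estimates are pooled. Each model is fitted only to rows where its target variable was observed, using the current imputed values of the other covariate.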
