Abstract

BackgroundLongitudinal categorical variables are sometimes restricted in terms of how individuals transition between categories over time. For example, with a time-dependent measure of smoking categorised as never-smoker, ex-smoker, and current-smoker, current-smokers or ex-smokers cannot transition to a never-smoker at a subsequent wave. These longitudinal variables often contain missing values, however, there is little guidance on whether these restrictions need to be accommodated when using multiple imputation methods. Multiply imputing such missing values, ignoring the restrictions, could lead to implausible transitions.MethodsWe designed a simulation study based on the Longitudinal Study of Australian Children, where the target analysis was the association between (incomplete) maternal smoking and childhood obesity. We set varying proportions of data on maternal smoking to missing completely at random or missing at random. We compared the performance of fully conditional specification with multinomial and ordinal logistic imputation, and predictive mean matching, two-fold fully conditional specification, indicator based imputation under multivariate normal imputation with projected distance-based rounding, and continuous imputation under multivariate normal imputation with calibration, where each of these multiple imputation methods were applied, accounting for the restrictions using a semi-deterministic imputation procedure.ResultsOverall, we observed reduced bias when applying multiple imputation methods with restrictions, and fully conditional specification with predictive mean matching performed the best. Applying fully conditional specification and two-fold fully conditional specification for imputing nominal variables based on multinomial logistic regression had severe convergence issues. Both imputation methods under multivariate normal imputation produced biased estimates when restrictions were not accommodated, however, we observed substantial reductions in bias when restrictions were applied with continuous imputation under multivariate normal imputation with calibration.ConclusionIn a similar longitudinal setting we recommend the use of fully conditional specification with predictive mean matching, with restrictions applied during the imputation stage.

Highlights

  • Longitudinal categorical variables are sometimes restricted in terms of how individuals transition between categories over time

  • We report the findings of a case study from the Longitudinal Study of Australian Children (LSAC), and a simulation study based on the LSAC [13] where approximately 65% of data on maternal smoking were set to missing completely at random (MCAR) or missing at random (MAR)

  • As expected we observed minimal bias under complete case analysis (CCA) and available case analysis (ACA) when data were MCAR, with the relative bias not exceeding 3% (Figs. 3a and 4a). In both MCAR scenarios, the multiple imputation (MI) methods (FCS with predictive mean matching (PMM), indicator-PDBR and continuous-calibration) produced more biased estimates than CCA and ACA

Read more

Summary

Introduction

Longitudinal categorical variables are sometimes restricted in terms of how individuals transition between categories over time. With a time-dependent measure of smoking categorised as neversmoker, ex-smoker, and current-smoker, current-smokers or ex-smokers cannot transition to a never-smoker at a subsequent wave. These longitudinal variables often contain missing values, there is little guidance on whether these restrictions need to be accommodated when using multiple imputation methods. Imputing such missing values, ignoring the restrictions, could lead to implausible transitions. The incomplete dataset is replicated multiple times, with the missing values replaced by values drawn from an appropriate imputation model. Multivariate normal imputation (MVNI), and fully conditional specification (FCS), are widely available MI methods that have been used in longitudinal studies [4, 5] to impute missing values

Objectives
Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call