Abstract

Background: When an outcome variable is missing not at random (MNAR: probability of missingness depends on outcome values), estimates of the effect of an exposure on this outcome are often biased. We investigated the extent of this bias and examined whether the bias can be reduced through incorporating proxy outcomes obtained through linkage to administrative data as auxiliary variables in multiple imputation (MI).

Methods: Using data from the Avon Longitudinal Study of Parents and Children (ALSPAC) we estimated the association between breastfeeding and IQ (continuous outcome), incorporating linked attainment data (proxies for IQ) as auxiliary variables in MI models. Simulation studies explored the impact of varying the proportion of missing data (from 20 to 80%), the correlation between the outcome and its proxy (0.1–0.9), the strength of the missing data mechanism, and having a proxy variable that was incomplete.

Results: Incorporating a linked proxy for the missing outcome as an auxiliary variable reduced bias and increased efficiency in all scenarios, even when 80% of the outcome was missing. Using an incomplete proxy was similarly beneficial. High correlations (> 0.5) between the outcome and its proxy substantially reduced the missing information. Consistent with this, ALSPAC analysis showed inclusion of a proxy reduced bias and improved efficiency. Gains with additional proxies were modest.

Conclusions: In longitudinal studies with loss to follow-up, incorporating proxies for this study outcome obtained via linkage to external sources of data as auxiliary variables in MI models can give practically important bias reduction and efficiency gains when the study outcome is MNAR.

Highlights

  • When an outcome variable is missing not at random (MNAR: probability of missingness depends on outcome values), estimates of the effect of an exposure on this outcome are often biased

  • If data are missing at random (MAR), a complete records analysis will produce an unbiased estimate of the exposure-outcome relationship only if missingness is unrelated to the outcome variable and all observed variables associated with missingness are included in the analysis model

  • If the data are MNAR, a standard implementation of multiple imputation (MI) will give a biased estimate of the exposure-outcome relationship whereas a complete records analysis will generally produce an unbiased estimate as long as missingness is unrelated to the outcome [1, 2]


Summary

Introduction

When an outcome variable is missing not at random (MNAR: probability of missingness depends on outcome values), estimates of the effect of an exposure on this outcome are often biased. If data are missing at random (MAR), a complete records analysis will produce an unbiased estimate of the exposure-outcome relationship only if missingness is unrelated to the outcome variable and all observed variables associated with missingness are included in the analysis model. If a proxy for the missing variable is available through linkage to an administrative data source whose coverage amongst eligible individuals is greater than that of the study data, a set of plausible missingness mechanisms can be identified. These proxies can be used as auxiliary variables in multiple imputation (MI) and other models used to take account of missing data. We vary the degree of correlation between the original outcome and its proxy, the proportion of missing data, and the extent to which the outcome is MNAR.
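The kind of simulation described above can be illustrated with a minimal sketch. It is not the authors' code: the data-generating model (a normal exposure, a linear outcome, a proxy equal to the outcome plus noise), the logistic missingness model, the hand-written normal-linear imputation step, and all names and default values (`simulate`, `mnar_strength`, `rho`, and so on) are assumptions chosen for illustration. It compares a complete records analysis with MI that uses a fully observed proxy as an auxiliary variable.

```python
import numpy as np


def ols(X, y):
    """Ordinary least squares: coefficients, (X'X)^-1, residual SS, residual df."""
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    return coef, xtx_inv, resid @ resid, X.shape[0] - X.shape[1]


def simulate(n=20000, beta=0.5, rho=0.9, mnar_strength=2.0, m=20, seed=0):
    """One simulated dataset; all settings are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                      # exposure
    y = beta * x + rng.normal(size=n)           # outcome, true slope = beta

    # Proxy z = y + noise, with noise variance chosen so that corr(y, z) = rho.
    var_y = beta**2 + 1.0
    noise_sd = np.sqrt(var_y * (1 - rho**2) / rho**2)
    z = y + rng.normal(scale=noise_sd, size=n)

    # MNAR: the probability that y is missing increases with y itself.
    p_miss = 1.0 / (1.0 + np.exp(-mnar_strength * (y - y.mean())))
    miss = rng.random(n) < p_miss
    obs = ~miss

    X_analysis = np.column_stack([np.ones(n), x])    # analysis model: y ~ x
    X_impute = np.column_stack([np.ones(n), x, z])   # imputation model: y ~ x + z

    # Complete records analysis: fit the analysis model on observed rows only.
    cra_slope = ols(X_analysis[obs], y[obs])[0][1]

    # MI: fit the imputation model on observed rows, then draw parameters
    # from their approximate posterior before each imputation.
    b_hat, xtx_inv, rss, df = ols(X_impute[obs], y[obs])
    chol = np.linalg.cholesky(xtx_inv)
    slopes, within_vars = [], []
    for _ in range(m):
        sigma2 = rss / rng.chisquare(df)
        b_star = b_hat + np.sqrt(sigma2) * (chol @ rng.normal(size=3))
        y_completed = y.copy()
        y_completed[miss] = X_impute[miss] @ b_star + rng.normal(
            scale=np.sqrt(sigma2), size=miss.sum()
        )
        coef, inv_c, rss_c, df_c = ols(X_analysis, y_completed)
        slopes.append(coef[1])
        within_vars.append(rss_c / df_c * inv_c[1, 1])

    # Rubin's rules: pooled estimate and total variance.
    slopes = np.array(slopes)
    total_var = np.mean(within_vars) + (1 + 1 / m) * slopes.var(ddof=1)
    return {
        "prop_missing": miss.mean(),
        "cra_slope": cra_slope,
        "mi_slope": slopes.mean(),
        "mi_se": np.sqrt(total_var),
    }


if __name__ == "__main__":
    for rho in (0.1, 0.5, 0.9):
        print(rho, simulate(rho=rho))
```

Changing `rho` and `mnar_strength` in this sketch mimics varying the outcome-proxy correlation and the strength of the missingness mechanism. Because the imputation model is still fitted to the observed records only, MI with a proxy would be expected to reduce the bias under MNAR rather than remove it.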
