Accounting for missing data in statistical analyses: multiple imputation is not always the answer.

Rachael A Hughes, Jonathan A C Sterne, Jon Heron, Kate Tilling

doi:10.1093/ije/dyz032

Open Access

https://doi.org/10.1093/ije/dyz032

Journal: International Journal of Epidemiology
Publication Date: Mar 16, 2019
Citations: 476
License type: CC BY 4.0

Affiliation: University of Bristol

Abstract

Background: Missing data are unavoidable in epidemiological research, potentially leading to bias and loss of precision. Multiple imputation (MI) is widely advocated as an improvement over complete case analysis (CCA). However, contrary to widespread belief, CCA is preferable to MI in some situations.

Methods: We provide guidance on choice of analysis when data are incomplete. Using causal diagrams to depict missingness mechanisms, we describe when CCA will not be biased by missing data and compare MI and CCA, with respect to bias and efficiency, in a range of missing data situations. We illustrate selection of an appropriate method in practice.

Results: For most regression models, CCA gives unbiased results when the chance of being a complete case does not depend on the outcome after taking the covariates into consideration, which includes situations where data are missing not at random. Consequently, there are situations in which CCA analyses are unbiased while MI analyses, assuming missing at random (MAR), are biased. By contrast MI, unlike CCA, is valid for all MAR situations and has the potential to use information contained in the incomplete cases and auxiliary variables to reduce bias and/or improve precision. For this reason, MI was preferred over CCA in our real data example.

Conclusions: Choice of method for dealing with missing data is crucial for validity of conclusions, and should be based on careful consideration of the reasons for the missing data, missing data patterns and the availability of auxiliary information.
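The Results statement about CCA can be illustrated with a small simulation. The sketch below is not taken from the paper: the linear model, the true slope of 0.5, the logistic missingness models and all function names are assumptions chosen purely for illustration. It compares the complete-case slope estimate when the chance of being a complete case depends only on the covariate (here on the covariate's own value, so the covariate would be missing not at random) with the estimate when that chance depends on the outcome.

```python
import math
import random


def ols_slope(xs, ys):
    """Least-squares slope from a simple linear regression of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def logistic(z):
    """Inverse logit, used here as an assumed missingness model."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n=100_000, true_slope=0.5, seed=1):
    """Hypothetical illustration: compare CCA under two missingness mechanisms."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0, 1) for x in xs]

    # Scenario A (assumed): being a complete case depends on x only,
    # so it is independent of the outcome y given the covariate.
    complete_a = [rng.random() < logistic(0.5 - 1.5 * x) for x in xs]

    # Scenario B (assumed): being a complete case depends on the outcome y.
    complete_b = [rng.random() < logistic(0.5 - 2.0 * y) for y in ys]

    def cca(complete):
        # Complete case analysis: drop every incomplete record, then fit.
        kept = [(x, y) for x, y, c in zip(xs, ys, complete) if c]
        return ols_slope([x for x, _ in kept], [y for _, y in kept])

    return {
        "full_data": ols_slope(xs, ys),
        "cca_depends_on_x": cca(complete_a),
        "cca_depends_on_y": cca(complete_b),
    }


if __name__ == "__main__":
    for name, slope in simulate().items():
        print(f"{name}: {slope:.3f}")
```

Under these assumptions, the Scenario A estimate should sit close to the full-data slope, while the Scenario B estimate should be noticeably attenuated. The sketch does not implement MI, so it says nothing about the relative efficiency of the two approaches.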

Highlights

Failure to appropriately account for missing data in analyses may lead to bias and loss of precision (‘inefficiency’).[1]
Using causal diagrams to depict missingness mechanisms, we describe when complete case analysis (CCA) will not be biased by missing data and compare multiple imputation (MI) and CCA, with respect to bias and efficiency, in a range of missing data situations.
There are situations in which CCA analyses are unbiased while MI analyses, assuming missing at random (MAR), are biased.

Similar Papers

030 The effect of missing data on the relationship between lifecourse socio-economic position and verbal cognitive ability at older ages
R Landy ... R Hardy
Journal of Epidemiology and Community Health | VOL. 64
01 Sep 2010

How handling missing data may impact conclusions: A comparison of six different imputation methods for categorical questionnaire data.
Marianne Riksheim Stavseth ... Thomas Clausen
SAGE Open Medicine | VOL. 7
01 Jan 2019

What is missing from my missing data plan?
Sharon D Yeatts ... Renée H Martin
Stroke | VOL. 46
07 May 2015

Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study
Andrea Marshall ... Roger L Holder
BMC Medical Research Methodology | VOL. 10
19 Jan 2010
