The use of complete-case and multiple imputation-based analyses in molecular epidemiology studies that assess interaction effects

Manisha Desai, Denise A Esserman, Marilie D Gammon, Mary B Terry

doi:10.1186/1742-5573-8-5

Abstract

Background: In molecular epidemiology studies biospecimen data are collected, often with the purpose of evaluating the synergistic role between a biomarker and another feature on an outcome. Typically, biomarker data are collected on only a proportion of subjects eligible for study, leading to a missing data problem. Missing data methods, however, are not customarily incorporated into analyses. Instead, complete-case (CC) analyses are performed, which can result in biased and inefficient estimates.

Methods: Through simulations, we characterized the performance of CC methods when interaction effects are estimated. We also investigated whether standard multiple imputation (MI) could improve estimation over CC methods when the data are not missing at random (NMAR) and auxiliary information may or may not exist.

Results: CC analyses were shown to result in considerable bias and efficiency loss. While MI reduced bias and increased efficiency over CC methods under specific conditions, it too resulted in biased estimates depending on the strength of the auxiliary data available and the nature of the missingness. In particular, CC performed better than MI when extreme values of the covariate were more likely to be missing, while MI outperformed CC when missingness of the covariate related to both the covariate and outcome. MI always improved performance when strong auxiliary data were available. In a real study, MI estimates of interaction effects were attenuated relative to those from a CC approach.

Conclusions: Our findings suggest the importance of incorporating missing data methods into the analysis. If the data are MAR, standard MI is a reasonable method. Auxiliary variables may make this assumption more reasonable even if the data are NMAR. Under NMAR we emphasize caution when using standard MI and recommend it over CC only when strong auxiliary data are available. MI, with the missing data mechanism specified, is an alternative when the data are NMAR. In all cases, it is recommended to take advantage of MI's ability to account for the uncertainty of these assumptions.

Highlights

Recent advances in technology to measure biomarkers have given rise to a growing number of studies in molecular epidemiology.
We investigate the consequences of applying multiple imputation (MI), which in its standard form relies on the missing at random (MAR) assumption, and assess the extent to which auxiliary data can improve estimation performance when data on a covariate are not missing at random (NMAR) and interest lies in estimating an interaction effect involving that covariate.
When missingness is a function of the covariate only and not of the outcome, the performance of CC methods suffered mainly in terms of efficiency loss.
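
The contrast between the two missingness mechanisms discussed above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' simulation design: it assumes a binary biomarker `x`, a fully observed binary exposure `z`, a continuous outcome `y` from a linear model, and an interaction measured as a difference of differences of cell means. All function names, the probabilities of being observed, and the true interaction value of 1.0 are hypothetical choices made for illustration. It covers complete-case (CC) estimation only and does not implement MI.

```python
import random
from statistics import mean

TRUE_INTERACTION = 1.0  # assumed value, chosen for illustration only


def simulate(n, seed=0):
    """Draw (x, z, y): x is the biomarker, z a fully observed exposure."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        z = int(rng.random() < 0.5)
        y = 1.0 + 0.5 * x + 0.5 * z + TRUE_INTERACTION * x * z + rng.gauss(0, 1)
        rows.append((x, z, y))
    return rows


def interaction(rows):
    """Interaction as a difference of differences of the four cell means."""
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for x, z, y in rows:
        cells[(x, z)].append(y)
    m = {k: mean(v) for k, v in cells.items()}
    return (m[1, 1] - m[1, 0]) - (m[0, 1] - m[0, 0])


def complete_cases(rows, p_observed, seed=1):
    """Keep each row with probability p_observed(x, z, y)."""
    rng = random.Random(seed)
    return [r for r in rows if rng.random() < p_observed(*r)]


def p_obs_covariate_only(x, z, y):
    # Missingness of the biomarker depends on the biomarker value only.
    return 0.3 if x == 1 else 0.7


def p_obs_covariate_and_outcome(x, z, y):
    # Missingness depends on both the biomarker value and the outcome.
    if x == 1:
        return 0.2 if y > 2.0 else 0.8
    return 0.8


rows = simulate(200_000)
cc_x_rows = complete_cases(rows, p_obs_covariate_only)
cc_xy_rows = complete_cases(rows, p_obs_covariate_and_outcome)

full = interaction(rows)
cc_x = interaction(cc_x_rows)
cc_xy = interaction(cc_xy_rows)

print(f"full data:                 {full:.3f} (n={len(rows)})")
print(f"CC, covariate only:        {cc_x:.3f} (n={len(cc_x_rows)})")
print(f"CC, covariate and outcome: {cc_xy:.3f} (n={len(cc_xy_rows)})")
```

Under these assumed settings, the CC estimate should stay close to the true interaction when missingness depends on the covariate alone, although it uses far fewer observations, and should fall noticeably below the true value when missingness also depends on the outcome.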

Summary

Introduction

Recent advances in technology to measure biomarkers have given rise to a growing number of studies in molecular epidemiology. Many epidemiology studies collect data from biospecimens for the purpose of studying the role of biomarkers in disease. Often these [...] of missing data methods in epidemiology studies to their inaccessibility and complexity. Desai et al. recently assessed the handling of missing data in molecular epidemiology studies and found that while the majority of studies had missing data (65%) and/or excluded subjects with missing data from study entry (45%), 88% of these utilized a CC analysis [4]. In molecular epidemiology studies, biospecimen data are collected, often with the purpose of evaluating the synergistic role between a biomarker and another feature on an outcome. Complete-case (CC) analyses are performed, which can result in biased and inefficient estimates.

Journal: Epidemiologic Perspectives & Innovations	Publication Date: Oct 6, 2011
Citations: 37	License type: CC BY 2.0

Similar Papers

Is using multiple imputation better than complete case analysis for estimating a prevalence (risk) difference in randomized controlled trials when binary outcome observations are missing?
Mavuto Mukaka ... Linda Kalilani-Phiri
Trials | VOL. 17
22 Jul 2016

What is missing from my missing data plan?
Sharon D Yeatts ... Renée H Martin
Stroke | VOL. 46
07 May 2015

OP18 Using linked administrative data to reduce bias due to missing outcome data in exposure-outcome estimates: a study of the association between breastfeeding and IQ using simulations and data from a birth cohort
Rp Cornish ... K Tilling
Journal of Epidemiology and Community Health | VOL. 69
31 Aug 2015

Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study
Andrea Marshall ... Patrick Royston
BMC Medical Research Methodology | VOL. 10
19 Jan 2010
