Multiple imputation of multiple multi-item scales when a full imputation model is infeasible.

Catrin O Plumpton, Ian R White, Dyfrig A Hughes, Tim Morris

doi:10.1186/s13104-016-1853-5

Abstract

Background: Missing data in a large scale survey presents major challenges. We focus on performing multiple imputation by chained equations when data contain multiple incomplete multi-item scales. Recent authors have proposed imputing such data at the level of the individual item, but this can lead to infeasibly large imputation models.

Methods: We use data gathered from a large multinational survey, where analysis uses separate logistic regression models in each of nine country-specific data sets. In these data, applying multiple imputation by chained equations to the individual scale items is computationally infeasible. We propose an adaptation of multiple imputation by chained equations which imputes the individual scale items but reduces the number of variables in the imputation models by replacing most scale items with scale summary scores. We evaluate the feasibility of the proposed approach and compare it with a complete case analysis. We perform a simulation study to compare the proposed method with alternative approaches: we do this in a simplified setting to allow comparison with the full imputation model.

Results: For the case study, the proposed approach reduces the size of the prediction models from 134 predictors to a maximum of 72 and makes multiple imputation by chained equations computationally feasible. Distributions of imputed data are seen to be consistent with observed data. Results from the regression analysis with multiple imputation are similar to, but more precise than, results for complete case analysis; for the same regression models a 39% reduction in the standard error is observed. The simulation shows that our proposed method can perform comparably against the alternatives.

Conclusions: By substantially reducing imputation model sizes, our adaptation makes multiple imputation feasible for large scale survey data with multiple multi-item scales. For the data considered, analysis of the multiply imputed data shows greater power and efficiency than complete case analysis. The adaptation of multiple imputation makes better use of available data and can yield substantively different results from simpler techniques.

Electronic supplementary material: The online version of this article (doi:10.1186/s13104-016-1853-5) contains supplementary material, which is available to authorized users.
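
The abstract describes shrinking each imputation model by replacing most scale items with scale summary scores. The sketch below illustrates one plausible reading of that idea: when imputing an item, keep the other items of the same scale as predictors and represent every other scale by a single summary score. The abstract does not give the exact construction, so the function names, the `_score` naming, and the five-scale layout are all hypothetical, and the counts it prints are not the 134 and 72 reported for the case study.

```python
from typing import Dict, List


def predictors_full(target: str, scales: Dict[str, List[str]],
                    covariates: List[str]) -> List[str]:
    """Item-level model: every other item and every covariate predicts the target."""
    all_items = [item for items in scales.values() for item in items]
    return [v for v in all_items + covariates if v != target]


def predictors_reduced(target: str, scales: Dict[str, List[str]],
                       covariates: List[str]) -> List[str]:
    """Reduced model (assumed reading): items from the target's own scale,
    plus one summary score for each of the other scales, plus covariates."""
    predictors: List[str] = []
    for scale_name, items in scales.items():
        if target in items:
            # Keep item-level detail within the scale being imputed.
            predictors.extend(item for item in items if item != target)
        else:
            # Hypothetical name for that scale's summary score variable.
            predictors.append(f"{scale_name}_score")
    return predictors + [c for c in covariates if c != target]


# Hypothetical survey layout: 5 scales of 10 items each, plus 3 covariates.
scales = {
    f"scale{s}": [f"scale{s}_item{i}" for i in range(1, 11)]
    for s in range(1, 6)
}
covariates = ["age", "sex", "outcome"]
target = "scale1_item1"

full = predictors_full(target, scales, covariates)
reduced = predictors_reduced(target, scales, covariates)
print(f"full model predictors:    {len(full)}")
print(f"reduced model predictors: {len(reduced)}")
```

In this toy layout the saving grows with the number of scales, since each additional scale adds one predictor to the reduced model rather than one per item.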

Highlights

Missing data in a large scale survey presents major challenges
Missing data is ubiquitous in research, and survey data is prone to incomplete responses
Assumptions must be made about the mechanism of missingness; no analysis with missing data is free of such assumptions

Summary

Introduction

Missing data in a large scale survey presents major challenges. We focus on performing multiple imputation by chained equations when data contain multiple incomplete multi-item scales. Missing data is ubiquitous in research, and survey data is prone to incomplete responses. Data may be missing completely at random (MCAR), where the probability of missing data is not dependent on either the observed or unobserved data. When data is missing at random (MAR), the probability of the data being missing does not depend upon the unobserved data, but missingness may be related to the observed data. Data may be missing not at random (MNAR), whereby missingness is dependent upon the values of the unobserved data, conditional on the observed data [1, 2, 3]. It is acknowledged that a gap still exists between techniques recommended by methodological literature and those employed in practice; traditional ad-hoc techniques such as deletion and single imputation techniques are still applied routinely [3, 5, 6].
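
The three missingness mechanisms can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the paper: the variables `x` and `y`, the logistic missingness models, and their coefficients are all assumptions chosen to make the contrast visible.

```python
import numpy as np


def simulate_missingness(n: int = 10_000, seed: int = 0):
    """Generate a covariate x, a variable y, and three missingness masks for y."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                        # fully observed covariate
    y = 0.8 * x + rng.normal(scale=0.6, size=n)   # variable subject to missingness

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    masks = {
        # MCAR: missingness ignores both x and y.
        "MCAR": rng.random(n) < 0.3,
        # MAR: missingness depends only on the observed covariate x.
        "MAR": rng.random(n) < sigmoid(-1.0 + 1.5 * x),
        # MNAR: missingness depends on the value of y itself.
        "MNAR": rng.random(n) < sigmoid(-1.0 + 1.5 * y),
    }
    return x, y, masks


x, y, masks = simulate_missingness()
for name, mask in masks.items():
    print(
        f"{name}: proportion missing = {mask.mean():.2f}, "
        f"mean of observed y = {y[~mask].mean():+.3f}, "
        f"mean of all y = {y.mean():+.3f}"
    )
```

Under these assumed settings, the mean of the observed `y` values stays close to the full-data mean under MCAR but is shifted under both MAR and MNAR. The difference between the latter two is that under MAR the shift is explained by the observed `x`, whereas under MNAR it is driven by the unobserved values of `y`.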

Journal: BMC Research Notes
Publication Date: Jan 26, 2016
Citations: 70
License type: cc-by
