Multiple imputation for patient reported outcome measures in randomised controlled trials: advantages and disadvantages of imputing at the item, subscale or composite score level

Ines Rombach, Crispin Jenkinson, Oliver Rivero-Arias, Alastair M Gray, David W Murray

doi:10.1186/s12874-018-0542-6


Open Access

https://doi.org/10.1186/s12874-018-0542-6


Abstract

Background: Missing data can introduce bias in the results of randomised controlled trials (RCTs), but are typically unavoidable in pragmatic clinical research, especially when patient reported outcome measures (PROMs) are used. Multiple imputation (MI) has traditionally been applied to the composite PROMs score of multi-item instruments, but some recent research suggests that MI at the item level may be preferable under certain scenarios. This paper presents practical guidance on the choice of MI models for handling missing PROMs data based on the characteristics of the trial dataset. The comparative performance of complete cases analysis, which is commonly used in the analysis of RCTs, is also considered.

Methods: Realistic missing at random data were simulated using follow-up data from an RCT considering three different PROMs (Oxford Knee Score (OKS), EuroQoL 5 Dimensions 3 Levels (EQ-5D-3L), 12-item Short Form Survey (SF-12)). Data were multiply imputed at the item (using ordinal logit and predictive mean matching models), subscale and score level; unadjusted mean outcomes, as well as treatment effects from linear regression models, were obtained for 1000 simulations. Performance was assessed by root mean square errors (RMSE) and mean absolute errors (MAE).

Results: Convergence problems were observed for MI at the item level. Performance generally improved with increasing sample sizes and lower percentages of missing data. Imputation at the score and subscale level outperformed imputation at the item level in small sample sizes (n ≤ 200). Imputation at the item level is more accurate for high proportions of item-nonresponse. All methods provided similar results for large sample sizes (n ≥ 500) in this particular case study.

Conclusions: Many factors need consideration when choosing an imputation model for missing PROMs data, including the prevalence of missing data in the study, the sample size, the number of items within the PROM, the number of levels within the individual items, and the planned analyses.
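The two performance measures named in the Methods can be written out explicitly. The following is a minimal sketch, assuming that each simulation yields one estimate (for example a treatment effect) that is compared against a known reference value; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
import math

def rmse(estimates, true_value):
    """Root mean square error of the estimates around a reference value."""
    squared_errors = [(e - true_value) ** 2 for e in estimates]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def mae(estimates, true_value):
    """Mean absolute error of the estimates around a reference value."""
    absolute_errors = [abs(e - true_value) for e in estimates]
    return sum(absolute_errors) / len(absolute_errors)

# Hypothetical treatment-effect estimates from four simulated datasets
estimates = [1.0, 3.0, 2.0, 2.0]
true_effect = 2.0  # assumed reference value, e.g. from the complete data

print(f"RMSE: {rmse(estimates, true_effect):.3f}")
print(f"MAE:  {mae(estimates, true_effect):.3f}")
```

RMSE penalises large deviations more heavily than MAE, so reporting both gives some indication of whether a method's errors are driven by a few extreme simulations.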

Highlights

Missing data can introduce bias in the results of randomised controlled trials (RCTs), but are typically unavoidable in pragmatic clinical research, especially when patient reported outcome measures (PROMs) are used
Covariates used in all multiple imputation (MI) models included the baseline composite PROM score, as well as all variables used in the analysis model and those used in the simulation of missing at random (MAR) data
It is likely that the MAR mechanism will be related to more variables outside the analysis model, and MI may be preferable to complete cases analysis (CCA) due to its ability to account for complex MAR mechanisms
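
To illustrate why a MAR mechanism matters, the sketch below simulates follow-up scores whose probability of being missing depends only on an observed baseline score, and compares the unadjusted complete-case mean with the mean of the full data. The score range, noise level and missingness probabilities are illustrative assumptions, not the mechanism used in the paper.

```python
import random

def simulate_mar(n=2000, seed=1):
    """Simulate follow-up scores that are missing at random given baseline."""
    rng = random.Random(seed)
    full, observed = [], []
    for _ in range(n):
        baseline = rng.uniform(0, 48)            # hypothetical baseline score
        follow_up = baseline + rng.gauss(0, 3)   # follow-up tracks baseline
        # Assumed MAR mechanism: low baseline scores are missing more often
        p_missing = 0.6 if baseline < 24 else 0.1
        full.append(follow_up)
        if rng.random() >= p_missing:
            observed.append(follow_up)
    return full, observed

def mean(values):
    return sum(values) / len(values)

full, observed = simulate_mar()
full_mean = mean(full)
cca_mean = mean(observed)  # unadjusted complete-case mean

print(f"Observed {len(observed)} of {len(full)} follow-up scores")
print(f"Full-data mean:     {full_mean:.2f}")
print(f"Complete-case mean: {cca_mean:.2f}")
```

Because participants with low baseline scores drop out more often, the unadjusted complete-case mean overstates the true mean. An analysis or imputation model that includes the baseline score can account for this, which is the motivation for including such variables as covariates.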

Summary

Introduction

Missing data can introduce bias in the results of randomised controlled trials (RCTs), which can negatively affect both the clinical decisions derived from them and patient care. Patient reported outcome measures (PROMs), which are increasingly used in RCTs as primary or key secondary endpoints [1, 2], are susceptible to missing data, due to either unanswered or incomplete questionnaires [3, 4]. Missing data can affect the calculation of the composite score and/or subscales. Some scoring manuals allow for small amounts of missing items, while other scoring manuals do not facilitate the calculation of composite scores in the presence of any missing items.
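
The difference between these two kinds of scoring rule can be sketched as follows. This is a generic illustration, assuming a sum score in which missing items are replaced by the mean of the respondent's observed items; the function name, the `max_missing` threshold and the substitution rule are hypothetical and do not reproduce any specific scoring manual.

```python
def composite_score(items, max_missing=2):
    """Sum score with person-mean substitution for missing items.

    Returns None when more than `max_missing` items are missing,
    or when no item was answered at all.
    """
    observed = [x for x in items if x is not None]
    n_missing = len(items) - len(observed)
    if not observed or n_missing > max_missing:
        return None
    item_mean = sum(observed) / len(observed)
    # Each missing item contributes the mean of the observed items
    return sum(observed) + n_missing * item_mean

# Hypothetical three-item questionnaire with one unanswered item
responses = [2, None, 4]

print(composite_score(responses, max_missing=1))  # tolerant rule
print(composite_score(responses, max_missing=0))  # strict rule: no score
```

Under the strict rule a single missing item makes the whole composite score missing, which is one reason the level at which data are imputed (item, subscale or composite score) can matter.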

Methods

Results

Discussion

Conclusion