Abstract
In longitudinal randomised trials and observational studies within a medical context, a composite outcome—which is a function of several individual patient-specific outcomes—may be felt to best represent the outcome of interest. As in other contexts, missing data on patient outcome, due to patient drop-out or for other reasons, may pose a problem. Multiple imputation is a widely used method for handling missing data, but its use for composite outcomes has been seldom discussed. Whilst standard multiple imputation methodology can be used directly for the composite outcome, the distribution of a composite outcome may be of a complicated form and perhaps not amenable to statistical modelling. We compare direct multiple imputation of a composite outcome with separate imputation of the components of a composite outcome. We consider two imputation approaches. One approach involves modelling each component of a composite outcome using standard likelihood-based models. The other approach is to use linear increments methods. A linear increments approach can provide an appealing alternative as assumptions concerning both the missingness structure within the data and the imputation models are different from the standard likelihood-based approach. We compare both approaches using simulation studies and data from a randomised trial on early rheumatoid arthritis patients. Results suggest that both approaches are comparable and that for each, separate imputation offers some improvement on the direct imputation of a composite outcome.
Highlights
When study patients are followed longitudinally, many patient-specific outcomes may be collected over time
We examine models for the multiple imputation of missing composite outcomes in longitudinal studies, where the time points at which observations are made are fixed by design
Fewer assumptions are made in the linear increments (LI) multiple imputation process than in the maximum likelihood estimation (MLE) multiple imputation process
Summary
When study patients are followed longitudinally, many patient-specific outcomes may be collected over time. For clinical trials in rheumatoid arthritis, the American College of Rheumatology 20% composite outcome, denoted ACR20, combines information on several variables concerning disease severity into a binary indicator based on which and how many of these variables have demonstrated 20% reductions over time. Whilst it is simple to focus solely on a ‘complete case’ analysis, based only on data for patients who have completely observed data at one or more time points, multiple imputation is widely recognised as useful for guarding against biased inferences that arise from unrepresentative complete case data [10,14,17,18,20]. First introduced by Rubin [15] and described extensively in [12], multiple imputation generally involves assuming a structure for the relationship between the observed and the missing data, fitting this model to the ‘complete case’ responses, and using the fitted model to predict outcomes where values are missing. The model from which imputations are drawn is usually fully parametric and can be fitted using maximum likelihood (ML) methods.
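To make the idea of a composite outcome concrete, the following is a minimal sketch of how an ACR20-style indicator could be derived from its components once those components are complete, whether observed or separately imputed. It assumes the commonly stated ACR20 rule: at least a 20% improvement in both tender and swollen joint counts, together with at least a 20% improvement in three or more of five further measures. The component names, the helper functions and the convention that lower values mean less severe disease are illustrative assumptions, not taken from the study.

```python
import numpy as np

# Hypothetical component names; lower values are assumed to mean less severe disease.
JOINT_COUNTS = ("tender_joints", "swollen_joints")
OTHER_MEASURES = ("patient_global", "physician_global", "pain",
                  "disability", "acute_phase")


def improved_20(baseline, follow_up):
    """True where follow_up is at least 20% below a positive baseline."""
    baseline = np.asarray(baseline, dtype=float)
    follow_up = np.asarray(follow_up, dtype=float)
    return (baseline > 0) & (follow_up <= 0.8 * baseline)


def acr20(baseline, follow_up):
    """ACR20-style binary indicator from complete component data.

    Inputs are dicts mapping component name to an array over patients.
    Components must be fully observed or already imputed: a NaN would be
    counted as 'not improved' here rather than treated as missing.
    """
    joints_ok = np.all(
        [improved_20(baseline[k], follow_up[k]) for k in JOINT_COUNTS], axis=0)
    n_other = np.sum(
        [improved_20(baseline[k], follow_up[k]) for k in OTHER_MEASURES], axis=0)
    return joints_ok & (n_other >= 3)


# Two made-up patients with the same baseline values.
baseline = {
    "tender_joints": [10, 10], "swollen_joints": [8, 8],
    "patient_global": [50, 50], "physician_global": [50, 50],
    "pain": [60, 60], "disability": [2.0, 2.0], "acute_phase": [30, 30],
}
follow_up = {
    "tender_joints": [7, 7], "swollen_joints": [6, 7],
    "patient_global": [30, 30], "physician_global": [35, 35],
    "pain": [40, 40], "disability": [1.9, 1.9], "acute_phase": [29, 29],
}
response = acr20(baseline, follow_up)
print(response)
```

In this made-up example the first patient meets the rule (both joint counts and three of the five other measures improve by at least 20%), while the second does not because the swollen joint count improves by less than 20%. Under separate imputation, a function of this kind would be applied to each imputed data set after the components have been filled in; under direct imputation, the binary indicator itself would be the imputed quantity.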