Comparison of aggregate and individual participant data approaches to meta-analysis of randomised trials: An observational study.

Jayne F Tierney, Mahesh K B Parmar, Lesley A Stewart, Sarah Burdett, David J Fisher, Steven D Shapiro

doi:10.1371/journal.pmed.1003019

Abstract

Background: It remains unclear when standard systematic reviews and meta-analyses that rely on published aggregate data (AD) can provide robust clinical conclusions. We aimed to compare the results from a large cohort of systematic reviews and meta-analyses based on individual participant data (IPD) with meta-analyses of published AD, to establish when the latter are most likely to be reliable and when the IPD approach might be required.

Methods and findings: We used 18 cancer systematic reviews that included IPD meta-analyses: all of those completed and published by the Meta-analysis Group of the MRC Clinical Trials Unit from 1991 to 2010. We extracted or estimated hazard ratios (HRs) and standard errors (SEs) for survival from trial reports and compared these with IPD equivalents at both the trial and meta-analysis level. We also extracted or estimated the number of events. We used paired t tests to assess whether HRs and SEs from published AD differed on average from those from IPD. We assessed agreement, and whether this was associated with trial or meta-analysis characteristics, using the approach of Bland and Altman. The 18 systematic reviews comprised 238 unique trials or trial comparisons, including 37,082 participants. An HR and SE could be generated for 127 trials, representing 53% of the trials and approximately 79% of eligible participants. On average, trial HRs derived from published AD were slightly more in favour of the research interventions than those from IPD (ratio of AD HR to IPD HR = 0.95, p = 0.007), but the limits of agreement show that for individual trials, the HRs could deviate substantially. These limits narrowed with an increasing number of participants (p < 0.001) or a greater number (p < 0.001) or proportion (p < 0.001) of events in the AD.

On average, meta-analysis HRs from published AD tended to favour the research interventions slightly, whether based on fixed-effect (ratio of AD HR to IPD HR = 0.97, p = 0.088) or random-effects (ratio of AD HR to IPD HR = 0.96, p = 0.044) models, but the limits of agreement show that for individual meta-analyses, agreement was much more variable. These limits tended to narrow with an increasing number (p = 0.077) or proportion (p = 0.11) of events in the AD. However, even when the information size of the AD was large, individual meta-analysis HRs could still differ from their IPD equivalents by a relative 10% in favour of the research intervention to 5% in favour of control. We used the results to construct a decision tree for assessing whether an AD meta-analysis includes sufficient information, and when estimates of effects are most likely to be reliable. A lack of power at the meta-analysis level may have prevented us from identifying additional factors associated with the reliability of AD meta-analyses, and we cannot be sure that our results are generalisable to all outcomes and effect measures.

Conclusions: In this study we found that HRs from published AD were most likely to agree with those from IPD when the information size was large. Based on these findings, we provide guidance for determining systematically when standard AD meta-analysis will likely generate robust clinical conclusions, and when the IPD approach will add considerable value.
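The agreement analysis described above (a paired t test plus Bland and Altman limits of agreement) can be illustrated with a minimal sketch. It assumes the comparison is made on the log scale, so that the mean difference back-transforms to an AD-to-IPD HR ratio; the function name `bland_altman_log_hr` and the example values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev


def bland_altman_log_hr(hr_ad, hr_ipd):
    """Summarise agreement between paired AD and IPD hazard ratios.

    Works on log(HR_AD) - log(HR_IPD), then back-transforms so the
    results read as ratios of AD HR to IPD HR.
    """
    diffs = [math.log(a) - math.log(i) for a, i in zip(hr_ad, hr_ipd)]
    n = len(diffs)
    d_bar = mean(diffs)   # average log ratio
    sd = stdev(diffs)     # sample SD of the log ratios
    # Paired t statistic (n - 1 degrees of freedom); a p-value would need
    # the t distribution, which is left out of this sketch.
    t_stat = d_bar / (sd / math.sqrt(n))
    return {
        "mean_ratio": math.exp(d_bar),
        # Approximate 95% limits of agreement: mean +/- 1.96 SD.
        "lower_limit": math.exp(d_bar - 1.96 * sd),
        "upper_limit": math.exp(d_bar + 1.96 * sd),
        "t_stat": t_stat,
    }


# Hypothetical paired trial-level HRs, for illustration only.
example_ad = [0.90, 0.80, 1.10, 0.70]
example_ipd = [1.00, 0.80, 1.00, 0.80]
print(bland_altman_log_hr(example_ad, example_ipd))
```

A mean ratio below 1 would indicate that AD-based HRs favour the research intervention more than their IPD equivalents on average, while wide limits of agreement would indicate that individual pairs can still differ substantially.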

Highlights

It remains unclear when standard systematic reviews and meta-analyses of published aggregate data (AD) are reliable enough to form robust clinical conclusions, and when the ‘gold standard’ individual participant data (IPD) approach might be required.
In this study we found that hazard ratios (HRs) from published AD were most likely to agree with those from IPD when the information size was large.
We provide guidance for determining systematically when standard AD meta-analysis will likely generate robust clinical conclusions, and when the IPD approach will add considerable value.

Summary

Introduction

It remains unclear when standard systematic reviews and meta-analyses of published aggregate data (AD) are reliable enough to form robust clinical conclusions, and when the ‘gold standard’ individual participant data (IPD) approach might be required. There are additional considerations for AD meta-analyses evaluating the effects of interventions on time-to-event outcomes, which are frequently based on hazard ratios (HRs), either derived directly from trial publications, or estimated indirectly from published statistics or from data extracted from Kaplan–Meier (KM) curves [4,5,6]. Each of these methods requires additional and stronger assumptions, which, together with varying lengths of follow-up, could have repercussions for the reliability of the results. We aimed to compare the results from a large cohort of systematic reviews and meta-analyses based on IPD with meta-analyses of published AD, to establish when the latter are most likely to be reliable and when the IPD approach might be required.
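To make the indirect estimation of HRs from published statistics more concrete, the sketch below shows two widely used conversions: recovering the log HR and its SE from a reported HR with a 95% confidence interval, and from a reported logrank observed-minus-expected count (O - E) with its variance (V). The function names are hypothetical, and this is an illustration of the general approach rather than the exact procedure applied to any trial in this study.

```python
import math


def log_hr_from_ci(hr, lower, upper, z=1.96):
    """Log HR and SE from a reported HR and its confidence interval.

    Assumes the interval is symmetric on the log scale; z = 1.96
    corresponds to a 95% interval.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return log_hr, se


def log_hr_from_logrank(o_minus_e, variance):
    """Log HR and SE from logrank O - E and its variance V.

    Uses log HR = (O - E) / V and SE = 1 / sqrt(V).
    """
    return o_minus_e / variance, 1 / math.sqrt(variance)


# Hypothetical reported values, for illustration only.
print(log_hr_from_ci(0.80, 0.64, 1.00))
print(log_hr_from_logrank(-10.0, 50.0))
```

Either pair of values can then be carried into an inverse-variance meta-analysis. Estimates read from KM curves need further assumptions about censoring and follow-up, which this sketch does not attempt to cover.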

Objectives

Methods

Results

Conclusion