Empirical Consequences of Current Recommendations for the Design and Interpretation of Noninferiority Trials

Scott K Aberegg,Matthew H Samore,Andrew M Hersh

doi:10.1007/s11606-017-4161-4

Scott K Aberegg, Matthew H Samore + Show 1 more

Open Access

https://doi.org/10.1007/s11606-017-4161-4

Copy DOI

Journal: Journal of General Internal Medicine	Publication Date: Sep 5, 2017
Citations: 39	License type: open-access

Affiliation: University of Utah

Abstract

BackgroundNoninferiority trials are increasingly common, though they have less standardized designs and their interpretation is less familiar to clinicians than superiority trials.ObjectiveTo empirically evaluate a cohort of noninferiority trials to determine 1) their interpretation as recommended by CONSORT, 2) choice of alpha threshold and its sidedness, and 3) differences between methods of analysis such as intention-to-treat and per-protocol.DesignWe searched MEDLINE for parallel-group randomized controlled noninferiority trials published in the five highest-impact general medical journals between 2011 and 2016.Main MeasuresData abstracted included trial design parameters, results, and interpretation of results based on CONSORT recommendations.Key ResultsOne hundred sixty-three trials and 182 noninferiority comparisons were included in our analysis. Based on CONSORT-recommended interpretation, 79% of experimental therapies met criteria for noninferiority, 13% met criteria for superiority, 20% were declared inconclusive, and 2% met criteria for inferiority. However, for 12% of trials, the experimental therapy was statistically significantly worse than the active control, but CONSORT recommended an interpretation of inconclusive or noninferior. A two-sided alpha equivalent of greater than 0.05 was used in 34% of the trials, and in five of these trials, the use of a standard two-sided alpha of 0.05 led to changes in the interpretation of results that disfavored the experimental therapy. In four of the five comparisons where different methods of analysis (e.g., intention-to-treat and per-protocol) yielded different results, the intention-to-treat analysis was the more conservative. In 11% of trials, a secondary advantage of the new therapy was neither reported nor could it be inferred by reviewers.ConclusionsIn this cohort, the design and interpretation of noninferiority trials led to significant and systematic bias in favor of the experimental therapy. Clinicians should exercise caution when interpreting these trials. Future trials may be more reliable if design parameters are standardized.

Highlights

Noninferiority trials are used to compare a new therapy (NT) to an active control (AC) when the use of a placebo control is not ethically feasible
Figure 2 shows the results of our search; 160 included manuscripts reported the results of 163 distinct trials and 182 noninferiority comparisons reported for those trials
Almost one-third of the trials (32%) used a two-sided alpha equivalent greater than the conventional standard for superiority trials (0.05), with four trials using the equivalent of a two-sided alpha of 0.20

Summary

Introduction

Noninferiority trials are used to compare a new therapy (NT) to an active control (AC) when the use of a placebo control is not ethically feasible. OBJECTIVE: To empirically evaluate a cohort of noninferiority trials to determine 1) their interpretation as recommended by CONSORT, 2) choice of alpha threshold and its sidedness, and 3) differences between methods of analysis such as intention-to-treat and per-protocol. Based on CONSORT-recommended interpretation, 79% of experimental therapies met criteria for noninferiority, 13% met criteria for superiority, 20% were declared inconclusive, and 2% met criteria for inferiority. For 12% of trials, the experimental therapy was statistically significantly worse than the active control, but CONSORT recommended an interpretation of inconclusive or noninferior. CONCLUSIONS: In this cohort, the design and interpretation of noninferiority trials led to significant and systematic bias in favor of the experimental therapy. Future trials may be more reliable if design parameters are standardized

Objectives

Methods

Results

Discussion

Conclusion