7 questions to ask when evaluating a noninferiority trial


Applied Evidence


J Fam Pract. 2014 March;63(3):E4-E8


References

1. Le Henanff A, Giraudeau B, Baron G, et al. Quality of reporting of noninferiority and equivalence randomized trials. JAMA. 2006;295:1147-1151.

2. Piaggio G, Elbourne DR, Pocock SJ, et al; CONSORT Group. Reporting of noninferiority and equivalence randomized trials: extension of the CONSORT 2010 statement. JAMA. 2012;308:2594-2604.

3. Moher D, Schulz KF, Altman D; CONSORT Group (Consolidated Standards of Reporting Trials). The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. JAMA. 2001;285:1987-1991.

4. Connolly SJ, Ezekowitz MD, Yusuf S, et al; RE-LY Steering Committee and Investigators. Dabigatran versus warfarin in patients with atrial fibrillation. N Engl J Med. 2009;361:1139-1151.

5. Ezekowitz MD, Connolly S, Parekh A, et al. Rationale and design of RE-LY: randomized evaluation of long-term anticoagulant therapy, warfarin, compared with dabigatran. Am Heart J. 2009;157:805-810, 810.e1-2.

6. Hart RG, Benavente O, McBride R, et al. Antithrombotic therapy to prevent stroke in patients with atrial fibrillation: a meta-analysis. Ann Intern Med. 1999;131:492-501.

7. US Department of Health and Human Services. Guidance for industry non-inferiority clinical trials. US Food and Drug Administration Web site. March 2010. Available at: http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM202140.pdf. Accessed February 4, 2014.

4. Is the sample size appropriate and justified?

The sample size in a noninferiority trial should provide high power to reject the null hypothesis that the new treatment is worse than the standard by at least the noninferiority margin (expressed as a difference or a relative risk), under a clinically meaningful assumption about the true difference (or relative risk) between groups. A true difference of 0 (or a relative risk of 1) is typically assumed for the sample size calculation. In some cases, however, it may be clinically appropriate to assume that the new treatment is truly slightly better or slightly worse than the standard. This would indicate a need for a smaller or larger sample size, respectively, than that required under the usual assumption of no difference.
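The effect of the assumed true difference on sample size can be illustrated with a minimal sketch. It assumes a time-to-event outcome, 1:1 allocation, a one-sided significance level of 0.025, and a Schoenfeld-type approximation for the required number of events; the function name `required_events`, the margin of 1.30, and the power of 90% are hypothetical choices for illustration and are not taken from any trial discussed here.

```python
import math
from statistics import NormalDist


def required_events(margin_hr, true_hr=1.0, alpha_one_sided=0.025, power=0.90):
    """Approximate total events needed (1:1 allocation) to rule out margin_hr.

    Uses the approximation 4 * (z_alpha + z_beta)^2 / (ln(margin) - ln(true))^2.
    All default values are illustrative assumptions.
    """
    if not 0 < true_hr < margin_hr:
        raise ValueError("true_hr must be positive and smaller than margin_hr")
    z_alpha = NormalDist().inv_cdf(1 - alpha_one_sided)
    z_beta = NormalDist().inv_cdf(power)
    # Distance, on the log scale, between the assumed truth and the margin
    distance = math.log(margin_hr) - math.log(true_hr)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / distance ** 2)


if __name__ == "__main__":
    hypothetical_margin = 1.30
    # Slightly better, equal, and slightly worse than the standard treatment
    for assumed_hr in (0.95, 1.00, 1.05):
        events = required_events(hypothetical_margin, true_hr=assumed_hr)
        print(f"assumed true HR {assumed_hr:.2f}: about {events} events")
```

Under these assumptions, the required number of events falls when the new treatment is assumed to be slightly better than the standard and rises when it is assumed to be slightly worse, because the assumed truth sits farther from or closer to the margin.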

When the justification for the sample size in a noninferiority trial is not provided or the number of participants is based on an inappropriate approach (eg, using superiority trial calculations for a noninferiority trial), questions about the quality of the trial arise. The primary concern is whether the noninferiority margin was actually selected before the trial began, as it should have been. In addition, if the researchers used overly optimistic assumptions about the efficacy of the new treatment relative to the standard therapy, the trial may have been too small, and a failure to rule out the margin could be misleading. (As with superiority trials that fail to reject the null hypothesis, post hoc power calculations should be avoided.) After the study has ended, the resulting CIs should be used to evaluate whether the study was large enough to adequately assess the relative effectiveness of the treatments.
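Reading the CI against the margin, rather than computing post hoc power, can be sketched as follows. This is a minimal illustration that assumes a hazard ratio estimate with a known standard error on the log scale and a normal approximation; the function names and all numeric inputs are hypothetical and do not come from any trial discussed here.

```python
import math
from statistics import NormalDist


def hazard_ratio_ci(hr, se_log_hr, level=0.95):
    """Two-sided CI for a hazard ratio, using a normal approximation on the log scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    log_hr = math.log(hr)
    return math.exp(log_hr - z * se_log_hr), math.exp(log_hr + z * se_log_hr)


def rules_out_margin(hr, se_log_hr, margin_hr, level=0.95):
    """True when the upper CI limit lies below the noninferiority margin."""
    _, upper = hazard_ratio_ci(hr, se_log_hr, level)
    return upper < margin_hr


if __name__ == "__main__":
    hypothetical_margin = 1.30
    # Same point estimate, two hypothetical levels of precision
    for se in (0.10, 0.25):
        lower, upper = hazard_ratio_ci(0.90, se)
        verdict = rules_out_margin(0.90, se, hypothetical_margin)
        print(f"SE {se:.2f}: CI {lower:.2f} to {upper:.2f}, margin ruled out: {verdict}")
```

With the same point estimate, the narrower interval excludes the hypothetical margin and the wider one does not, which is the sense in which the width of the CI shows whether a study was large enough.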

The RE-LY trial calculated the sample size that was expected to provide 84% power to rule out the prespecified hazard ratio of 1.46, assuming a true event rate of 1.6% per year (presumably for both groups), a recruitment period of 2 years, and at least one year of follow-up. The sample size was subsequently increased from 15,000 to 18,000 to maintain power in case of a low event rate.^4,5

5. Is the noninferiority trial as similar as possible to the trial(s) comparing the standard treatment with placebo?

Characteristics of participants, setting, reference treatment, and outcomes used in a noninferiority trial should be as close as possible to those in the trial(s) comparing the standard treatment with placebo. This is known as the constancy assumption, and it is key to researchers’ ability to draw a conclusion about noninferiority.

The trials used to calculate the noninferiority margin and the RE-LY trial itself involved similar populations of patients with AF, and the outcome (stroke) was similar.

6. Is a per protocol analysis reported in the results?

In randomized controlled superiority trials, the participants should be analyzed in the groups to which they were originally allocated, regardless of whether they adhered to treatment during the entire follow-up period. Such intention-to-treat (ITT) analysis is important because it provides a more conservative estimate of treatment effect—taking into account that some people who are offered treatment will not accept it and others will discontinue treatment. An ITT analysis therefore tends to minimize treatment effects compared with a “per protocol” analysis, in which participants are analyzed according to the treatment they actually received and are often removed from the analysis if they discontinue or do not adhere to treatment.

In noninferiority trials, if patients in the intervention group cross over to the standard treatment group or those in the standard treatment group have poor adherence, an ITT analysis can increase the risk of wrongly claiming noninferiority.^7 Therefore, a per protocol analysis should be included and indeed may be preferable.

In RE-LY, ITT analyses were reported, and complete follow-up data were available for 99.9% of patients. However, the rates of treatment discontinuation at one year were about 15% for those on dabigatran and 10% for the warfarin group, and 21% and 17%, respectively, at 2 years.^4,5 If the new treatment were truly less efficacious than the standard treatment, these moderate discontinuation rates could lead to more similar rates of stroke in the 2 groups than would be expected with higher continuation rates, biasing the results toward a finding of noninferiority. Although the original publication of trial results did not include a per protocol analysis, the RE-LY authors later reported that a per protocol analysis yielded similar results to the ITT analysis.
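The way crossover can pull an ITT comparison toward noninferiority can be shown with simple arithmetic. The sketch below makes a deliberately simplified assumption: a fraction of patients assigned to the new treatment switch to the standard treatment and then experience the standard treatment's event rate, while the standard group is unaffected. The function names and the event rates are hypothetical and are not estimates from RE-LY or any other trial.

```python
def itt_event_rate(rate_assigned, rate_after_switch, fraction_switching):
    """Event rate seen in an ITT analysis when some patients switch treatments."""
    return (1 - fraction_switching) * rate_assigned + fraction_switching * rate_after_switch


def itt_rate_ratio(rate_new, rate_std, fraction_switching):
    """Ratio of ITT event rates (new vs standard) under the crossover assumption."""
    return itt_event_rate(rate_new, rate_std, fraction_switching) / rate_std


if __name__ == "__main__":
    # Hypothetical annual event rates: the new treatment is truly worse
    rate_new, rate_std = 0.020, 0.016
    for fraction in (0.0, 0.1, 0.2):
        ratio = itt_rate_ratio(rate_new, rate_std, fraction)
        print(f"{fraction:.0%} switching: ITT rate ratio {ratio:.3f}")
```

In this simplified model, the more patients switch, the closer the ITT rate ratio moves to 1, even though the true difference between the treatments has not changed.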

7. Are the overall design and execution of the trial high quality?

Because a poor-quality noninferiority trial can appear to demonstrate noninferiority, looking at such studies critically is crucial. Appropriate randomization, concealed allocation, masking, and careful attention to participant flow must all be assessed.^2,3
