Abstract

Background
Numerous randomized clinical trials (RCTs) have evaluated novel therapies for inflammatory bowel disease (IBD). However, these trials often rely solely on P values to determine treatment efficacy, which may not fully capture the robustness of the results. This study aimed to evaluate the fragility of these RCTs and to identify study characteristics that influence result stability.

Methods
A search for RCTs assessing the efficacy of novel IBD therapies, including biologics, small molecule inhibitors, fecal microbiota transplantation, and stem cell therapies, was conducted in PubMed and Web of Science from January 1, 2014, to November 10, 2024. Included RCTs were required to report at least one statistically significant result (P < 0.05) and to employ an equal randomization design. The fragility index (FI) and continuous fragility index (CFI) were computed for binary and continuous outcomes, respectively, as reported in previous studies [1, 2]. Differences across study characteristics were assessed with the Wilcoxon rank-sum test for dichotomous characteristics and the Kruskal-Wallis test for multi-categorical characteristics, and Spearman's rank correlation was used for continuous characteristics. Multiple linear regression was used to further explore the impact of study characteristics on FI and CFI.

Results
A total of 129 trials from 53 studies were analyzed. The median FI was 6, and the CFI was 14.8. FI varied significantly by treatment type, trial phase, outcome type, and reported P value. Total sample size (ρ = 0.734) and number of discontinuations (ρ = 0.479) showed strong and moderate positive correlations with FI, respectively, while publication year, impact factor, and percentage of events showed weak positive correlations. CFI was significantly influenced by outcome type, and moderate and strong correlations were observed with journal impact factor (ρ = 0.426) and sample size (ρ = 0.757), respectively. After adjusting for other characteristics, the association between treatment type (biologics or small molecule drugs vs. fecal microbiota transplantation or stem cell therapy) and the natural logarithm of FI remained statistically significant (B = 0.283). Primary or coprimary outcomes showed greater robustness than other outcomes in both the FI (B = 0.288) and CFI (B = 0.459) models. Sample size was positively associated with both the natural logarithm of FI (B = 0.001) and CFI (B = 0.001), whereas per-protocol analysis showed a negative association (B = -0.929).

Conclusion
RCTs evaluating novel IBD therapies show varying levels of robustness, influenced by multiple study characteristics. Fragility should be considered in future research when interpreting efficacy results.
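
For readers unfamiliar with the fragility index, the following is a minimal sketch of how it is commonly computed for a binary outcome in a two-arm trial: non-events are converted to events one at a time in the arm with fewer events until a two-sided Fisher exact test is no longer significant. This is an illustration under stated assumptions, not the authors' code. The function names, the use of Fisher's exact test, and the 0.05 threshold are assumptions, and the procedure in the cited references may differ in detail. The CFI for continuous outcomes is not covered.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are trial arms; columns are events and non-events.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first arm.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at least as unlikely as the observed one.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7)
    )


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Hypothetical helper: number of non-event-to-event conversions in the
    arm with fewer events needed to lose significance (0 if the result is
    not significant to begin with)."""
    p = fisher_exact_two_sided(events_1, n_1 - events_1,
                               events_2, n_2 - events_2)
    if p >= alpha:
        return 0

    # Modify the arm with fewer events; keep the other arm fixed.
    if events_1 <= events_2:
        low, n_low, high, n_high = events_1, n_1, events_2, n_2
    else:
        low, n_low, high, n_high = events_2, n_2, events_1, n_1

    flips = 0
    while low < n_low:
        low += 1
        flips += 1
        p = fisher_exact_two_sided(low, n_low - low, high, n_high - high)
        if p >= alpha:
            break
    return flips
```

For a hypothetical trial with 1 of 10 events in one arm and 9 of 10 in the other, `fragility_index(1, 10, 9, 10)` returns 3: the result stops being significant at the 0.05 level once three non-events in the first arm are counted as events. A small FI relative to the sample size or the number of discontinuations suggests a result that depends on very few patients.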