Prognostic models are becoming increasingly relevant in clinical trials as potential surrogate endpoints, and in patient management as clinical decision support tools. However, the impact of competing risks on model performance remains poorly investigated. We aimed to assess the performance of competing risk and noncompeting risk models in the context of kidney transplantation, where allograft failure and death with a functioning graft are two competing outcomes. We included 11,046 kidney transplant recipients enrolled in 10 countries. We developed prediction models for long-term kidney graft failure, both without accounting for the competing risk of death with a functioning graft (i.e., censoring these deaths) and accounting for it, using Cox, Fine-Gray, and cause-specific Cox regression models. To this end, we followed a detailed and transparent analytical framework for competing and noncompeting risk modelling, and assessed the models' development, stability, discrimination, calibration, overall fit, clinical utility, and generalizability in external validation cohorts and subpopulations. More than 15 metrics were used to assess model performance. Among 11,046 recipients in the derivation and validation cohorts, 1,497 (14%) lost their graft and 1,003 (9%) died with a functioning graft after a median follow-up after risk evaluation of 4.7 years (IQR 2.7-7.0). Kaplan-Meier and Aalen-Johansen methods gave similar estimates of the cumulative incidence of graft loss (17% versus 16% in the derivation cohort). Cox and competing risk models showed similar and stable risk estimates for predicting long-term graft failure (average mean absolute prediction error of 0.0140, 0.0138 and 0.0135 for Cox, Fine-Gray, and cause-specific Cox models, respectively). Discrimination and overall fit were comparable in the validation cohorts, with concordance indices ranging from 0.76 to 0.87.
Across various subpopulations and clinical scenarios, the models performed well and similarly, although in some high-risk groups (such as donors over 65 years old), the findings suggest a trend towards moderately improved calibration when using a competing risk approach. Competing and noncompeting risk models performed similarly in predicting long-term kidney graft failure.
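
To illustrate the contrast between the Kaplan-Meier and Aalen-Johansen estimates of cumulative incidence mentioned above, the following is a minimal sketch, not the study's analysis code. It assumes a hypothetical event coding (0 = censored, 1 = graft failure, 2 = death with a functioning graft), and the function names and the eight-patient toy data set are made up for illustration only. The Kaplan-Meier version treats deaths as censored, whereas the Aalen-Johansen version weights each graft-failure increment by the probability of still being free of any event.

```python
import numpy as np


def km_cumulative_incidence(time, event, cause=1):
    """1 - Kaplan-Meier for `cause`, treating all other event types as censored."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event)
    times = np.unique(time[event == cause])
    surv = 1.0
    out = []
    for t in times:
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (event == cause))
        surv *= 1.0 - d_cause / at_risk
        out.append(1.0 - surv)
    return times, np.array(out)


def aj_cumulative_incidence(time, event, cause=1):
    """Aalen-Johansen cumulative incidence for `cause` with competing events."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event)
    times = np.unique(time[event != 0])
    event_free = 1.0  # probability of no event of any type just before t
    cif = 0.0
    out = []
    for t in times:
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (event == cause))
        d_any = np.sum((time == t) & (event != 0))
        cif += event_free * d_cause / at_risk
        event_free *= 1.0 - d_any / at_risk
        out.append(cif)
    return times, np.array(out)


# Toy data (hypothetical, NOT from the study).
# Coding assumption: 0 = censored, 1 = graft failure, 2 = death with functioning graft.
toy_time = [1, 2, 3, 4, 5, 6, 7, 8]
toy_event = [1, 2, 0, 1, 2, 1, 0, 1]

t_km, ci_km = km_cumulative_incidence(toy_time, toy_event, cause=1)
t_aj, ci_aj = aj_cumulative_incidence(toy_time, toy_event, cause=1)

print("Kaplan-Meier (death censored):", ci_km[-1])
print("Aalen-Johansen:", ci_aj[-1])
```

In this toy example deaths are deliberately frequent, so the two estimators diverge; when the competing event is comparatively rare during follow-up, the two estimates stay close, which is in line with the 17% versus 16% reported for the derivation cohort.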