Abstract

Background: Accurate prediction of outcomes following a heart transplant is critical for explaining risks and benefits to patients and for decision-making when considering potential organ offers. Given the large number of potential variables to be considered, this task may be performed most efficiently using machine learning (ML).

Purpose: We trained and tested different ML algorithms to predict outcomes following a cardiac transplant using the United Network for Organ Sharing (UNOS) database.

Methods: We included 67,939 adult and pediatric patients enrolled in the UNOS database between January 1994 and December 2016 who underwent cardiac transplantation (median age 53 years [IQR 38–60], 72.7% male). The model inputs were 114 features collected from recipients and donors prior to transplant. The primary outcome was all-cause mortality at one year post-transplant. We evaluated three ML methods: XGBoost, Random Forest (RF) and L2-regularized logistic regression. Algorithms were trained and tested using shuffled 10-fold cross-validation (CV) as well as rolling CV. In the rolling CV, to mimic a prospective procedure, ML models were trained by incrementally adding patients according to transplant year and tested on the data from the following year. The hyperparameters, which control the learning process, were tuned using Bayesian optimization. Prognostic accuracy for one-year all-cause mortality was characterized using the area under the receiver-operating characteristic curve (AUC).

Results: In total, 8,394 patients died within one year of transplant. We observed a substantial difference in prognostic accuracy between the shuffled 10-fold CV and the rolling CV. In the 10-fold CV, XGBoost and RF achieved high predictive performance, with AUCs of 0.848 (95% CI: 0.842–0.854) and 0.891 (95% CI: 0.886–0.896), respectively. In the rolling CV, which is the more realistic setting, AUC dropped to 0.673 (95% CI: 0.661–0.684) for XGBoost and 0.670 (95% CI: 0.657–0.683) for RF. The predictive performance of L2-regularized logistic regression remained stable across the two CV procedures, with an AUC of 0.669 (95% CI: 0.662–0.676) in the shuffled 10-fold CV and 0.665 (95% CI: 0.649–0.680) in the rolling CV (Figure).

Conclusions: Our study suggests that ML models could be used to predict mortality in the first year post-transplant. We also show that the choice of CV procedure is crucial for evaluating ML models, particularly for data collected over a long period of time. The difference in predictive performance of the tree-based ML models between the shuffled and the rolling CV might indicate temporal dataset shift. In the rolling CV, all three methods achieved similar predictive performance.

Funding Acknowledgement: Type of funding sources: Public grant(s) – National budget only. Main funding source(s): Research Foundation Flanders (FWO)
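The rolling CV described in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function name `rolling_year_splits`, the `min_train_years` parameter, and the toy transplant years are all assumptions made for illustration. It shows only how an expanding training window keeps every test year strictly later than the data used for training, which is the property that separates rolling CV from shuffled 10-fold CV.

```python
from typing import Iterator, List, Sequence, Tuple


def rolling_year_splits(
    years: Sequence[int], min_train_years: int = 1
) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (test_year, train_idx, test_idx) for an expanding-window CV.

    Each fold trains on every record from years strictly before the test
    year and tests on the records of the test year itself.
    """
    unique_years = sorted(set(years))
    for pos in range(min_train_years, len(unique_years)):
        test_year = unique_years[pos]
        train_idx = [i for i, y in enumerate(years) if y < test_year]
        test_idx = [i for i, y in enumerate(years) if y == test_year]
        yield test_year, train_idx, test_idx


# Toy transplant years for eight hypothetical patients (not UNOS data).
transplant_years = [1994, 1994, 1995, 1995, 1996, 1996, 1996, 1997]
folds = list(rolling_year_splits(transplant_years))
for test_year, train_idx, test_idx in folds:
    # Print the test year with the training and test set sizes.
    print(test_year, len(train_idx), len(test_idx))
```

In a shuffled 10-fold split, by contrast, patients from later years can appear in the training folds while earlier patients are held out, so a model may be evaluated on data that precedes some of its training data.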
