We aimed to develop the first risk prediction model for 30-day mortality for the Australian and New Zealand patient populations and to examine whether machine learning (ML) algorithms outperform traditional statistical approaches. Data from the Australia New Zealand Congenital Outcomes Registry for Surgery, which contains information on every paediatric cardiac surgical encounter in Australia and New Zealand for patients aged <18 years between January 2013 and December 2021, were analysed (n = 14343). The outcome was mortality within the 30-day period following a surgical encounter. Approximately 30% of the observations were randomly selected and held out for validation of the final model. Three different ML methods were used, all of which employed five-fold cross-validation to guard against overfitting, with model performance judged primarily by the area under the receiver operating characteristic curve (AUC). Among the 14343 30-day periods, there were 188 deaths (1.3%). In the validation data, the gradient-boosted tree obtained the best performance [AUC = 0.87, 95% confidence interval = (0.82, 0.92); calibration = 0.97, 95% confidence interval = (0.72, 1.27)], outperforming penalized logistic regression and artificial neural networks (AUC of 0.82 and 0.81, respectively). The strongest predictors of mortality in the gradient-boosted tree were patient weight, STAT score, age and gender. Our risk prediction model outperformed logistic regression and achieved a level of discrimination comparable to the PRAiS2 and Society of Thoracic Surgeons Congenital Heart Surgery Database mortality risk models (both of which obtained AUC = 0.86). Non-linear ML methods can be used to construct accurate clinical risk prediction tools.
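As an illustration of the headline metric, the sketch below computes an AUC and a 95% confidence interval for a set of predicted risks on held-out data. It is a minimal sketch under stated assumptions, not the study's procedure: the abstract does not say how its confidence intervals were obtained, so the percentile bootstrap here is an assumption, the function names `auc` and `bootstrap_auc_ci` are hypothetical, and the labels and scores are synthetic values generated only for demonstration.

```python
import random


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) identity; tied scores get average ranks."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one event and one non-event")
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    auc(labels, scores)  # fail early if only one outcome class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # With rare outcomes a resample can contain no events; skip those.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic demonstration data: NOT registry data and NOT the study's results.
demo_rng = random.Random(42)
labels = [1 if demo_rng.random() < 0.05 else 0 for _ in range(2000)]
scores = [demo_rng.gauss(1.5 if y else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores, n_boot=200, seed=1)
print(f"AUC = {point:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Resamples that contain no events are skipped because the AUC is undefined for them. That case matters when the outcome is as rare as the 1.3% mortality reported here.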