Development and Validation of an Explainable Machine Learning Model for Major Complications After Cytoreductive Surgery

Hongbo Deng, Keith F Fournier, Benjamin D Powers, Charles A Staley, Fabian M Johnston, Yuman Fong, Daniel E Abbott, Callisia N Clarke, Sameer H Patel, Byrne Lee, Laura Lambert, Jula Veerapong, Cameron Carlin, Kara Vande Walle, Seán Dineen, Sherif Abdel‐Misih, Ryan J Hendrix, Zahra Eftekhari, Jordan M Cloyd, Travis E Grotz, Mustafa Raoof

doi:10.1001/jamanetworkopen.2022.12930

Abstract

Importance: Cytoreductive surgery (CRS) is one of the most complex operations in surgical oncology with significant morbidity, and improved risk prediction tools are critically needed. Machine learning models can potentially overcome the limitations of traditional multiple logistic regression (MLR) models and provide accurate risk estimates.

Objective: To develop and validate an explainable machine learning model for predicting major postoperative complications in patients undergoing CRS.

Design, Setting, and Participants: This prognostic study used patient data from tertiary care hospitals with expertise in CRS included in the US Hyperthermic Intraperitoneal Chemotherapy Collaborative Database between 1998 and 2018. Information from 147 variables was extracted to predict the risk of a major complication. An ensemble-based machine learning (gradient-boosting) model was optimized on 80% of the sample with subsequent validation on a 20% holdout data set. The machine learning model was compared with traditional MLR models. The artificial intelligence SHAP (Shapley additive explanations) method was used for interpretation of patient- and cohort-level risk estimates and interactions to define novel surgical risk phenotypes. Data were analyzed between November 2019 and August 2021.

Exposures: Cytoreductive surgery.

Main Outcomes and Measures: Area under the receiver operating characteristic curve (AUROC); area under the precision-recall curve (AUPRC).

Results: Data from a total of 2372 patients were included in model development (mean age, 55 years [range, 11-95 years]; 1366 [57.6%] women). The optimized machine learning model achieved high discrimination (AUROC: mean cross-validation, 0.75 [range, 0.73-0.81]; test, 0.74) and precision (AUPRC: mean cross-validation, 0.50 [range, 0.46-0.58]; test, 0.42). Compared with the optimized machine learning model, the published MLR model performed worse (test AUROC and AUPRC: 0.54 and 0.18, respectively). Higher volume of estimated blood loss, having pelvic peritonectomy, and longer operative time were the top 3 contributors to the high likelihood of major complications. SHAP dependence plots demonstrated insightful nonlinear interactive associations between predictors and major complications. For instance, high estimated blood loss (ie, above 500 mL) was only detrimental when operative time exceeded 9 hours. Unsupervised clustering of patients based on similarity of sources of risk allowed identification of 6 distinct surgical risk phenotypes.

Conclusions and Relevance: In this prognostic study using data from patients undergoing CRS, an optimized machine learning model demonstrated a superior ability to predict individual- and cohort-level risk of major complications vs traditional methods. Using the SHAP method, 6 distinct surgical phenotypes were identified based on sources of risk of major complications.
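
For readers less familiar with the two outcome measures reported in this abstract, the following is a minimal sketch of how AUROC and AUPRC can be computed from predicted risks and observed outcomes. It is not the authors' code: the function names `auroc` and `average_precision` and the eight toy patients are hypothetical, and AUPRC is approximated here by the step-wise average precision, which is one common estimator among several.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a randomly chosen patient with a
    complication is scored higher than a randomly chosen patient without
    one (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Step-wise estimate of AUPRC: the mean of the precision values
    observed at the rank of each true event. Assumes no tied scores."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    if hits == 0:
        raise ValueError("at least one event is required")
    return precision_sum / hits


# Hypothetical toy data (NOT from the study): 1 = major complication.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.90, 0.80, 0.70, 0.60, 0.40, 0.35, 0.20, 0.10]

print(round(auroc(y_true, scores), 3))              # 0.733
print(round(average_precision(y_true, scores), 3))  # 0.722
```

The two measures have different chance levels. An uninformative model has an AUROC of about 0.5, whereas its AUPRC is roughly the event rate in the sample, so AUPRC values are best interpreted against the proportion of patients who had a major complication.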

Full Text