Abstract

Risk prediction models for postoperative mortality after intra-abdominal surgery have typically been developed using preoperative variables. It is unclear whether intraoperative data add significant value to these models.

With IRB approval, we identified an institutional retrospective cohort of intra-abdominal surgery patients in the 2005 to 2015 American College of Surgeons National Surgical Quality Improvement Program. Intraoperative data were obtained from the electronic health record. The primary outcome was 30-day mortality. We evaluated the performance of machine learning algorithms in predicting 30-day mortality using 1) baseline variables and 2) baseline + intraoperative variables. The algorithms evaluated were 1) logistic regression with elastic net selection, 2) random forest (RF), 3) gradient boosting machine (GBM), 4) support vector machine (SVM), and 5) convolutional neural network (CNN). Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC). The sample was randomly divided into training and testing cohorts with 80%/20% probabilities. For each model, repeated 10-fold cross-validation in the training cohort identified the optimal hyperparameters, which were then used to train the model on the entire training cohort. Trained models were applied to the testing cohort to evaluate performance. Statistical significance was defined as P < .05.

The training and testing cohorts contained 4322 and 1079 patients, respectively, of whom 62 (1.4%) and 15 (1.4%) died within 30 days. When using only baseline variables to predict mortality, all algorithms except SVM (AUROC, 0.83 [95% confidence interval {CI}, 0.69-0.97]) had AUROC >0.9: GBM (AUROC, 0.96 [0.94-1.0]), RF (AUROC, 0.96 [0.92-1.0]), CNN (AUROC, 0.96 [0.92-0.99]), and logistic regression (AUROC, 0.95 [0.91-0.99]). With the addition of intraoperative variables, AUROC increased significantly for CNN (AUROC, 0.97 [0.96-0.99]; P = .047 versus baseline), but did not improve for GBM (AUROC, 0.97 [0.95-0.99]; P = .3 versus baseline), RF (AUROC, 0.96 [0.93-1.0]; P = .5 versus baseline), or logistic regression (AUROC, 0.94 [0.90-0.99]; P = .6 versus baseline).

Postoperative mortality in intra-abdominal surgery patients can be predicted with excellent discrimination using only preoperative variables across various machine learning algorithms. Adding intraoperative data to preoperative data also yielded models with excellent discrimination, but model performance did not improve.
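
As an illustration of the evaluation metric and the probabilistic 80%/20% split described in the abstract, the following is a minimal Python sketch and not the study's code. The function names `auroc` and `random_split`, the seed, and the toy labels and scores are all hypothetical. AUROC is computed here from its pairwise definition: the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half.

```python
import random


def auroc(labels, scores):
    """AUROC via its pairwise definition: the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def random_split(n, train_prob=0.8, seed=0):
    """Assign each of n record indices to the training set with probability
    train_prob, otherwise to the testing set (assumed reading of '80%/20%
    probabilities'; set sizes are therefore only approximately 80/20)."""
    rng = random.Random(seed)
    train, test = [], []
    for i in range(n):
        (train if rng.random() < train_prob else test).append(i)
    return train, test


# Hypothetical toy data, not taken from the study: 1 = died within 30 days.
labels = [0, 0, 1, 0, 1, 1, 0, 0]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.05, 0.50]
print("AUROC:", auroc(labels, scores))

train_idx, test_idx = random_split(len(labels))
print("train:", train_idx, "test:", test_idx)
```

The pairwise loop is quadratic in cohort size, which is acceptable for a sketch; a rank-based formulation gives the same value more efficiently on large cohorts.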
