[Construction of a predictive model for in-hospital mortality of sepsis patients in intensive care unit based on machine learning].

Cuiping Hao,Sujuan Tang,Chunying Hu,Qinghe Hu,Manchen Zhu,Yanchun Qian,Yinyan He

doi:10.3760/cma.j.cn121430-20221219-01104

Abstract

To analyze, using machine learning, the risk factors for in-hospital death in patients with sepsis in the intensive care unit (ICU), to construct a predictive model, and to evaluate its predictive value. The clinical data of patients with sepsis who were hospitalized in the ICU of the Affiliated Hospital of Jining Medical University from April 2015 to April 2021 were retrospectively analyzed, including demographic information, vital signs, complications, laboratory examination indicators, diagnosis, treatment, etc. Patients were divided into a death group and a survival group according to whether in-hospital death occurred. Seventy percent of the cases in the dataset were randomly selected as the training set for building the models, and the remaining 30% were used as the validation set. Prediction models for in-hospital mortality of sepsis patients were constructed with seven machine learning algorithms: logistic regression (LR), K-nearest neighbor (KNN), support vector machine (SVM), decision tree (DT), random forest (RF), extreme gradient boosting (XGBoost) and artificial neural network (ANN). The receiver operating characteristic curve (ROC curve), calibration curve and decision curve analysis (DCA) were used to evaluate the seven models in terms of discrimination, calibration and clinical utility, respectively. In addition, the machine learning models were compared with the sequential organ failure assessment (SOFA) and acute physiology and chronic health evaluation II (APACHE II) scores. A total of 741 patients with sepsis were included, of whom 390 were discharged after improvement and 351 died in hospital, giving an in-hospital mortality of 47.4%.
There were statistically significant differences between the death group and the survival group in gender, age, APACHE II score, SOFA score, Glasgow Coma Scale (GCS) score, heart rate, oxygenation index (PaO2/FiO2), proportion of patients on mechanical ventilation, duration of mechanical ventilation, proportion of patients given norepinephrine (NE), maximum NE, lactic acid (Lac), activated partial thromboplastin time (APTT), albumin (ALB), serum creatinine (SCr), blood urea nitrogen (BUN), blood uric acid (BUA), pH value, base excess (BE), and K+. ROC curve analysis showed that the areas under the curve (AUC) of the RF, XGBoost, LR, ANN, DT, SVM and KNN models, the SOFA score, and the APACHE II score for predicting in-hospital mortality of sepsis patients were 0.871, 0.846, 0.751, 0.747, 0.677, 0.657, 0.555, 0.749 and 0.760, respectively. Among all the models, the RF model had the highest precision (0.750), accuracy (0.785), recall (0.773), and F1 score (0.761), and the best discrimination. The calibration curve showed that the RF model was the best calibrated of the seven machine learning models. The DCA showed that the RF model had a greater net benefit and threshold probability than the other models, indicating that it was the best model, with good clinical utility. Machine learning models can be used as a reliable tool for predicting in-hospital mortality in sepsis patients. The RF model has the best predictive performance, which can help clinicians identify high-risk patients and implement early intervention to reduce mortality.
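
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`auc_pairwise`, `f1_score`, `net_benefit`) and the four-patient toy data are hypothetical, and the net benefit formula is the standard one used in decision curve analysis, which the abstract does not spell out. Only the figures 351, 741, 0.750 and 0.773 come from the abstract.

```python
def auc_pairwise(y_true, y_prob):
    """AUC as the share of (death, survivor) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold (DCA)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


# Figures reported in the abstract.
print(round(351 / 741, 3))               # in-hospital mortality
print(round(f1_score(0.750, 0.773), 3))  # F1 of the RF model

# Hypothetical toy data: 1 = died in hospital, 0 = survived.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.1]
print(auc_pairwise(y_true, y_prob))
print(net_benefit(y_true, y_prob, 0.2))
```

The first two printed values reproduce the 47.4% mortality and the F1 score of 0.761 reported for the RF model. A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing each model with the "treat all" and "treat none" strategies.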

Full Text