Sepsis is a significant global health burden. Timely identification and treatment can greatly improve patient outcomes, including survival. However, sepsis screening often depends on laboratory results that take time to obtain. We aimed to develop a quick sepsis screening tool (qSepsis) based on patients' non-laboratory clinical data in the emergency department (ED) using machine learning (ML), and to compare its performance with established clinical scores.
This retrospective study included patients admitted to the ED of Zhongnan Hospital of Wuhan University (Wuhan, China) from January 1, 2015, to May 31, 2022. Patients who were under 18 years of age, had cardiopulmonary arrest upon arrival at the ED, or had missing or abnormal medical record data were excluded. qSepsis was derived using three ML algorithms: logistic regression (LR), random forest (RF), and extreme gradient boosting (XGB). To benchmark it against existing clinical tools for assessing sepsis risk in the ED, qSepsis was compared with the Systemic Inflammatory Response Syndrome (SIRS) criteria, the Quick Sepsis-Related Organ Failure Assessment (qSOFA), and the Modified Early Warning Score (MEWS). External validation was performed with the Medical Information Mart for Intensive Care IV ED database (United States) using the same inclusion and exclusion criteria. The predictive power of qSepsis and the other clinical scores was measured using the area under the receiver operating characteristic curve (AUROC). The primary outcome was a diagnosis of sepsis in the ED based on the Sepsis 3.0 criteria, which served as the basis for developing the qSepsis tool.
A total of 414,864 patients were included in the internal cohort (median (IQR) patient age, 43 (29, 60) years; 202,730 (48.87%) females, 212,134 (51.13%) males), and 200,089 in the external testing cohort (median (IQR) patient age, 57 (39, 71) years; 107,427 (53.69%) females, 92,663 (46.31%) males).
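For readers unfamiliar with the metric, the AUROC used to compare qSepsis with the clinical scores can be read as the probability that a randomly chosen septic patient receives a higher score than a randomly chosen non-septic patient, with ties counted as one half. The sketch below is a minimal, standard-library illustration of that rank-based definition. It is not the authors' analysis code; the function name `auroc` and the toy labels and scores are hypothetical.

```python
def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U), with ties counted as 0.5.

    labels: iterable of 0/1 outcomes (1 = sepsis); scores: risk scores.
    Illustrative helper only; names and inputs are assumptions.
    """
    labels = list(labels)
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the block of tied scores occupying sorted positions i..j-1.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1..j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # Convert the positive-class rank sum into the U statistic, then normalise.
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


# Toy data (hypothetical): a continuous model output and an integer score with ties.
toy_labels = [0, 0, 1, 1]
print(auroc(toy_labels, [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(auroc(toy_labels, [0, 1, 1, 2]))           # 0.875
```

Tie handling matters for integer-valued scores such as qSOFA, where many patients share the same value.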
In internal testing, LR outperformed RF and XGB with an AUROC of 0.862 (95% CI, 0.855-0.869). In external testing, the AUROC decreased to 0.766 (95% CI, 0.758-0.774) for LR, 0.725 (95% CI, 0.717-0.733) for RF, and 0.735 (95% CI, 0.728-0.742) for XGB. The AUROCs for the qSOFA, MEWS, and SIRS scores in the external validation cohort were 0.579 (95% CI, 0.563-0.596), 0.600 (95% CI, 0.578-0.622), and 0.704 (95% CI, 0.683-0.725), respectively. The area under the precision-recall curve (AUPRC) for the qSepsis model was 0.213 (95% CI, 0.204-0.222). The AUPRC values for the other scores were 0.071 (95% CI, 0.013-0.099) for SIRS, 0.096 (95% CI, 0.003-0.186) for qSOFA, and 0.083 (95% CI, 0.063-0.111) for MEWS.
This retrospective study demonstrated that qSepsis outperformed existing assessment scores in terms of both AUROC and AUPRC. It has the potential to be used in pre-hospital settings with limited access to laboratory tests and in the ED for quick screening of patients with sepsis. However, because of its low positive predictive value (PPV), false alarms may increase in clinical practice.
This study was funded by the Transformation of Scientific and Technological Achievements Fund Project of Zhongnan Hospital of Wuhan University.
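The caveat about low PPV can be illustrated with Bayes' rule: for a fixed sensitivity and specificity, PPV falls sharply as the condition becomes rarer in the screened population. The sketch below is a worked example under stated assumptions. The sensitivity, specificity, and prevalence values are hypothetical and are not taken from the study, and the function name is illustrative.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = TP / (TP + FP), expressed as population fractions via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive predictions under these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point (NOT from the study): 80% sensitivity and specificity.
for prevalence in (0.20, 0.05, 0.02):
    ppv = positive_predictive_value(0.80, 0.80, prevalence)
    print(f"prevalence={prevalence:.2f} -> PPV={ppv:.3f}")
```

Under these assumed inputs, most alerts at a 2% prevalence would be false alarms even though the same classifier has a PPV of 0.5 at 20% prevalence.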