This study aimed to develop preliminary machine learning (ML) models for predicting the risk of device-associated infection and 30-day survival outcomes following invasive device procedures in intensive care unit (ICU) patients. Data from 8574 ICU patients who underwent invasive procedures were obtained from the Medical Information Mart for Intensive Care (MIMIC)-IV version 2.2 database. Patients were allocated to training and validation datasets in a 7:3 ratio. Seven ML models were used to predict device-associated infection, and five models were used to predict 30-day survival outcomes. Performance was evaluated primarily with the receiver operating characteristic (ROC) curve for the infection models and the concordance index (C-index) for the survival models. For the top-performing models, the number of variables was then progressively reduced according to variable importance to improve practical utility. With all variables included, the extreme gradient boosting (XGBoost) and extra survival trees (EST) models showed the best discriminatory performance. Notably, when restricted to the top 10 variables, both models performed comparably to their full-variable versions. In the validation cohort, the XGBoost model with the top 10 variables achieved an area under the curve (AUC) of 0.810 (95% CI 0.808–0.812), an area under the precision-recall curve (AUPRC) of 0.226 (95% CI 0.222–0.230), and a Brier score (BS) of 0.053 (95% CI 0.053–0.054). The EST model with the top 10 variables achieved a C-index of 0.756 (95% CI 0.754–0.757), a time-dependent AUC of 0.759 (95% CI 0.763–0.775), and an integrated Brier score (IBS) of 0.087 (95% CI 0.087–0.087). Both models are accessible via a web application. On internal validation, the XGBoost and EST models showed strong discriminative performance for device-associated infection risk and 30-day survival outcomes after invasive procedures in ICU patients. Further validation in future studies is required to confirm the clinical utility of these two models.
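To make the reported metrics concrete, the following is a minimal sketch of how an AUC, a Brier score, and a C-index can be computed from predictions. It is not the study's code: the function names and the toy data are assumptions made for illustration, the C-index shown is a simplified Harrell-style estimate that skips pairs with tied event times, and the AUPRC, time-dependent AUC, and IBS are not covered.

```python
from itertools import combinations


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def harrell_c_index(times, events, risk):
    """Simplified Harrell's C: among comparable pairs, the share in which
    the subject with the shorter observed time has the higher risk score."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Skip tied times and pairs where the earlier subject is censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0
        elif risk[a] == risk[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from MIMIC-IV.
infection = [0, 0, 1, 0, 1, 0]                 # 1 = device-associated infection
infection_prob = [0.05, 0.10, 0.25, 0.30, 0.80, 0.20]

followup_days = [5, 12, 20, 30, 30]            # observed follow-up time
died = [1, 0, 1, 1, 0]                         # 1 = death observed, 0 = censored
risk_score = [0.9, 0.4, 0.2, 0.3, 0.6]         # higher = higher predicted risk

auc = roc_auc(infection, infection_prob)
bs = brier_score(infection, infection_prob)
c_index = harrell_c_index(followup_days, died, risk_score)
```

An AUC or C-index of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, while a lower Brier score indicates better-calibrated, more accurate probabilities. Note that the Brier score depends on outcome prevalence, so a small value for a rare outcome does not by itself indicate a well-performing model.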