Abstract
Background: Appropriate empirical treatment for candidemia is associated with reduced mortality; however, the timely diagnosis of candidemia in patients with sepsis remains poor.
Objective: We aimed to use machine learning algorithms to develop and validate a candidemia prediction model for patients with cancer.
Methods: We conducted a single-center retrospective study using the cancer registry of a tertiary academic hospital. Adult patients diagnosed with malignancies between January 2010 and December 2018 were included. Our study outcome was the prediction of candidemia events. A stratified undersampling method was used to extract control data for algorithm learning. Multiple models were developed from a combination of 4 variable groups and 5 algorithms (auto-machine learning, deep neural network, gradient boosting, logistic regression, and random forest). The model with the largest area under the receiver operating characteristic curve (AUROC) was selected as the Candida detection (CanDETEC) model after comparing its performance indexes with those of the Candida Score Model.
Results: From a total of 273,380 blood cultures from 186,404 registered patients with cancer, we extracted 501 records of candidemia events and 2000 records as control data. Performance among the different models varied (AUROC 0.771-0.889), with all models demonstrating superior performance to that of the Candida Score (AUROC 0.677). The random forest model performed the best (AUROC 0.889, 95% CI 0.888-0.889); therefore, it was selected as the CanDETEC model.
Conclusions: The CanDETEC model predicted candidemia in patients with cancer with high discriminative power. This algorithm could be used for the timely diagnosis and appropriate empirical treatment of candidemia.
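The selection rule described in the Methods (keep the candidate model with the largest AUROC) can be illustrated with a minimal sketch. This is not the authors' code: the function names `auroc` and `select_best`, the model names, and the toy labels and scores are all assumptions made for illustration, and the AUROC is computed with the standard pairwise (Mann-Whitney) formulation rather than whatever software the study used.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case outscores
    a random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_best(labels: Sequence[int],
                model_scores: Dict[str, Sequence[float]]) -> str:
    """Return the name of the model with the largest AUROC."""
    aucs = {name: auroc(labels, s) for name, s in model_scores.items()}
    return max(aucs, key=aucs.get)


# Hypothetical toy data: 1 = candidemia event, 0 = control.
# These values are invented and are not taken from the study.
labels = [1, 1, 0, 0, 0]
model_scores = {
    "random_forest": [0.9, 0.8, 0.3, 0.2, 0.1],
    "logistic_regression": [0.9, 0.2, 0.3, 0.4, 0.1],
}

if __name__ == "__main__":
    for name, s in model_scores.items():
        print(name, round(auroc(labels, s), 3))
    print("selected:", select_best(labels, model_scores))
```

The pairwise loop is quadratic in the number of records, which is acceptable for a sketch; a rank-based computation would be preferable at the scale of the 501 events and 2000 controls reported in the Results.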
Highlights
Candidemia is a representative nosocomial bloodstream infection that contributes to the mortality of immunocompromised patients; it has been shown to occur in 3% of patients in intensive care and 20% of immunosuppressed patients [1]
The CanDETEC model predicted candidemia in patients with cancer with high discriminative power. This algorithm could be used for the timely diagnosis and appropriate empirical treatment of candidemia
Upon review of clinical-domain literature on candidemia risk factors, we identified 210 variables [2,13,14,25,26,27] that have been widely used in the development of machine learning algorithms in other clinical fields
Summary
Candidemia is a representative nosocomial bloodstream infection that contributes to the mortality of immunocompromised patients; it has been shown to occur in 3% of patients in intensive care and 20% of immunosuppressed patients [1]. Owing to compromised immunity from chemotherapy or the malignancy itself, patients with cancer have been reported to be the hosts most vulnerable to candidemia [2,3,4]. Significant mortality has been reported over several decades. In studies from the 1980s, the mortality rate in patients with cancer found to have candidemia exceeded 50% [5,6,7]. High mortality rates, ranging from 30% to 51%, have also been reported in studies after 2010 [3,4,8]. The mortality rate of candidemia was significantly higher than that of bacteremia [3]. Appropriate empirical treatment for candidemia is associated with reduced mortality; however, the timely diagnosis of candidemia in patients with sepsis remains poor.