Abstract

Fungemia is a life-threatening infection, but few predictive models of in-patient mortality exist for it. In this study, we developed models predicting all-cause in-hospital mortality among 265 fungemic patients in the Medical Information Mart for Intensive Care (MIMIC-III) database using both structured and unstructured data. Structured data models included multivariable logistic regression, extreme gradient boosting, and stacked ensemble models. Unstructured data models were developed using Amazon Comprehend Medical and BioWordVec embeddings in logistic regression, convolutional neural networks (CNNs), and recurrent neural networks (RNNs). We evaluated models trained on all notes, on notes from only the first three days of hospitalization, and on physician notes only. The best-performing structured data model was a multivariable logistic regression model that achieved an accuracy of 0.74 and an AUC of 0.76. Liver disease, acute renal failure, and intubation were among the top features driving prediction in multiple models. CNNs using unstructured data achieved similar performance even when trained with notes from only the first three days of hospitalization. The best-performing unstructured data models used the Amazon Comprehend Medical document classifier and CNNs, achieving accuracies ranging from 0.99 to 1.00 and AUCs of 1.00. Therefore, unstructured data, particularly notes composed by physicians, offer added predictive value over models based on structured data alone.
