Purpose: Length of stay (LoS) is one of the most widely used metrics of resource use in intensive care units (ICUs). We propose a structured data-driven methodology to predict ICU length of stay and the risk of prolonged stay, and apply it to a large multicentre Brazilian ICU database.

Methods: Demographic data, comorbidities, complications, laboratory data, and primary and secondary diagnoses were prospectively collected and retrospectively analysed using a data-driven methodology that includes eight different machine learning models and a stacking model. The study setting included 109 mixed-type ICUs from 38 Brazilian hospitals, and external validation was performed in 93 medical-surgical ICUs from 55 hospitals in Brazil.

Results: A cohort of 99,492 adult ICU admissions from the 1st of January to the 31st of December 2019 was included. The stacking model combining Random Forests and Linear Regression gave the best results for predicting ICU length of stay (RMSE = 3.82; MAE = 2.52; R² = 0.36). The prediction model for the risk of long stay accurately identified prolonged-stay patients early (Brier Score = 0.04, AUC = 0.87, PPV = 0.83, NPV = 0.95).

Conclusion: The data-driven methodology to predict ICU length of stay and the risk of long stay proved accurate in a large multicentre cohort of general ICU patients. The proposed models can help predict individual length of stay and identify early those patients at high risk of prolonged stay.
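For readers unfamiliar with the reported metrics, the following is a minimal sketch of how RMSE, MAE, R², Brier score, AUC, PPV and NPV are conventionally computed. It is not the authors' code: the function names are hypothetical, the 0.5 classification threshold is an assumption (the abstract does not state the cut-off used), and the standard textbook definitions are assumed to match those used in the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for observed vs. predicted lengths of stay."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


def brier_score(labels, probs):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def auc(labels, probs):
    """Probability that a random positive case is scored above a random
    negative case, counting ties as one half."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def ppv_npv(labels, probs, threshold=0.5):
    """Positive and negative predictive values at an assumed threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted_long = p >= threshold
        if predicted_long and y == 1:
            tp += 1
        elif predicted_long and y == 0:
            fp += 1
        elif not predicted_long and y == 0:
            tn += 1
        else:
            fn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv
```

The regression metrics apply to the predicted length of stay, while the Brier score, AUC, PPV and NPV apply to the predicted risk of prolonged stay, where each label is 1 for a prolonged stay and 0 otherwise.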