Predicting the COVID-19 mortality among Iranian patients using tree-based models: A cross-sectional study.

Amirhossein Aghakhani, Mir Saeed Yekaninejad, Zahra Karimi, Hojjat Zeraati, Fardis Vosoughi, Jaleh Shoshtarian Malak

doi:10.1002/hsr2.1279


Open Access



Abstract

This study explored the use of different machine learning models to predict COVID-19 mortality in hospitalized patients. A total of 44,112 patients from six academic hospitals who were admitted for COVID-19 between March 2020 and August 2021 were included. Variables were obtained from their electronic medical records. Random forest-recursive feature elimination was used to select key features. Decision tree, random forest, LightGBM, and XGBoost models were developed. Sensitivity, specificity, accuracy, F1 score, and area under the receiver operating characteristic curve (ROC-AUC) were used to compare the prediction performance of the models. Random forest-recursive feature elimination selected the following features for inclusion in the prediction model: age, sex, hypertension, malignancy, pneumonia, cardiac problem, cough, dyspnea, and respiratory system disease. XGBoost and LightGBM showed the best performance, with ROC-AUCs of 0.83 [0.822-0.842] and 0.83 [0.816-0.837], respectively, and a sensitivity of 0.77. XGBoost, LightGBM, and random forest showed relatively high predictive performance for mortality in COVID-19 patients and can be applied in hospital settings; however, future research is needed to externally validate these models.
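The evaluation metrics named in the abstract can be illustrated with a minimal, standard-library sketch. This is not the authors' code: the labels and predicted probabilities below are hypothetical toy values, the 0.5 decision threshold is an assumption (the abstract does not state one), and the function names `classification_metrics` and `roc_auc` are introduced here purely for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy, and F1 at a fixed threshold.

    The threshold of 0.5 is an assumption for this sketch.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on deaths
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on survivors
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(y_true)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died, 0 = survived; scores are model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

metrics = classification_metrics(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(metrics)
print(auc)
```

The pairwise formulation of ROC-AUC is quadratic in the number of cases, which is fine for illustration but slow for a cohort of tens of thousands of patients; a rank-based implementation from an established library would be the practical choice there.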

Full Text