Coronavirus disease 2019 (COVID-19) has become the most widespread and lethal disease of the last 3 years. Early prediction could improve decision-making, healthcare outcomes, and the effective use of healthcare resources during peaks. This study set out to predict the mortality risk of COVID-19 patients by investigating 14 machine learning (ML) models trained on extensive clinical, laboratory, and image-based features. The importance of each feature in each model, and the influence of the features on the models' mortality predictions, were also evaluated. Data from 252 patients, collected during the 5th peak of the COVID-19 pandemic (July 2021-September 2021) and comprising 42 features, were used to train the ML models. The fourteen ML models were built using fivefold cross-validation, and each model was trained on a training-validation dataset with its own optimized parameters. Model performance was evaluated using accuracy, precision, sensitivity, specificity, AUC, and F1 score. When training was performed with all 42 features, the highest values of accuracy (87.30%), precision (100%), sensitivity (77.27%), specificity (100%), AUC (91.90%), and F1 score (77.99%) were observed for the linear discriminant analysis (LDA), K-Nearest Neighbors (KNN), Gaussian Naive Bayes (GNB), KNN, Passive Aggressive Classifier (PAC), and LDA models, respectively. After feature selection, the support vector classifier (SVC) model with 10 features achieved the highest AUC, 93.40%. Mechanical ventilation, consolidation, fatigue, malignancy, dry cough, level of consciousness (LOC), gender, diarrhea, O2 therapy, and SpO2 are potential predictors of mortality in COVID-19 patients.
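As an illustration of the evaluation metrics named above, the following minimal Python sketch computes accuracy, precision, sensitivity, specificity, and F1 score from confusion-matrix counts, and AUC from predicted scores via pairwise ranking. It is not the study's code: the names `ConfusionCounts`, `classification_metrics`, and `pairwise_auc` are hypothetical, the counts and scores in the example are made up for illustration, and the convention of returning 0.0 when a denominator is zero is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical container; 'positive' is assumed to mean the patient died."""
    tp: int  # deaths predicted as deaths
    fp: int  # survivors predicted as deaths
    tn: int  # survivors predicted as survivors
    fn: int  # deaths predicted as survivors


def _ratio(numerator: float, denominator: float) -> float:
    # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Standard definitions of the threshold-based metrics listed in the abstract."""
    precision = _ratio(c.tp, c.tp + c.fp)
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def pairwise_auc(pos_scores: list, neg_scores: list) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is quadratic in the number of
    samples, which is acceptable for a cohort of a few hundred patients.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return _ratio(wins, len(pos_scores) * len(neg_scores))


# Illustrative (made-up) inputs, not data from the study.
example_metrics = classification_metrics(ConfusionCounts(tp=8, fp=2, tn=6, fn=4))
example_auc = pairwise_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

In a fivefold cross-validation setting, one plausible approach would be to compute these metrics on each held-out fold and average them; the abstract does not state how the per-fold results were aggregated.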