Background: Identification of distinct clinical phenotypes of diseases can guide personalized treatment. This study aimed to classify subgroups of patients hospitalized with COVID-19 pneumonia using an unsupervised machine learning approach.

Methods: We included patients hospitalized with COVID-19 pneumonia from July to September 2021. K-means clustering, an unsupervised machine learning method, was performed to identify clinical phenotypes based on clinical and laboratory variables collected within 24 hours of admission. Variables were normalized before clustering to ensure equal contribution to the analysis. The optimal number of clusters was determined using the elbow method and silhouette scores. Cox proportional hazards models were used to compare the risk of intubation and 90-day mortality across the identified clusters.

Results: Three clinically distinct clusters were identified among 538 patients hospitalized with COVID-19 pneumonia. Cluster 1 (N = 27) consisted predominantly of males and showed significantly elevated serum liver enzymes and LDH levels. Cluster 2 (N = 370) was characterized by lower chest X-ray scores and higher serum albumin levels. Cluster 3 (N = 141) was characterized by older age, diabetes mellitus, higher chest X-ray scores, more severe vital signs, higher creatinine levels, lower hemoglobin levels, lower lymphocyte counts, higher C-reactive protein, higher D-dimer, and higher LDH levels. Compared with cluster 2, cluster 3 was significantly associated with an increased risk of 90-day mortality (HR, 6.24; 95% CI, 2.42–16.09) and intubation (HR, 5.26; 95% CI, 2.37–11.72). In contrast, cluster 1 had a 100% survival rate and a non-significant increase in intubation risk compared with cluster 2 (HR, 1.40; 95% CI, 0.18–11.04).

Conclusions: We identified three distinct clinical phenotypes among patients with COVID-19 pneumonia, with cluster 3 associated with an increased risk of respiratory failure and mortality. These findings may guide tailored clinical management strategies.
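The clustering workflow described in the Methods (normalize the variables, run k-means for several candidate cluster counts, then compare within-cluster sum of squares and silhouette scores) can be illustrated with a minimal sketch. This is not the authors' code or software: the abstract does not name an implementation, so the version below is written from scratch with the Python standard library. The two features (labelled here as age and LDH), their values, and the helper names `zscore`, `kmeans`, and `silhouette` are hypothetical and synthetic, chosen only to show why variables on different scales are standardized first.

```python
import math
import random
from statistics import mean, pstdev


def zscore(rows):
    """Standardize each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def kmeans(rows, k, n_init=30, max_iter=100, seed=0):
    """Lloyd's algorithm with random restarts; returns (labels, inertia)."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_init):
        centers = rng.sample(rows, k)  # initialize from k distinct data points
        labels = None
        for _ in range(max_iter):
            new_labels = [
                min(range(k), key=lambda j: math.dist(r, centers[j])) for r in rows
            ]
            if new_labels == labels:  # assignments stable: converged
                break
            labels = new_labels
            for j in range(k):
                members = [r for r, lab in zip(rows, labels) if lab == j]
                if members:  # keep the old center if a cluster empties out
                    centers[j] = [mean(c) for c in zip(*members)]
        # Inertia: within-cluster sum of squared distances (the elbow criterion).
        inertia = sum(math.dist(r, centers[lab]) ** 2 for r, lab in zip(rows, labels))
        if best is None or inertia < best[1]:
            best = (labels, inertia)
    return best


def silhouette(rows, labels):
    """Mean silhouette coefficient over all points (requires >= 2 clusters)."""
    clusters = {}
    for r, lab in zip(rows, labels):
        clusters.setdefault(lab, []).append(r)
    scores = []
    for r, lab in zip(rows, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(r, o) for o in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            mean(math.dist(r, o) for o in other)
            for m, other in clusters.items()
            if m != lab
        )
        scores.append((b - a) / (max(a, b) or 1.0))
    return mean(scores)


# Synthetic, hypothetical data: three groups in (age, LDH) with very different scales.
rng = random.Random(42)
groups = [(50, 200), (70, 200), (50, 600)]
raw = [
    [rng.gauss(age, 2), rng.gauss(ldh, 40)]
    for age, ldh in groups
    for _ in range(30)
]

X = zscore(raw)  # without this step, LDH would dominate the Euclidean distance

inertias, scores = {}, {}
for k in range(2, 7):
    labels, inertia = kmeans(X, k)
    inertias[k] = inertia
    scores[k] = silhouette(X, labels)
    print(f"k={k}  inertia={inertia:.2f}  silhouette={scores[k]:.3f}")

best_k = max(scores, key=scores.get)
print("k with the highest silhouette score:", best_k)
```

In practice the inertia values would be plotted against k to look for the "elbow", and the silhouette scores used as a second check, as the Methods describe. A library implementation such as scikit-learn's `KMeans` would normally be preferred over a hand-written one for real data.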