Abstract

There is significant heterogeneity in disease progression among hospitalized COVID-19 patients. The pathogenesis of SARS-CoV-2 infection is attributed to a complex interplay between the virus and the host immune response that, in some patients, unpredictably and rapidly leads to "hyperinflammation" associated with an increased risk of mortality. Early identification of patients at risk of progression to hyperinflammation may help inform timely therapeutic decisions and lead to improved outcomes. The primary objective of this study is to use machine learning to reproducibly identify risk-stratifying clinical phenotypes among hospitalized COVID-19 patients and to compare their treatment response characteristics and outcomes. A secondary objective is to derive a predictive phenotype classification model from routinely available early encounter data that may be useful in informing bedside clinical management of COVID-19. This is a retrospective analysis of electronic health record (EHR) data from adult patients (N = 4379) who were admitted to a Johns Hopkins Health System hospital for COVID-19 treatment in 2020-2021. Phenotypes were identified by clustering 38 routine clinical observations recorded during inpatient care. To examine the reproducibility and validity of the derived phenotypes, patient data were randomly divided into two cohorts (training and validation), and clustering analysis was performed independently on each cohort. A predictive phenotype classifier was derived with the Gradient Boosting Machine (GBM) method, using routine clinical observations recorded during the first 6 hours following admission. Two phenotypes (designated P1 and P2) were identified among patients admitted for COVID-19 in both the training and validation cohorts, with similar distributions of features and similar correlations with biomarkers, treatments, comorbidities, and outcomes.
In both the training and validation cohorts, P2 patients were older, had elevated markers of inflammation, and were at increased risk of requiring ICU-level care, developing sepsis, and dying, compared with P1 patients. The GBM phenotype predictive model yielded an area under the curve (AUC) of 0.89 and a positive predictive value (PPV) of 0.83. Using machine learning clustering, we identified and internally validated two clinical COVID-19 phenotypes with distinct treatment and response characteristics. These phenotypes are consistent with similar two-phenotype models derived in other hospitalized COVID-19 populations, supporting the reliability and generalizability of the findings. COVID-19 phenotypes can be accurately identified using machine learning models based on readily available early encounter clinical data. A phenotype predictive model based on early encounter data may be clinically useful for timely bedside risk stratification and treatment personalization.
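
The workflow summarized above (cluster each cohort independently, then train a classifier on early-encounter data to predict the derived phenotype) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the abstract does not name the clustering algorithm (k-means is swapped in), scikit-learn's GradientBoostingClassifier stands in for the GBM method, and all variable and function names are hypothetical. The numbers it prints say nothing about the study's reported AUC or PPV.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for EHR data (assumption): two latent groups, 38 features.
n_patients, n_features = 1000, 38
latent = rng.integers(0, 2, size=n_patients)
X_full = rng.normal(size=(n_patients, n_features)) + 1.5 * latent[:, None]

# Hypothetical "early encounter" view: fewer, noisier observations.
X_early = X_full[:, :10] + rng.normal(scale=0.5, size=(n_patients, 10))

# Randomly divide patients into two cohorts.
idx_train, idx_valid = train_test_split(
    np.arange(n_patients), test_size=0.5, random_state=0
)


def derive_phenotypes(X):
    """Cluster one cohort into two groups (k-means is an assumed choice)."""
    X_scaled = StandardScaler().fit_transform(X)
    raw = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled)
    # Cluster ids are arbitrary, so relabel: 1 = cluster with higher mean values.
    if X[raw == 1].mean() < X[raw == 0].mean():
        raw = 1 - raw
    return raw


# Cluster each cohort independently, as a reproducibility check.
labels_train = derive_phenotypes(X_full[idx_train])
labels_valid = derive_phenotypes(X_full[idx_valid])

# Train a gradient boosting classifier on early data from the training cohort.
clf = GradientBoostingClassifier(random_state=0)
clf.fit(X_early[idx_train], labels_train)

# Evaluate against the phenotypes derived independently in the other cohort.
proba = clf.predict_proba(X_early[idx_valid])[:, 1]
auc = roc_auc_score(labels_valid, proba)
ppv = precision_score(labels_valid, (proba >= 0.5).astype(int))
print(f"AUC={auc:.2f}  PPV={ppv:.2f}")
```

Relabeling the clusters by their mean feature value is one simple way to make phenotype labels comparable between cohorts; with real clinical data, that alignment would more plausibly rest on comparing feature distributions across the clusters.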
