The novel coronavirus pandemic continues to cause significant morbidity and mortality around the world. Diverse clinical presentations have prompted numerous attempts to predict disease severity in order to improve care and patient outcomes. Equally important is understanding the mechanisms underlying such divergent disease outcomes. Here, multivariate modeling was used to define the most distinctive features that separate COVID-19 patients from healthy controls and severe from moderate disease. Using discriminant analysis and binary logistic regression models, we could distinguish between severe disease, moderate disease, and controls, with rates of correct classification ranging from 71% to 100%. The distinction between severe and moderate disease relied most on the depletion of natural killer cells and activated class-switched memory B cells, an increased frequency of neutrophils, and decreased expression of the activation marker HLA-DR on monocytes in patients with severe disease. An increased frequency of activated class-switched memory B cells and activated neutrophils was seen in moderate disease compared with both severe disease and controls. Our results suggest that natural killer cells, activated class-switched memory B cells, and activated neutrophils are important for protection against severe disease. We show that binary logistic regression was superior to discriminant analysis, attaining higher rates of correct classification based on immune profiles. We discuss the utility of these multivariate techniques in the biomedical sciences, contrast their mathematical bases and limitations, and propose strategies to overcome these limitations.