Abstract

This study examined records of 2566 consecutive COVID-19 patients at five Massachusetts hospitals and sought to predict level-of-care requirements based on clinical and laboratory data. Several classification methods were applied and compared against standard pneumonia severity scores. The need for hospitalization, ICU care, and mechanical ventilation were predicted with a validation accuracy of 88%, 87%, and 86%, respectively. Pneumonia severity scores achieve respective accuracies of 73% and 74% for ICU care and ventilation. When predictions are limited to patients with more complex disease, the ICU and ventilation prediction models achieved accuracies of 83% and 82%, respectively. Vital signs, age, BMI, dyspnea, and comorbidities were the most important predictors of hospitalization. Opacities on chest imaging, age, admission vital signs and symptoms, male gender, admission laboratory results, and diabetes were the most important risk factors for ICU admission and mechanical ventilation. The factors identified collectively form a signature of the novel COVID-19 disease.

eLife digest

The new coronavirus (now named SARS-CoV-2) causing the disease pandemic in 2019 (COVID-19) has so far infected over 35 million people worldwide and killed more than 1 million. Most people with COVID-19 have no symptoms or only mild symptoms. But some become seriously ill and need hospitalization. The sickest are admitted to an Intensive Care Unit (ICU) and may need mechanical ventilation to help them breathe. Being able to predict which patients with COVID-19 will become severely ill could help hospitals around the world manage the huge influx of patients caused by the pandemic and save lives.

Now, Hao, Sotudian, Wang, Xu et al. show that computer models using artificial intelligence technology can help predict which COVID-19 patients will be hospitalized, admitted to the ICU, or need mechanical ventilation. Using data from 2,566 COVID-19 patients at five Massachusetts hospitals, Hao et al. created three separate models that can predict hospitalization, ICU admission, and the need for mechanical ventilation with more than 86% accuracy, based on patient characteristics, clinical symptoms, laboratory results and chest x-rays.

Hao et al. found that the patients’ vital signs, age, obesity, difficulty breathing, and underlying diseases like diabetes were the strongest predictors of the need for hospitalization. Being male, having diabetes, cloudy chest x-rays, and certain laboratory results were the most important risk factors for intensive care treatment and mechanical ventilation. Laboratory results suggesting tissue damage, severe inflammation or oxygen deprivation in the body's tissues were important warning signs of severe disease.

The results provide a more detailed picture of the patients who are likely to suffer from severe forms of COVID-19. Using the predictive models may help physicians identify patients who appear okay but need closer monitoring and more aggressive treatment. The models may also help policy makers decide who needs workplace accommodations such as being allowed to work from home, which individuals may benefit from more frequent testing, and who should be prioritized for vaccination when a vaccine becomes available.

Introduction

As a result of the SARS-CoV-2 pandemic, many hospitals across the world have resorted to drastic measures: canceling elective procedures, switching to remote consultations, designating most beds to COVID-19, expanding Intensive Care Unit (ICU) capacity, and re-purposing doctors and nurses to support COVID-19 care. In the U.S., the CDC estimates more than 310,000 COVID-19 hospitalizations from March 1 to June 13, 2020 (CDC, 2020).
Much of the modeling work related to the pandemic has focused on spread dynamics (Kucharski et al., 2020). Others have described patients who were hospitalized (Richardson et al., 2020) (n = 5700) and (Buckner et al., 2020) (n = 105), became critically ill (Gong et al., 2020) (n = 372), or succumbed to the disease (n = 1625 (Onder et al., 2020), n = 270 [Wu et al., 2020]). In data from New York City, 14.2% required ICU treatment and 12.2% mechanical ventilation (Richardson et al., 2020). With such rates, the logistical and ethical implications of bed allocation and potential rationing of care delivery are immense (White and Lo, 2020). To date, while state- or country-level prognostication has developed to examine resource allocation at a mass scale, there is inadequate evidence from large cohorts on accurate prediction of disease progression at the individual patient level. A string of recent studies developed models to predict severe disease or mortality based on clinical and laboratory findings, for example (Yan et al., 2020) (n = 485), (Gong et al., 2020) (n = 372), (Bhargava et al., 2020) (n = 197), (Ji et al., 2020) (n = 208), and (Wang et al., 2020) (n = 296). In these studies, several variables such as Lactate Dehydrogenase (LDH) (Gong et al., 2020; Ji et al., 2020; Yan et al., 2020) and C-reactive protein (CRP) have been identified as important predictors. All of these studies considered relatively small cohorts and, with the exception of Bhargava et al., 2020, considered patients in China. Although it is believed that the virus remains the same around the globe, the physiologic response to the virus and the eventual course of disease depend on multiple other factors, many of them regional (e.g. population characteristics, hospital practices, prevalence of pre-existing conditions) and not applicable universally.
Triage of adult patients with COVID-19 remains challenging with most evidence coming from expert recommendations; evidence-based methods based on larger U.S.-based cohorts have not been reported (Sprung et al., 2020). Leveraging data from five hospitals of the largest health care system in Massachusetts, we seek to develop personalized, interpretable predictive models of (i) hospitalization, (ii) ICU treatment, and (iii) mechanical ventilation, among SARS-CoV-2 positive patients. To develop these models, we built a pipeline that leverages state-of-the-art Natural Language Processing (NLP) tools to extract information from the clinical reports for each patient, employs statistical feature selection methods to retain the most predictive features for each model, and adapts a host of advanced machine learning-based classification methods to develop parsimonious (hence, easier to use and interpret) predictive models. We found that the more interpretable models can, for the most part, deliver similar predictive performance compared to more complex, ‘black-box’ models involving ensembles of many decision trees. Our results support our initial hypothesis that important clinical outcomes can be predicted with a high degree of accuracy upon the patient’s first presentation to the hospital using a relatively small number of features, which collectively compose a ‘signature’ of the novel COVID-19 disease.

Results

We extracted data for all patients (n = 2566) who had a positive RT-PCR SARS-CoV-2 test between March 4 and April 13, 2020 at five Massachusetts hospitals, included in the same health care system (Massachusetts General Hospital (MGH), Brigham and Women’s Hospital (BWH), Faulkner Hospital (FH), Newton-Wellesley Hospital (NWH), and North Shore Medical Center (NSM)). The study was approved by the pertinent Institutional Review Boards. Demographics, pre-hospital medications, and comorbidities were extracted for each patient based on the electronic medical record.
Patient symptoms, vital signs, radiologic findings, and laboratory results were recorded at their first hospital presentation (either clinic or emergency department) before testing positive for SARS-CoV-2. A total of 164 features were extracted for each patient. ICU admission and mechanical ventilation were determined for each patient. Complete blood count values were considered as absolute counts. Representative statistics comparing hospitalized, ICU admitted, and mechanically ventilated patients are provided in Table A1 (Appendix). Table A2 (Appendix) reports how patients were distributed among the five hospitals. Among the 2566 patients with a positive test, 930 (36.2%) were hospitalized. Among the hospitalized, 273 (29.4% of the hospitalized) required ICU care of which 217 (79.5%) required mechanical ventilation. The mean age over all patients was 51.9 years (SD: 18.9 years) and 45.6% were male.

Hospitalization

The mean age of hospitalized patients was 62.3 years (SD: 18 years) and 55.3% were male. We employed linear and non-linear classification methods for predicting hospitalizations. Non-linear methods included random forests (RF) (Breiman, 2001) and XGBoost (Chen and Guestrin, 2016). Linear methods included support vector machines (SVM) (Cortes and Vapnik, 1995) and Logistic Regression (LR); each linear method used either ℓ1- or ℓ2-norm regularization and we report the best-performing flavor of each model. Results are reported in Table 1. We report the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) and the Weighted-F1 score, both computed out-of-sample (in a test set not used for training the model). As we detail under Methods, we used two validation strategies. The ‘Random’ strategy randomly split the patients into a training and a test set and was repeated five times; from these five splits we report the average and the standard deviation of the test performance.
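The two out-of-sample metrics named above can be sketched in plain Python. This is a minimal illustration under assumptions, not the authors' code: the function names, the toy labels and scores, the 0.5 decision threshold, and the five per-split AUC values are all hypothetical, and the use of the sample standard deviation is an assumption since the text does not say which estimator was used.

```python
import statistics


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    total = 0.0
    for cls in set(y_true):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * (tp + fn) / len(y_true)
    return total


def summarize(values):
    """Mean and sample standard deviation over repeated splits."""
    return statistics.mean(values), statistics.stdev(values)


# Toy test set (hypothetical): 1 = outcome occurred, 0 = it did not.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.35, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

print("AUC:", round(roc_auc(y_true, scores), 3))
print("Weighted F1:", round(weighted_f1(y_true, y_pred), 3))

# Hypothetical AUCs from five random train/test splits.
mean_auc, sd_auc = summarize([0.87, 0.89, 0.86, 0.90, 0.88])
print(f"Random strategy: {mean_auc:.3f} ({sd_auc:.3f})")
```

The weighted F1 score matters here because the classes are imbalanced: weighting each class's F1 by its support keeps the majority class from being ignored, while the AUC stays independent of any single decision threshold.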
The ‘BWH’ strategy trained the models on MGH, FH, NWH, and NSM patients, and evaluated performance on BWH patients.

Table 1. Hospitalization prediction model (test performance). The values inside the parentheses refer to the standard deviation of the corresponding metric. Random refers to test set results from the five random training/test splits. BWH refers to training on four other hospitals and testing on data from BWH. SVM-L1 and LR-L1 refer to the ℓ1-norm regularized SVM and LR models. For the parsimonious model, we list the LR coefficients of each variable (Coef), the correlation of the variable with the outcome (Y-corr), the mean of the variable (Y1-mean) in the positive class (hospitalized for this table), and the mean of the variable (Y0-mean) in the negative class (non-hospitalized). Binary Coef denotes the coefficient of the variables in the binarized model. We report the corresponding odds ratio (OR) and the 95% confidence intervals (CI). Thresholds used for the binarized model are provided in Appendix 1-table 5.

| Feature set | Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
|---|---|---|---|---|---|
| All 106 features | LR-L2 | 87.0% (1.7%) | 85.9% | 81.6% (1.3%) | 84.2% |
| All 106 features | SVM-L1 | 87.0% (1.6%) | 85.8% | 81.5% (1.5%) | 83.9% |
| All 106 features | XGBoost | 87.8% (1.9%) | 87.7% | 80.9% (1.8%) | 83.3% |
| All 106 features | RF | 88.2% (1.6%) | 88.1% | 81.2% (1.1%) | 83.2% |
| 74 statistically selected features | LR-L2 | 87.1% (1.7%) | 86.0% | 82.0% (1.3%) | 83.9% |
| 74 statistically selected features | SVM-L1 | 87.1% (1.7%) | 85.8% | 82.0% (1.4%) | 84.0% |
| 74 statistically selected features | XGBoost | 87.9% (1.9%) | 87.6% | 81.2% (1.9%) | 84.2% |
| 74 statistically selected features | RF | 88.0% (1.7%) | 88.1% | 80.8% (1.7%) | 83.9% |
| Parsimonious model, 11 features | LR-L2 | 83.4% (1.7%) | 83.7% | 78.7% (0.9%) | 81.0% |
| Parsimonious model, 11 features | SVM-L1 | 83.4% (1.7%) | 83.8% | 78.1% (1.1%) | 79.9% |

Variables for the parsimonious model:

| Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 95% CI |
|---|---|---|---|---|---|---|---|---|
| SpO2 (%) | −11.90 | 95.44 | 97.11 | <0.001 | −0.29 | 1.74 | 5.67 | 3.97–8.12 |
| Temperature | 10.36 | 37.21 | 37.06 | <0.001 | 0.08 | 0.86 | 2.36 | 1.76–3.18 |
| Respiratory Rate | 7.20 | 22.82 | 20.83 | <0.001 | 0.18 | −0.13 | 0.88 | 0.69–1.13 |
| Age | 5.14 | 62.31 | 46.02 | <0.001 | 0.41 | 0.88 | 2.4 | 1.86–3.11 |
| Pulse | 4.60 | 90.09 | 90.4 | <0.001 | −0.01 | 0.7 | 2.01 | 1.49–2.71 |
| Diastolic BP | −3.56 | 73.07 | 77.21 | <0.001 | −0.23 | 1.51 | 4.51 | 2.88–7.06 |
| Adrenal Insufficiency | 3.09 | 0.013 | 0.001 | <0.001 | 0.08 | 2.58 | 13.14 | 1.57–110.37 |
| BMI | 2.30 | 31.34 | 31.64 | <0.001 | −0.04 | −0.09 | 0.91 | 0.71–1.17 |
| Transplantation | 1.90 | 0.023 | 0.002 | <0.001 | 0.1 | 1.43 | 4.19 | 1.04–16.87 |
| Dyspnea | 1.85 | 0.17 | 0.02 | <0.001 | 0.26 | 2 | 7.41 | 4.85–11.32 |
| CKD | 1.55 | 0.14 | 0.02 | <0.001 | 0.25 | 0.81 | 2.25 | 1.35–3.74 |
| Intercept | −2.51 | | | | | | | |

SpO2: oxygen saturation; BP: Blood pressure; BMI: Body Mass Index; CKD: Chronic Kidney Disease.

The hospitalization models used symptoms, pre-existing medications, comorbidities, and patient demographics. Laboratory results and radiologic findings were not considered since these were not available for most non-hospitalized patients. Full models used all (106) variables retained after several pre-processing steps described in Materials and methods. Applying the statistical variable selection procedure described in the Appendix (specifically, eliminating variables with a p-value exceeding 0.05) yields a model with 74 variables.
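The odds ratios in Table 1 appear to be the exponentials of the binarized-model coefficients, which is the usual reading of a logistic regression coefficient. The sketch below checks that relationship on a few rows copied from the table. The names `TABLE1_BINARY` and `odds_ratio` are illustrative, and attributing the small discrepancies to rounding of the published coefficients is an assumption.

```python
import math

# (binary coefficient, reported odds ratio) pairs copied from Table 1.
TABLE1_BINARY = {
    "SpO2 (%)": (1.74, 5.67),
    "Temperature": (0.86, 2.36),
    "Diastolic BP": (1.51, 4.51),
    "Dyspnea": (2.0, 7.41),
    "CKD": (0.81, 2.25),
}


def odds_ratio(coef):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coef)


for name, (coef, reported) in TABLE1_BINARY.items():
    implied = odds_ratio(coef)
    rel_diff = abs(implied - reported) / reported
    print(f"{name}: exp({coef}) = {implied:.2f}, reported {reported}, "
          f"relative difference {rel_diff:.1%}")

# Read the other way: an effect that cuts the odds by 3/4 corresponds to
# an odds ratio of 0.25, i.e. a coefficient of ln(0.25).
print("coefficient for OR = 0.25:", round(math.log(0.25), 3))
```

A coefficient below zero gives an odds ratio below 1 (lower odds when the flag is set), which is why the two rows with negative binary coefficients in Table 1 have odds ratios of 0.88 and 0.91.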
To provide a more parsimonious, highly interpretable, and easier to implement model, we used recursive feature elimination (see Appendix) to select a model with only 11 variables. The best model using the random validation approach has an AUC of 88%, while the best parsimonious (linear) model has an AUC of 83% but is easier to interpret and implement. Validation on the BWH patients yields an AUC of 84% for the parsimonious model. Table 1 also reports the 11 variables in the parsimonious LR model, including their LR coefficients, and a binarized version of this model as described in Materials and methods. The most important variables associated with hospitalization were: oxygen saturation, temperature, respiratory rate, age, pulse, blood pressure, a comorbidity of adrenal insufficiency, BMI, prior transplantation, dyspnea, and kidney disease. Additionally, we assessed the role of pre-existing ACE inhibitor (ACEI) and angiotensin receptor blocker (ARB) medications by adding these variables into the parsimonious binarized model, while controlling for additional relevant variables (hypertension, diabetes, and arrhythmia comorbidities and other hypertension medications). We found that while ARBs are not a factor, ACEIs reduce the odds of hospitalization by 3/4, on average, controlling for other important factors, such as age, hypertension, and related comorbidities associated with the use of these medications.

ICU admission

The mean age of ICU admitted patients was 63.3 years (SD: 15.1 years) and 63% were male. The ICU and ventilation prediction models used the features considered for the hospitalization models, as well as laboratory results and radiologic findings. For these models, we excluded patients who required immediate ICU admission or ventilation (defined as within 4 hr from initial presentation). This was done in order to focus on patients for whom triaging is challenging and risk prediction would be beneficial.
There were 2513 and 2525 patients remaining for the ICU and the mechanical ventilation prediction models, respectively. For the model including 2513 patients (Table 2), we first developed a model using all 130 variables retained after pre-processing, then employed statistical variable selection to retain 56 of the variables, and then applied recursive feature elimination with LR to select a parsimonious model which uses only 10 variables. The following variables were included: opacity observed in a chest scan, respiratory rate, age, fever, male gender, albumin, anion gap, oxygen saturation, LDH, and calcium. In addition, we generated a binarized version of the parsimonious model. The parsimonious model for all 2513 patients has an AUC of 86%, almost as high as the model with all 130 features.

Table 2. ICU prediction model (test performance). Abbreviations are as in Table 1. Thresholds for the binarized model, PSI and CURB-65 scores are in the Appendix.

ICU prediction results with 2513 patients:

| Feature set | Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
|---|---|---|---|---|---|
| All 130 features | XGBoost | 86.0% (2.8%) | 83.1% | 90.0% (1.7%) | 91.7% |
| All 130 features | SVM-L1 | 85.9% (2.5%) | 80.2% | 89.9% (1.0%) | 89.2% |
| All 130 features | LR-L1 | 84.6% (2.8%) | 76.8% | 89.7% (1.0%) | 89.9% |
| All 130 features | RF | 86.9% (2.4%) | 83.7% | 90.4% (1.1%) | 91.1% |
| 56 statistically selected features | XGBoost | 86.8% (3.1%) | 82.8% | 90.4% (1.4%) | 91.3% |
| 56 statistically selected features | SVM-L1 | 86.2% (2.6%) | 82.6% | 90.6% (1.2%) | 90.8% |
| 56 statistically selected features | LR-L1 | 85.8% (2.9%) | 81.8% | 90.2% (1.3%) | 91.3% |
| 56 statistically selected features | RF | 86.7% (2.0%) | 83.2% | 90.5% (1.7%) | 91.5% |
| Parsimonious model, 10 features | LR-L1 | 85.8% (2.6%) | 83.9% | 90.0% (1.4%) | 89.1% |
| Parsimonious model, 10 features | LR-L1 (binarized model) | 84.2% (2.2%) | 82.5% | 89.8% (1.1%) | 88.1% |
| PSI or CURB-65 score | PSI score | 72.9% (4.9%) | 78.8% | 86.8% (0.7%) | 88.2% |
| PSI or CURB-65 score | CURB-65 score | 67.0% (5.0%) | 75.4% | 87.0% (0.5%) | 88.1% |

Variables for the parsimonious model:

| Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 97.5% CI |
|---|---|---|---|---|---|---|---|---|
| Radiology Opacities | 0.54 | 0.76 | 0.27 | <0.001 | 0.30 | 1.41 | 4.08 | 2.83–5.89 |
| Respiratory Rate | 0.46 | 24.61 | 21.37 | <0.001 | 0.16 | 0.50 | 1.66 | 1.14–2.41 |
| Age | 0.45 | 62.61 | 50.58 | <0.001 | 0.18 | 0.56 | 1.76 | 1.27–2.43 |
| Fever | 0.40 | 0.64 | 0.33 | <0.001 | 0.18 | 0.61 | 1.83 | 1.32–2.55 |
| Male | 0.35 | 0.64 | 0.44 | <0.001 | 0.12 | 0.50 | 1.65 | 1.21–2.26 |
| Albumin | −0.34 | 3.68 | 3.84 | <0.001 | −0.16 | 0.58 | 1.78 | 1.10–2.90 |
| Anion Gap | 0.33 | 16.40 | 15.35 | <0.001 | 0.13 | −0.05 | 0.95 | 0.46–1.98 |
| SpO2 (%) | −0.22 | 94.72 | 96.72 | <0.001 | −0.24 | 0.83 | 2.29 | 1.63–3.21 |
| LDH | 0.22 | 400.40 | 327.48 | <0.001 | 0.15 | 0.96 | 2.62 | 1.74–3.94 |
| Calcium | −0.21 | 8.84 | 9.01 | <0.001 | −0.10 | 0.55 | 1.73 | 1.21–2.48 |
| Intercept | −0.93 | | | | | | | |

SpO2: oxygen saturation; LDH: Lactate dehydrogenase.

For comparison purposes against well-established scoring systems, we implemented two commonly used pneumonia severity scores, CURB-65 (Lim et al., 2003) and the Pneumonia Severity Index (PSI) (Fine et al., 1997). Predictions based on the PSI and CURB-65 scores have AUCs of 73% and 67%, respectively.

We also developed a model for a more restrictive set of patients, because the number of missing lab values for some patients is substantial. Given the importance of LDH and CRP, as revealed by our models, the more restricted patient set contains 669 patients with non-missing LDH and CRP values. After removing patients who required intubation or ICU admission within 4 hr of hospital presentation, we included 628 patients and 635 patients for the restricted ICU admission and ventilation models, respectively. The best restricted model for the 628 patients (Table 3) is the nonlinear XGBoost model using 29 statistically selected features with an AUC of 83%, with a linear parsimonious LR model close behind (AUC 80%). An RF model using all variables yields an AUC of 77% when tested on BWH data. PSI- and CURB-65 models have AUCs below 59%.

Table 3. Restricted ICU prediction model (test performance). Abbreviations are as in Table 1. Thresholds for the binarized model, PSI and CURB-65 scores are in the Appendix.
ICU prediction results with 628 patients:

| Feature set | Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
|---|---|---|---|---|---|
| All 130 features | XGBoost | 82.5% (1.9%) | 67.3% | 81.4% (0.7%) | 72.6% |
| All 130 features | SVM-L1 | 77.8% (3.8%) | 72.8% | 79.7% (1.2%) | 73.6% |
| All 130 features | LR-L1 | 75.9% (3.6%) | 69.7% | 79.2% (2.5%) | 73.7% |
| All 130 features | RF | 80.9% (2.7%) | 76.9% | 78.8% (1.9%) | 73.6% |
| 29 statistically selected features | XGBoost | 82.7% (2.7%) | 76.2% | 80.6% (2.1%) | 72.6% |
| 29 statistically selected features | SVM-L1 | 77.9% (3.7%) | 73.1% | 78.5% (1.4%) | 73.6% |
| 29 statistically selected features | LR-L1 | 78.4% (4.1%) | 71.5% | 79.5% (2.6%) | 74.4% |
| 29 statistically selected features | RF | 82.1% (2.8%) | 74.1% | 79.0% (2.4%) | 75.4% |
| Parsimonious model, 8 features | LR-L1 | 80.1% (2.9%) | 74.2% | 80.9% (2.1%) | 77.2% |
| Parsimonious model, 8 features | LR-L1 (binarized model) | 72.5% (5.4%) | 69.9% | 73.4% (2.8%) | 69.7% |
| PSI or CURB-65 score | PSI score | 58.8% (7.4%) | 68.3% | 66.7% (2.2%) | 65.3% |
| PSI or CURB-65 score | CURB-65 score | 56.8% (4.5%) | 76.9% | 66.2% (1.5%) | 63.8% |

Variables for the parsimonious model:

| Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 97.5% CI |
|---|---|---|---|---|---|---|---|---|
| LDH | 0.53 | 519.88 | 304.40 | <0.001 | 0.15 | 1.59 | 4.88 | 2.65–8.99 |
| CRP (mg/L) | 0.47 | 127.17 | 67.43 | <0.001 | 0.35 | 0.76 | 2.13 | 0.70–6.47 |
| Calcium | −0.35 | 8.83 | 9.01 | <0.001 | −0.13 | 0.71 | 2.03 | 1.25–3.31 |
| IDDM | 0.30 | 0.25 | 0.12 | 0.003 | 0.15 | 1.00 | 2.73 | 1.62–4.60 |
| SpO2 (%) | −0.29 | 94.13 | 95.59 | 0.003 | −0.22 | 0.34 | 1.41 | 0.92–2.16 |
| Radiology Opacities | 0.25 | 0.88 | 0.71 | <0.001 | 0.16 | 0.62 | 1.86 | 1.05–3.29 |
| Anion Gap | 0.20 | 16.66 | 15.28 | <0.001 | 0.20 | 0.34 | 1.40 | 0.48–4.12 |
| Sodium | −0.16 | 136.13 | 137.53 | <0.001 | −0.14 | 0.47 | 1.60 | 1.05–2.43 |
| Intercept | −0.34 | | | | | | | |

LDH: Lactate dehydrogenase; CRP: C-reactive protein; IDDM: Insulin-dependent diabetes mellitus; SpO2: oxygen saturation.

Mechanical ventilation

The mean age of patients requiring mechanical ventilation was 63.3 years (SD: 14.7 years) and 63.6% were male. Again, we excluded patients who were intubated within 4 hr of their hospital admission. For the model including 2525 patients (Table 4), we used statistical feature selection to select 55 variables, and recursive feature elimination with LR to select a parsimonious model with only eight variables. The following variables were included: lung opacities, albumin, fever, respiratory rate, glucose, male gender, LDH, and anion gap. In addition, we generated a binarized version of the parsimonious model.
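A binarized model of this kind can be read as an additive score: every variable flagged abnormal adds its binary coefficient to the log-odds, so it multiplies the odds by the matching odds ratio. The sketch below uses the binary coefficients of the ICU model in Table 2. Everything else is an assumption made for illustration: the helper names, the example patient, and above all the cut-offs passed to `is_abnormal`, which are placeholders and not the thresholds from the paper's Appendix. Because the intercept of the binarized model is not given in this section, the sketch reports only the odds relative to a patient with no flags, not an absolute probability.

```python
import math

# Binary coefficients copied from Table 2 (ICU model, all 2513 patients).
BINARY_COEF = {
    "radiology_opacities": 1.41,
    "respiratory_rate": 0.50,
    "age": 0.56,
    "fever": 0.61,
    "male": 0.50,
    "albumin": 0.58,
    "anion_gap": -0.05,
    "spo2": 0.83,
    "ldh": 0.96,
    "calcium": 0.55,
}


def is_abnormal(value, low=None, high=None):
    """Return 1 if value falls outside [low, high]; either bound may be omitted."""
    below = low is not None and value < low
    above = high is not None and value > high
    return int(below or above)


def log_odds_shift(flags):
    """Sum the binary coefficients of all flagged variables (intercept omitted)."""
    return sum(BINARY_COEF[name] for name, flag in flags.items() if flag)


# Hypothetical patient. The cut-offs 94 and 280 are PLACEHOLDERS chosen for
# this example only; the paper's thresholds are listed in its Appendix.
flags = {
    "radiology_opacities": 1,
    "fever": 1,
    "spo2": is_abnormal(91, low=94),
    "ldh": is_abnormal(250, high=280),
}

shift = log_odds_shift(flags)
print("log-odds shift:", round(shift, 2))
print("odds relative to a patient with no flags:", round(math.exp(shift), 1))
```

Keeping each variable as a 0/1 flag is what makes such a model easy to apply at the bedside, at the cost of discarding how far a value lies from its normal range.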
The best model for all 2525 patients was a nonlinear RF model using the 55 statistically selected variables and yielding an AUC of 86%. The best linear model was the parsimonious LR model with an AUC of 85%. PSI- and CURB-65 models yield AUCs of 74% and 67%, respectively.

Table 4. Ventilation prediction model (test performance). Abbreviations are as in Table 1. Thresholds for the binarized model, PSI and CURB-65 scores are in the Appendix.

Ventilation prediction results with 2525 patients:

| Feature set | Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
|---|---|---|---|---|---|
| All 130 features | XGBoost | 85.8% (4.0%) | 83.8% | 91.0% (0.4%) | 91.6% |
| All 130 features | SVM-L1 | 82.6% (4.9%) | 83.8% | 90.9% (0.8%) | 91.6% |
| All 130 features | LR-L1 | 80.7% (5.4%) | 81.7% | 90.4% (1.2%) | 91.4% |
| All 130 features | RF | 85.7% (3.9%) | 83.7% | 91.2% (0.9%) | 91.8% |
| 55 statistically selected features | XGBoost | 85.7% (3.3%) | 86.3% | 91.1% (0.6%) | 91.6% |
| 55 statistically selected features | SVM-L1 | 83.9% (3.7%) | 84.8% | 90.9% (1.1%) | 91.7% |
| 55 statistically selected features | LR-L1 | 83.3% (4.0%) | 83.9% | 90.8% (1.3%) | 91.4% |
| 55 statistically selected features | RF | 86.4% (3.4%) | 86.7% | 91.4% (1.1%) | 91.3% |
| Parsimonious model, 8 features | LR-L1 | 85.2% (2.3%) | 87.0% | 90.3% (0.3%) | 90.7% |
| Parsimonious model, 8 features | LR-L1 (binarized model) | 81.3% (3.1%) | 82.6% | 90.0% (0.6%) | 90.2% |
| PSI or CURB-65 score | PSI score | 73.6% (4.1%) | 80.7% | 89.4% (0.4%) | 90.3% |
| PSI or CURB-65 score | CURB-65 score | 66.8% (3.1%) | 75.9% | 89.7% (0.1%) | 90.0% |

Variables for the parsimonious model:

| Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 97.5% CI |
|---|---|---|---|---|---|---|---|---|
| Radiology opacities | 0.86 | 0.77 | 0.28 | <0.001 | 0.27 | 1.58 | 4.86 | 3.25–7.25 |
| Albumin | −0.45 | 3.65 | 3.83 | <0.001 | −0.16 | 1.07 | 2.91 | 1.80–4.72 |
| Fever | 0.43 | 0.66 | 0.33 | <0.001 | 0.17 | 0.72 | 2.05 | 1.42–2.95 |
| Respiratory rate | 0.42 | 24.70 | 21.44 | <0.001 | 0.15 | 0.50 | 1.64 | 1.09–2.47 |
| Glucose | 0.38 | 170.17 | 138.32 | <0.001 | 0.15 | 0.97 | 2.63 | 1.71–4.06 |
| Male | 0.34 | 0.64 | 0.44 | <0.001 | 0.10 | 0.43 | 1.54 | 1.09–2.18 |
| LDH | 0.33 | 408.56 | 328.78 | <0.001 | 0.14 | 0.91 | 2.48 | 1.58–3.89 |
| Anion gap | 0.31 | 16.50 | 15.37 | <0.001 | 0.13 | 0.27 | 1.31 | 0.53–3.25 |
| Intercept | −1.06 | | | | | | | |

LDH: Lactate dehydrogenase.

The best model for the restricted case of 635 patients (Table 5) was the linear parsimonious LR model (with just five variables) achieving an AUC of 82%. PSI- and CURB-65 models do not exceed AUC of 58%.

Table 5. Restricted ventilation prediction model (test performance). Abbreviations are as in Table 1. Thresholds for the binarized model, PSI and CURB-65 scores are in the Appendix.

Ventilation prediction results with 635 patients:

| Feature set | Algorithm | AUC (Random) | AUC (BWH) | F1-weighted (Random) | F1-weighted (BWH) |
|---|---|---|---|---|---|
| All 130 features | XGBoost | 80.6% (1.9%) | 74.7% | 79.4% (2.6%) | 75.7% |
| All 130 features | SVM-L1 | 79.4% (5.2%) | 71.3% | 80.8% (2.0%) | 75.7% |
| All 130 features | LR-L1 | 76.9% (3.9%) | 68.2% | 78.6% (3.2%) | 73.4% |
| All 130 features | RF | 81.0% (3.1%) | 75.8% | 79.8% (4.2%) | 72.7% |
| 29 statistically selected features | XGBoost | 81.6% (3.2%) | 76.9% | 79.0% (2.9%) | 71.7% |
| 29 statistically selected features | SVM-L1 | 79.1% (4.6%) | 69.4% | 80.6% (2.5%) | 75.7% |
| 29 statistically selected features | LR-L1 | 80.9% (3.6%) | 70.9% | 80.4% (2.2%) | 75.7% |
| 29 statistically selected features | RF | 81.3% (2.6%) | 75.4% | 79.2% (1.7%) | 69.6% |
| Parsimonious model, 5 features | LR-L1 | 82.4% (3.7%) | 75.2% | 81.8% (1.7%) | 71.7% |
| Parsimonious model, 5 features | LR-L1 (binarized model) | 71.4% (6.2%) | 65.5% | 76.6% (3.5%) | 68.3% |
| PSI or CURB-65 score | PSI score | 57.6% (4.5%) | 67.4% | 73.2% (1.3%) | 71.2% |
| PSI or CURB-65 score | CURB-65 score | 56.9% (7.1%) | 74.0% | 72.4% (0.2%) | 68.3% |

Variables for the parsimonious model:

| Variable | Coef | Y1 mean | Y0 mean | p-value | Y-corr | Coef binary | OR | OR 97.5% CI |
|---|---|---|---|---|---|---|---|---|
| CRP (mg/L) | 0.60 | 134.52 | 69.62 | <0.001 | 0.35 | 0.42 | 1.53 | 0.51–4.59 |
| LDH | 0.55 | 550.41 | 311.01 | <0.001 | 0.16 | 1.87 | 6.47 | 3.19–13.10 |
| Calcium | −0.39 | 8.82 | 9.00 | <0.001 | −0.13 | 0.58 | 1.79 | 1.07–2.98 |
| IDDM | 0.36 | 0.26 | 0.12 | 0.002 | 0.15 | 1.18 | 3.26 | 1.90–5.58 |
| Anion Gap | 0.29 | 16.81 | 15.32 | <0.001 | 0.19 | 18.66 | 1.27E+08 | 0.00–inf |
| Intercept | −0.39 | | | | | | | |

CRP: C-reactive protein; LDH: Lactate dehydrogenase; IDDM: Insulin-dependent diabetes mellitus.

Time period between ICU/ventilation model prediction and corresponding outcomes

Table 6 reports the mean and the median time interval (in hours) between hospital admission time and ICU/ventilation outcomes. Specifically, we report statistics for ICU admission or intubation outcomes from the correct ICU/intubation predictions made by our models trained on four hospitals (MGH, NWH, NSM, FH) and applied to BWH patients (both the models making predictions for all patients and the restricted models). As we have noted earlier, our models use the lab results closest to admission (either on admission date or the following day).
We also report the time interval between the last lab result used by the model and the corresponding ICU/intubation outcome.

Table 6. Mean and median hours between reference date/lab results to outcomes in full/restricted ICU and ventilation model prediction.

| Model | From reference date (mean) | From reference date (median) | From lab results (mean) | From lab results (median) |
|---|---|---|---|---|
| Restricted ICU | 38.13 | 28.08 | 22.55 | 9.90 |
| Restricted intubation | 35.36 | 26.40 | 22.37 | 10.39 |
| Full ICU | 22.86 | 17.28 | 15.86 | 12.99 |
| Full intubation | 25.62 | 22.20 | 10.23 | 8.97 |

Discussion

We developed three models to predict need for hospitalization, ICU admission, and mechanical ventilation in patients with COVID-19. The prediction models are not meant to replace clinicians’ judgment for determining level of care. Instead, they are designed to assist clinicians in identifying patients at risk of future decompensation. Patient vital signs were the most important predictors of hospitalization. This is expected as vital signs reflect underlying disease severity, the need for cardiorespiratory resuscitation, and the risk of future decompensation without adequate medical support. Older age and BMI were also important predictors for hospitalization. Age has been recognized as an important factor associated with severe COVID-19 in previous series (Grasselli et al., 2020; Guan et al., 2020; Richardson et al., 2020). However, it is not known whether age itself or the presence of comorbidities place patients at risk for severe disease. Our results demonstrate that age is a stronger predictor of severe COVID-19 than a host of underlying comorbidities. In terms of patient comorbidities, adrenal insufficiency, prior transplantation, and chronic kidney disease were strongly associated with need for hospitalization. Diabetes mellitus was associated with a need for ICU admission and mechanical ventilation, which might be due to its detrimental effects on immune function.
For the ICU and ventilation prediction models screening all at-risk (COVID-19-positive) patients, opacities observed in a chest scan, age, and male gender emerge as important variables. Males have been found to have worse in-hospital outcomes in other studies as well (Palaiodimos et al., 2020). We also identified several routine laboratory values that are predictive of ICU admission and mechanical ventilation. Elevated serum LDH, CRP, anion gap, and glucose, as well as decreased serum calcium, sodium, and albumin were strong predictors of ICU admission and mechanical ventilation. LDH is an indicator of tissue damage and has been found to be a marker of severity in P. jirovecii pneumonia (Zaman and White, 1988). Along with CRP, it was one of the two most important predictors of ICU admission and ventilation in the parsimonious model among patients who had LDH and CRP measurements on admission. This finding is consistent with previous reports identifying LDH as an important prognostic factor (Gong et al., 2020; Ji et al., 2020; Mo et al., 2020; Yan et al., 2020). In addition, lower serum calcium is associated with cell lysis and tissue destruction, as it is often seen as part of the tumor lysis syndrome. Elevated serum anion gap is a marker of metabolic acidosis and ischemia, suggesting that tissue hypoxia and hypoperfusion may be components of severe disease. For the three prognostic models we developed, predicting hospitalization, ICU care, and mechanical ventilation, the AUC ranges within 86–88%, which indicates strong predictive power. Interestingly, we can achieve an AUC within 85–86% for ICU and ventilation prediction with a parsimonious linear model utilizing no more than 10 variables. In all cases, we can also develop a parsimonious model with binarized variables using medically suggested normal and abnormal variable thresholds. These binarized models perform similarly to their continuous counterparts.
The ICU and ventilation models using all patients are very accurate, but, arguably, make a number of ‘easier’ decisions since more than 60% of the patients are never hospitalized. Many of these patients are younger, healthy, and likely present with mild-to-moderate symptoms. To test the robustness of the models to patients with potentially more ‘complex’ disease, we developed ICU and ventilation models on a restricted set of patients. This is the subset of hospitalized patients for whom most of the crucial labs are available (specifically CRP and LDH, which emerged as important from our models). The best AUC for these models drops, but not below 82%, which indicates robustness of the model even when dealing with arguably harder to assess cases. LDH, CRP, calcium, lung opacity, anion gap, SpO2, sodium, and a comorbidity of insulin-controlled diabetes appear as the most significant for these patients. Interestingly, the corresponding binarized models have about 10% lower AUC; apparently, for the more severely ill, clinical variables deviate substantially from normal and knowing the exact values is crucial. The models have been validated with two different approaches, using random splits of the data into training and testing, as well as training in some hospitals and testing at a different hospital. Performance metrics are relatively consistent with these two approaches. We also compared the models against standard pneumonia severity scores, PSI and CURB-65, establishing that our models are significantly stronger, which highlights the different clinical profile of COVID-19. We also examined how much in advance of the ICU or ventilation outcomes our models are able to make a prediction. Of course, this is not entirely in our control; it depends on what state the patients get admitted an
