Background: Acute Respiratory Distress Syndrome (ARDS) is a life-threatening complication of COVID-19 and has been reported in approximately one-third of hospitalized patients with COVID-19 [1]. Risk factors associated with the development of ARDS include older age and diabetes [2]. However, little is known about factors associated with ARDS in the setting of COVID-19 among patients with rheumatic disease or those receiving immunosuppressive medications. Prediction algorithms using traditional regression methods perform poorly with rare outcomes, often yielding high specificity but very low sensitivity. Machine learning algorithms optimized for rare events are an alternative approach with potentially improved sensitivity for rare events, such as ARDS in COVID-19 among patients with rheumatic disease.

Objectives: We aimed to develop a prediction model for ARDS in people with COVID-19 and pre-existing rheumatic disease using a series of machine learning algorithms and to identify risk factors associated with ARDS in this population.

Methods: We used data from the COVID-19 Global Rheumatology Alliance (GRA) Registry from March 24 to November 1, 2020. ARDS diagnosis was indicated by the reporting clinician. Five machine learning algorithms optimized for rare events predicted ARDS using 42 variables covering patient demographics, rheumatic disease diagnoses, medications used at the time of COVID-19 diagnosis, and comorbidities. Model performance was assessed using accuracy, area under curve, sensitivity, specificity, positive predictive value, and negative predictive value. Adjusted odds ratios corresponding to the 10 most influential predictors from the best performing model were derived using hierarchical multivariate mixed-effects logistic regression that accounted for within-country correlations.

Results: A total of 5,931 COVID-19 cases from 67 countries were included in the analysis. Mean (SD) age was 54.9 (16.0) years, 4,152 (70.0%) were female, and 2,399 (40.5%) were hospitalized.
ARDS was reported in 388 (6.5% of total and 15.6% of hospitalized) cases. Statistically significant differences in the risk of ARDS were observed by demographics, diagnoses, medications, and comorbidities using unadjusted univariate comparisons (data not shown). Gradient boosting machine (GBM) had the highest sensitivity (0.81) and was considered the best performing model (Table 1). Hypertension, interstitial lung disease, kidney disease, diabetes, older age, glucocorticoids, and anti-CD20 monoclonal antibodies were associated with the development of ARDS, while tumor necrosis factor inhibitors were associated with a protective effect (Figure 1).

Table 1. Performance of machine learning algorithms.

Metric        GBM    SVM    GLMNET  NNET   RF
Accuracy      0.79   0.68   0.66    0.66   0.67
AUC           0.75   0.70   0.74    0.58   0.74
Sensitivity   0.81   0.68   0.65    0.68   0.67
Specificity   0.49   0.60   0.73    0.48   0.68
PPV           0.96   0.96   0.97    0.95   0.97
NPV           0.16   0.12   0.13    0.09   0.13

GBM: Gradient Boosting Machine; SVM: Support Vector Machines; GLMNET: Lasso and Elastic-Net Regularized Generalized Linear Models; NNET: Neural Networks; RF: Random Forest; AUC: Area Under Curve; PPV: Positive Predictive Value; NPV: Negative Predictive Value.

Conclusion: In this global cohort of patients with rheumatic disease, a machine learning model, GBM, predicted the onset of ARDS with 81% sensitivity using baseline information obtained at the time of COVID-19 diagnosis. These results identify patients who may be at higher risk of severe COVID-19 outcomes. Further studies are necessary to validate the proposed prediction model in external cohorts and to evaluate its clinical utility.

Disclaimer: The views expressed here are those of the authors and participating members of the COVID-19 Global Rheumatology Alliance, and do not necessarily represent the views of the ACR, NIH, (UK) NHS, NIHR, or the Department of Health.
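For readers less familiar with the performance measures reported in Table 1, the following is a minimal sketch of how accuracy, sensitivity, specificity, PPV, and NPV are derived from a 2x2 confusion matrix. It is not the authors' code; the class `ConfusionCounts`, the function `classification_metrics`, and the counts in the example are hypothetical and are not taken from the registry. Note that PPV and NPV depend on outcome prevalence and on which outcome is coded as the positive class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 confusion matrix (hypothetical helper type)."""
    tp: int  # predicted positive, truly positive
    fp: int  # predicted positive, truly negative
    tn: int  # predicted negative, truly negative
    fn: int  # predicted negative, truly positive


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the standard threshold-based performance measures."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
    }


# Illustrative counts only (assumed, not study data): 100 positives in 1,000 cases.
example = ConfusionCounts(tp=80, fp=300, tn=600, fn=20)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With a rare positive class, a model can reach high sensitivity while PPV stays low, which is why the abstract reports all six measures rather than accuracy alone.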