Introduction/Background: Cardiovascular symptoms appear in a high proportion of patients in the months following a severe SARS-CoV-2 infection. Non-invasive methods to predict disease severity could help personalize healthcare and reduce the occurrence of these symptoms.

Research Questions/Hypothesis: We hypothesized that blood long noncoding RNAs (lncRNAs) and machine learning (ML) could help predict COVID-19 severity.

Goals/Aims: To develop a model based on lncRNAs and ML for predicting COVID-19 severity.

Methods/Approach: Expression data for 2906 lncRNAs were obtained by targeted sequencing of plasma samples collected at baseline from four independent cohorts, totaling 564 COVID-19 patients. Patients were aged 18 years or older and were recruited between 2020 and 2023 into the PrediCOVID cohort (n=162, Luxembourg), the COVID19_OMICS-COVIRNA cohort (n=100, Italy), the TOCOVID cohort (n=233, Spain), and the MiRCOVID cohort (n=69, Germany). The study complied with the Declaration of Helsinki. All cohorts were approved by ethics committees, and all patients provided signed informed consent.

Results/Data: After data curation and pre-processing, 463 complete datasets were included in further analysis, representing 101 severe patients (in-hospital death or ICU admission) and 362 stable patients (no hospital admission, or hospital admission without ICU admission). Feature selection with Boruta, a random forest-based method, identified age and five lncRNAs (LINC01088-201, FGDP-AS1, LINC01088-209, AKAP13, and a novel lncRNA) as associated with disease severity. These six features were used to build predictive models with six ML algorithms. A naïve Bayes model based on age and the five lncRNAs predicted disease severity with an AUC of 0.875 [0.868-0.881] and an accuracy of 0.783 [0.775-0.791].

Conclusion: We developed an ML model, based on age and five lncRNAs, that predicts COVID-19 severity. This model could help improve patient management and cardiovascular outcomes.
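As an illustration of the kind of model described in the Results, the following minimal sketch fits a Gaussian naïve Bayes classifier on six features (age plus five expression values) and scores it by AUC and accuracy. It is not the authors' pipeline: the data are synthetic, the function names (`fit_gaussian_nb`, `predict_proba_severe`, `auc`, `make_synthetic`) are hypothetical, the Gaussian likelihood and the 70/30 split are assumptions not stated in the abstract, and Boruta feature selection and the comparison of six algorithms are omitted. Only the class sizes (101 severe, 362 stable) come from the abstract.

```python
import math
import random

def fit_gaussian_nb(X, y):
    """Estimate per-class priors, feature means and variances."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant keeps variances strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(y), means, variances)
    return model

def predict_proba_severe(model, x):
    """Posterior probability of class 1 (severe) for one sample."""
    log_post = {}
    for label, (prior, means, variances) in model.items():
        ll = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            ll += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_post[label] = ll
    top = max(log_post.values())  # subtract the max for numerical stability
    exp = {k: math.exp(v - top) for k, v in log_post.items()}
    return exp[1] / sum(exp.values())

def auc(scores, labels):
    """AUC as the probability that a severe case outranks a stable one."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def make_synthetic(n_severe, n_stable, rng):
    """Synthetic stand-in data: age plus five made-up expression values."""
    X, y = [], []
    for label, n in ((1, n_severe), (0, n_stable)):
        for _ in range(n):
            age = rng.gauss(68 if label else 55, 12)
            lnc = [rng.gauss(0.8 * label, 1.0) for _ in range(5)]
            X.append([age] + lnc)
            y.append(label)
    return X, y

rng = random.Random(0)
X, y = make_synthetic(101, 362, rng)  # class sizes taken from the abstract
idx = list(range(len(y)))
rng.shuffle(idx)
cut = int(0.7 * len(idx))  # assumed 70/30 train/test split
train, test = idx[:cut], idx[cut:]

model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
scores = [predict_proba_severe(model, X[i]) for i in test]
test_labels = [y[i] for i in test]

test_auc = auc(scores, test_labels)
accuracy = sum((s >= 0.5) == bool(t)
               for s, t in zip(scores, test_labels)) / len(test)
print(f"AUC={test_auc:.3f} accuracy={accuracy:.3f}")
```

Because the data are simulated, the printed values say nothing about the reported AUC of 0.875 or accuracy of 0.783. The sketch only shows how the two metrics relate to the model's posterior probabilities.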