Abstract

We conducted a systematic survey of the COVID-19 endpoint prediction literature to: (a) identify publications that include data adhering to the FAIR (findability, accessibility, interoperability, and reusability) principles and (b) develop and reuse mortality prediction models that best generalize to these datasets. The largest such cohort dataset known to us was used for model development. The associated published prediction model was subjected to recursive feature elimination to find a minimal logistic regression model whose predictive performance was statistically and clinically indistinguishable from that of the published model. Even this minimal model could not be applied to all four external validation sets that were identified, because some of them lacked required model features entirely. Thus, a generalizable model (GM) was built that could be applied to all four external validation sets. An age-only model was used as a benchmark, as age is the simplest effective and robust predictor of mortality currently known in the COVID-19 literature. The GM surpassed the age-only model in three external cohorts; in the fourth, there was no statistically significant difference between the two. This study underscores: (1) the paucity of FAIR data being shared by researchers despite the glut of COVID-19 prediction models and (2) the difficulty of creating any model that consistently outperforms an age-only model, owing to the cohort diversity of the available datasets.
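As a rough illustration of the workflow described above (recursive feature elimination down to a minimal logistic regression model, then comparison against an age-only benchmark), the following is a minimal sketch on synthetic data. It is not the authors' code or data: the feature layout, the effect sizes, the choice of two retained features, and the use of scikit-learn are all assumptions made for illustration. It compares point estimates of the AUC only; a formal test of whether two models differ is not shown.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic cohort (assumption): column 0 stands in for standardized age,
# column 1 for one informative lab marker, columns 2-5 for pure noise.
rng = np.random.default_rng(0)
n = 2000
age = rng.normal(60, 15, n)
marker = rng.normal(0, 1, n)
noise = rng.normal(0, 1, (n, 4))
X = np.column_stack([(age - 60) / 15, marker, noise])

# Hypothetical outcome model: mortality depends on age and the marker only.
logit = -2.0 + 1.2 * X[:, 0] + 0.8 * X[:, 1]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)


def auc_of(cols):
    """Fit a logistic regression on the given columns; return held-out AUC."""
    model = LogisticRegression(max_iter=1000).fit(X_tr[:, cols], y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te[:, cols])[:, 1])


# Recursive feature elimination: repeatedly drop the feature with the
# smallest absolute coefficient until two features remain.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2)
selector.fit(X_tr, y_tr)
selected = np.flatnonzero(selector.support_).tolist()

auc_full = auc_of(list(range(X.shape[1])))  # all six features
auc_minimal = auc_of(selected)              # features kept by RFE
auc_age_only = auc_of([0])                  # age-only benchmark

print("selected columns:", selected)
print(f"AUC full={auc_full:.3f} minimal={auc_minimal:.3f} age-only={auc_age_only:.3f}")
```

In a real study, the minimal model would be refit on the development cohort and scored on each external cohort separately. That is only possible when every retained feature is present in every cohort, which is the constraint that motivated the generalizable model.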

Highlights

  • We conducted a systematic survey of COVID-19 endpoint prediction literature to: (a) identify publications that include data that adhere to FAIR principles and (b) develop and reuse mortality prediction models that best generalize to these datasets

  • COVID-19 has a psychological impact, with various groups of people in society being at risk of developing anxiety or stress as a result of quarantine and, in the case of healthcare workers, a changed work dynamic [3]

  • Of the 168 articles summarized from the review paper of Wynants et al. [6], 111 did not have any data availability statement


