Abstract

COVID-19 clinical presentation and prognosis are highly variable, ranging from asymptomatic and paucisymptomatic cases to acute respiratory distress syndrome and multi-organ involvement. We developed a hybrid machine learning/deep learning model to classify patients into two outcome categories, non-ICU and ICU (intensive care admission or death), using data from 558 patients admitted to a hospital in northern Italy between February and May 2020. A fully 3D patient-level CNN classifier on baseline CT images is used as a feature extractor. The extracted features, together with laboratory and clinical data, are passed to a Boruta algorithm with SHAP game-theoretical values for feature selection. A classifier is then built on the reduced feature space using the CatBoost gradient boosting algorithm, reaching a probabilistic AUC of 0.949 on the holdout test set. The model aims to provide clinical decision support to medical doctors, giving the probability score of belonging to an outcome class together with a case-based SHAP interpretation of feature importance.

Highlights

  • To date (May 2021), more than one hundred million individuals have been reported as affected by COVID-19

  • Radiological information comes natively as imaging data, while laboratory and clinical information comes in tabular form

  • We built a COVID-19 prognostic hybrid machine learning/deep learning model intended as a tool to support clinical decision making


Summary

Introduction

To date (May 2021), more than one hundred million individuals have been reported as affected by COVID-19. From the beginning of the pandemic, it was apparent that COVID-19 encompasses a wide spectrum of clinical presentations and consequent prognoses, with cases of sudden, unexpected evolution (and worsening) of the clinical and radiological picture[1]. Such elements of variability and instability are still not fully explained, with an important role attributed to a multiplicity of pathophysiological processes[2–4].

Shapley values are a fair distribution of the payout among players, i.e. of the prediction result among features. In this way, both synthetic (percentage score) and analytic (SHAP values) information is provided to support the judgement of the clinician.
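The idea of distributing a prediction among features can be illustrated with a minimal sketch. The code below computes exact Shapley values for a toy risk score by averaging each feature's marginal contribution over all coalitions of the other features. The model `toy_risk`, its coefficients, and the feature names (`ct_feature`, `crp`, `age`) are hypothetical and chosen only for illustration; they are not the model or the variables used in the paper, and practical SHAP implementations rely on much more efficient algorithms than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a coalition value function `value(frozenset)`."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        # Enumerate every coalition S that does not contain p.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition S.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical toy model (NOT the paper's model): a base rate, two additive
# terms, and one interaction between the CT-derived feature and age.
def toy_risk(present):
    ct = 1.0 if "ct_feature" in present else 0.0
    crp = 1.0 if "crp" in present else 0.0
    age = 1.0 if "age" in present else 0.0
    return 0.1 + 0.3 * ct + 0.2 * crp + 0.1 * ct * age


features = ["ct_feature", "crp", "age"]
phi = shapley_values(features, toy_risk)

for name in features:
    print(f"{name}: {phi[name]:.3f}")

# Efficiency property: contributions sum to prediction minus baseline.
baseline = toy_risk(frozenset())
prediction = toy_risk(frozenset(features))
print(f"sum = {sum(phi.values()):.3f}, "
      f"prediction - baseline = {prediction - baseline:.3f}")
```

In this toy case the interaction term is split evenly between the two features involved, and the contributions add up to the difference between the prediction and the baseline. That additivity is what allows a single probability score to be decomposed into per-feature contributions for an individual patient.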

