Abstract

The COVID-19 pandemic has flooded triage stations, and because data on the total numbers of patients tested, infected, and hospitalized are fragmentary, it is difficult to select those most likely to be infected. The Israeli Ministry of Health made public its registry of immediate clinical data and the corresponding infected/not infected status for all viral DNA tests performed up to Apr. 18th, 2020, comprising almost 120,000 tests. We used a machine-learning algorithm to find out which immediate clinical elements mattered most in identifying the true status of the tested persons, including whether age or gender matter, to enable better future allocation of surveillance policy for those belonging to high-risk groups. In addition to the analyses applied to the first batch of the available data (up to Apr. 11th), we further tested the algorithm on the independent second batch (Apr. 12th to 18th). Fever, cough, and headache were the most diagnostic symptoms, differing in degree of importance across subgroups. A higher percentage of men than women tested positive (9.3% vs. 7.3%), but gender did not matter for the clinical presentation. The predictive power of the model was high, with an accuracy of 0.84 and an area under the curve of 0.92. We provide a short hand-held checklist with a verbal description of the importance of the leading symptoms, which should expedite triage and enable proper selection of people for further follow-up.

Highlights

  • The COVID-19 pandemic has flooded triage stations with people seeking to verify whether they are infected

  • We used a machine-learning algorithm to find out which immediate clinical elements mattered most in identifying the true status of the tested persons, including whether age or gender matter, to enable better future allocation of surveillance policy for those belonging to high-risk groups

  • For the current study, we considered the dataset of people tested for COVID-19 viral DNA


Summary

Introduction

The COVID-19 pandemic has flooded triage stations with people seeking to verify whether they are infected. A zero row is a row that consists only of zero values. Our assumption is that tests of this kind were either wrongly labeled as infected samples, or that the cause of infection is not determined by the current set of features used in the data. The new batch consists of 16,156 tests, of which 499 are positive and 15,657 are negative.
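The notion of a zero row can be illustrated with a minimal sketch. It assumes each test is encoded as a list of 0/1 features with a label of 1 for a positive test; the column order, the toy records, and the helper names are hypothetical and are not taken from the registry or from the authors' code. The last lines compute the share of positive tests in the second batch from the counts quoted above.

```python
from typing import List, Sequence


def is_zero_row(features: Sequence[int]) -> bool:
    """Return True when every feature of the record equals 0."""
    return all(value == 0 for value in features)


def count_positive_zero_rows(rows: List[Sequence[int]], labels: List[int]) -> int:
    """Count records labeled positive (1) whose features are all zero."""
    return sum(
        1 for features, label in zip(rows, labels)
        if label == 1 and is_zero_row(features)
    )


def positive_rate(positives: int, negatives: int) -> float:
    """Fraction of positive tests among all tests in a batch."""
    return positives / (positives + negatives)


# Hypothetical toy records; assumed column order: cough, fever, headache.
toy_rows = [
    [1, 1, 0],  # cough and fever reported
    [0, 0, 0],  # zero row
    [0, 0, 0],  # zero row
    [0, 1, 1],  # fever and headache reported
]
toy_labels = [1, 1, 0, 0]  # 1 = positive test, 0 = negative test

# Only the second record is both a zero row and labeled positive.
suspect_count = count_positive_zero_rows(toy_rows, toy_labels)

# Counts for the second batch as quoted in the text: 499 positive, 15,657 negative.
second_batch_rate = positive_rate(499, 15657)
print(f"positive zero rows in toy data: {suspect_count}")
print(f"second-batch positive rate: {second_batch_rate:.3f}")
```

Positive records flagged this way are the ones the text suspects of being mislabeled or of depending on features the registry does not record; whether to drop or keep them before training is a separate modeling decision that this sketch does not make.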

Training and evaluation of the model
Results
Discussion
