Abstract

The COVID-19 pandemic is an ongoing pandemic of coronavirus disease that began in 2019. Millions of confirmed cases and deaths worldwide have been attributed to it. So far, detection of COVID-19 has relied heavily on specialized tests (e.g., based on saliva or respiratory swabs). Some approaches use smart devices (e.g., Whoop) to detect coronavirus infection from respiratory rate. Machine learning (ML) techniques have become a promising approach to coronavirus infection detection. In this paper, we therefore introduce a machine-learning-based COVID infection predictor. We measure the prediction accuracy of five ML models. We use the Chi-square test and knowledge-based manual feature selection to select the features that matter most for prediction, which reduces prediction time overhead without compromising prediction accuracy. We also study the accuracy with different input features (those that can be measured by medical devices and those that can be measured by smart devices) and find that removing some features has little or no influence on prediction accuracy. Since insufficient or unbalanced training data decreases prediction accuracy, we further propose a Generative Adversarial Network (GAN) based predictor that produces synthetic data (close to real data) for ML training. Our extensive experiments show the effectiveness of our methods in improving detection accuracy. Our results can provide guidance on developing coronavirus infection predictors based on different data sources and devices. We have open-sourced our code on GitHub.
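
As an illustration of the Chi-square feature selection step mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes categorical (here binary) features and computes the Pearson chi-square statistic between each feature and the infection label, then keeps the highest-scoring features. The feature names, the toy data, and the helper names `chi_square_score` and `select_top_k` are hypothetical.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic for one categorical feature vs. class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))   # counts per (feature value, label) cell
    feature_totals = Counter(feature)          # row totals of the contingency table
    label_totals = Counter(labels)             # column totals of the contingency table
    score = 0.0
    for f_val, f_count in feature_totals.items():
        for l_val, l_count in label_totals.items():
            # Expected count if the feature and the label were independent.
            expected = f_count * l_count / n
            diff = observed.get((f_val, l_val), 0) - expected
            score += diff * diff / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-square score."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data (assumption): 1 = present / positive, 0 = absent / negative.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "fever":    [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly associated with the label
    "cough":    [1, 1, 1, 0, 1, 0, 0, 0],  # partially associated
    "headache": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

selected = select_top_k(columns, labels, k=2)
print(selected)
```

A higher score means a stronger statistical association with the label, so features that score near zero are candidates for removal. Continuous measurements such as respiratory rate would need to be binned before this statistic applies, and the number of features to keep (`k`) is a tuning choice.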
