Abstract

This work develops a robust classifier for a COVID-19 pre-screening model from crowdsourced cough sound data. The crowdsourced cough recordings contain a variable number of coughs, and some recordings are more informative than others. Accurate detection of COVID-19 from these datasets requires overcoming two main challenges: (i) the variable number of coughs in each recording and (ii) the low number of COVID-positive cases compared to healthy coughs in the data. We use two open datasets of crowdsourced cough recordings and segment each recording into non-overlapping coughs. This segmentation enriches the original data and increases the number of samples in the minority class (COVID-19) without the change in the feature distribution of the COVID-19 samples that results from applying oversampling techniques. Each cough sound segment is transformed into six image representations for further analysis. We conduct extensive experiments with shallow machine learning, Convolutional Neural Network (CNN), and pre-trained CNN models. We compare the results of our models with those of other recently published papers that apply machine learning to cough sound data for COVID-19 detection. Using an ensemble model, our method achieved the following performance on the testing dataset: area under the receiver operating characteristic curve = 0.77, precision = 0.80, recall = 0.71, F1 measure = 0.75, and Kappa = 0.53. The results show an improvement in the prediction accuracy of our COVID-19 pre-screening model compared to the other models.
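The idea of splitting one recording into non-overlapping segments can be illustrated with a minimal sketch. The abstract does not say how the coughs are located, so the amplitude threshold used here is an assumption, not the authors' method. The function name `split_into_segments` and the parameters `threshold` and `min_gap` are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def split_into_segments(
    samples: Sequence[float],
    threshold: float = 0.1,   # assumed amplitude cut-off, not from the paper
    min_gap: int = 3,         # assumed minimum quiet run that ends a segment
) -> List[Tuple[int, int]]:
    """Return non-overlapping (start, end) index pairs, end exclusive.

    A segment is a run of samples whose absolute amplitude reaches the
    threshold. Quiet gaps shorter than `min_gap` samples are bridged so
    that a single cough is not cut into pieces.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None  # index where the open segment began
    quiet = 0                    # length of the current quiet run
    for i, value in enumerate(samples):
        if abs(value) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_gap:
                # Close the segment at the first sample of the quiet run.
                segments.append((start, i - quiet + 1))
                start, quiet = None, 0
    if start is not None:
        # The recording ended with a segment open; trim trailing quiet samples.
        segments.append((start, len(samples) - quiet))
    return segments


# Toy "recording" with two bursts separated by a long quiet gap.
recording = [0, 0, 0.5, 0.6, 0, 0.4, 0, 0, 0, 0, 0.9, 0.8, 0.7, 0, 0]
print(split_into_segments(recording))  # [(2, 6), (10, 13)]
```

Because each segment is a slice of the original signal, every new minority-class sample is real recorded audio rather than a synthetic or repeated sample, which is the property the abstract contrasts with oversampling.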

Highlights

  • This work develops a robust classifier for a COVID-19 pre-screening model from crowdsourced cough sound data

  • The results showed that the Convolutional Neural Network (CNN) model achieved COVID-19 sensitivity of 98.5% with a specificity of 94.2% (AUC: 0.97)

  • The main challenge faced in this work is how to utilize a crowdsourced cough dataset with diverse length, pacing, number of coughs, and stochastic background noise from publicly available COVID-19 cough sounds

Summary

Introduction

Sample collection with the nasopharyngeal (NP) swab is an invasive method and is not ideal for screening, prognostics, and longitudinal monitoring purposes, given that it requires close contact between healthcare providers and patients. The longitudinal monitoring and early pre-screening of individuals suspected of having COVID-19 could be improved substantially with new non-invasive and easy-to-implement approaches that patients can carry out themselves, efficiently and at low cost, without professional help. Given the difficulties and bottlenecks experienced so far around the world with the implementation of widespread testing, the ideal test procedure would, while maintaining a high level of accuracy (sensitivity and specificity), (a) allow patients to self-assess without the need for physical contact with healthcare professionals, (b) bring down the cost per test substantially (ideally close to zero), (c) eliminate the dependency of diagnostic kits on scarce materials, manufacturing capacity, and supply chain bottlenecks, and (d) be rapidly deployable around the world. The benefits of a digital COVID-19 test are significant enough to merit its pursuit.
