An ensemble learning approach to digital corona virus preliminary screening from cough sounds

Emad A Mohammed, Amir Sanati-Nezhad, Mohammad Keyhani, S Hossein Hejazi, Behrouz H Far

doi:10.1038/s41598-021-95042-2

Journal: Scientific Reports
Publication Date: Jul 28, 2021
Citations: 55
License type: open-access

Affiliation: University of Calgary

Abstract

This work develops a robust classifier for a COVID-19 pre-screening model from crowdsourced cough sound data. The crowdsourced cough recordings contain a variable number of coughs, and some recordings are more informative than others. Accurate detection of COVID-19 from these datasets requires overcoming two main challenges: (i) the variable number of coughs in each recording and (ii) the low number of COVID-positive cases compared to healthy coughs in the data. We use two open datasets of crowdsourced cough recordings and segment each recording into non-overlapping coughs. This segmentation enriches the original data and increases the number of minority-class (COVID-19) samples without oversampling, and therefore without the change in the feature distribution of the COVID-19 samples that oversampling techniques introduce. Each cough sound segment is transformed into six image representations for further analysis. We conduct extensive experiments with shallow machine learning, Convolutional Neural Network (CNN), and pre-trained CNN models. The results of our models are compared with other recently published papers that apply machine learning to cough sound data for COVID-19 detection. On the testing dataset, our ensemble model achieved an area under the receiver operating characteristic curve (AUC) of 0.77, precision = 0.80, recall = 0.71, F1 measure = 0.75, and Kappa = 0.53. The results show an improvement in the prediction accuracy of our COVID-19 pre-screening model compared to the other models.
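The idea of splitting one recording into several non-overlapping cough segments can be illustrated with a minimal sketch. The abstract does not say how the segmentation is performed, so the amplitude-threshold rule below, the function name `split_into_segments`, and its parameters `threshold`, `min_gap`, and `min_length` are all assumptions made for illustration and should not be read as the authors' method.

```python
from typing import List, Optional, Tuple


def split_into_segments(
    samples: List[float],
    threshold: float = 0.1,
    min_gap: int = 3,
    min_length: int = 2,
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of non-overlapping loud regions.

    Hypothetical rule: a sample is "active" when its absolute amplitude is
    at or above `threshold`. Active runs separated by fewer than `min_gap`
    quiet samples are merged, and runs spanning fewer than `min_length`
    samples are dropped. `end` is exclusive, as in Python slicing.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None  # start index of the run being built
    last_active = -1             # index of the most recent active sample
    for i, value in enumerate(samples):
        if abs(value) >= threshold:
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active >= min_gap:
            # Enough consecutive quiet samples: close the current run.
            if last_active - start + 1 >= min_length:
                segments.append((start, last_active + 1))
            start = None
    # Close a run that is still open at the end of the recording.
    if start is not None and last_active - start + 1 >= min_length:
        segments.append((start, last_active + 1))
    return segments


# Toy "recording" with two bursts; the second has a one-sample dip inside it.
recording = [0.0, 0.0, 0.5, 0.8, 0.3, 0.0, 0.0, 0.0,
             0.0, 0.6, 0.0, 0.7, 0.4, 0.0, 0.0, 0.0]
segments = split_into_segments(recording)  # [(2, 5), (9, 13)]
clips = [recording[a:b] for a, b in segments]
```

Each clip inherits the label of its source recording, so a COVID-positive recording with several coughs contributes several positive samples made only of real audio, with no synthetic samples added.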

Highlights

This work develops a robust classifier for a COVID-19 pre-screening model from crowdsourced cough sound data
Related work reported that a Convolutional Neural Network (CNN) model achieved COVID-19 sensitivity of 98.5% with a specificity of 94.2% (AUC: 0.97); the ensemble model in this work reached an AUC of 0.77 on its testing dataset
The main challenge in this work is how to make use of crowdsourced, publicly available COVID-19 cough recordings that vary in length, pacing, and number of coughs and that contain stochastic background noise
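The evaluation measures quoted in the abstract and highlights (precision, recall or sensitivity, specificity, F1, and Cohen's kappa) can all be derived from a binary confusion matrix. The sketch below shows the standard definitions. The function names and the example counts are hypothetical and are not taken from the paper, and AUC is left out because it needs per-sample scores rather than counts.

```python
from typing import Dict


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Standard metrics from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. both classes occur and
    at least one sample is predicted positive.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # also called sensitivity
    specificity = tn / (tn + fp)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1_score(precision, recall),
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = binary_metrics(tp=40, fp=10, fn=10, tn=40)

# Consistency check on the abstract's figures: precision 0.80 and
# recall 0.71 give an F1 of about 0.75.
reported_f1 = round(f1_score(0.80, 0.71), 2)
```

With the hypothetical counts, precision, recall, and specificity are all 0.8 and kappa is 0.6, which shows how kappa discounts the agreement a chance classifier would reach on balanced data.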

Summary

Introduction

This work develops a robust classifier for a COVID-19 pre-screening model from crowdsourced cough sound data. The results of our models were compared to other recently published papers that apply machine learning to cough sound data for COVID-19 detection. Sample collection with the nasopharyngeal (NP) swab is invasive and is not ideal for screening, prognostics, or longitudinal monitoring, given that it requires close contact between healthcare providers and patients. The longitudinal monitoring and early pre-screening of individuals suspected of having COVID-19 could be improved substantially with new non-invasive, easy-to-implement approaches that patients can carry out efficiently and at low cost themselves, without professional help. Given the difficulties and bottlenecks experienced so far around the world with the implementation of widespread testing, the ideal test procedure would, while maintaining a high level of accuracy (sensitivity and specificity), (a) allow patients to self-assess without the need for physical contact with healthcare professionals, (b) bring down the cost per test substantially (ideally close to zero), (c) eliminate the dependency of diagnostic kits on scarce materials, manufacturing capacity, and supply chain bottlenecks, and (d) be rapidly deployable around the world. The benefits of a digital COVID-19 test are significant enough to merit its pursuit.
