Abstract
The sound of a cough is an important indicator of the condition of the respiratory system, and automatic cough sound evaluation can aid the diagnosis of respiratory diseases. Large crowdsourced cough sound datasets have recently been used by several groups around the world to develop cough classification models. However, not all recordings in these datasets contain cough sounds, so it is important to screen the recordings for the presence of cough sounds before developing cough classification models. This work proposes a deep learning method to screen crowdsourced audio recordings for cough sounds. The proposed approach divides each audio recording into overlapping frames and converts each frame into a mel-spectrogram representation. A pretrained convolutional neural network for audio classification is trained to learn the spectral characteristics of cough and non-cough frames from their mel-spectrogram representations, and is combined with a recurrent neural network to learn the dependencies across the sequence of frames. The proposed method is evaluated on 400 crowdsourced audio recordings manually annotated as cough or non-cough, achieving an accuracy of 0.9800 (AUC of 0.9973) in classifying cough and non-cough recordings. The trained network is then used to analyze the remaining audio recordings in the dataset, identifying only about 67% of recordings as containing usable cough sounds. This highlights the need to exercise caution when using crowdsourced cough data.
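The following is a minimal sketch of the pipeline described above: splitting a recording into overlapping frames, computing a log-mel-spectrogram per frame, and scoring the frame sequence with a per-frame CNN followed by a recurrent layer. The frame length, hop size, mel parameters, and the small CNN/GRU used here are illustrative assumptions; the paper uses a pretrained audio-classification CNN and its own hyperparameters, which are not specified in the abstract.

```python
# Sketch of the frame -> mel-spectrogram -> CNN + RNN screening pipeline.
# All sizes (1 s frames, 0.5 s hop, 64 mel bands, GRU width) are assumptions.
import torch
import torch.nn as nn
import torchaudio

def to_mel_frames(waveform, sample_rate=16000, frame_sec=1.0, hop_sec=0.5):
    """Split a mono waveform into overlapping frames and return a log
    mel-spectrogram per frame, shape (num_frames, n_mels, time)."""
    frame_len = int(frame_sec * sample_rate)
    hop_len = int(hop_sec * sample_rate)
    frames = waveform.unfold(0, frame_len, hop_len)        # (num_frames, frame_len)
    mel = torchaudio.transforms.MelSpectrogram(
        sample_rate=sample_rate, n_fft=1024, hop_length=256, n_mels=64)
    return torch.log(mel(frames) + 1e-6)                    # (num_frames, 64, T)

class FrameSequenceClassifier(nn.Module):
    """Per-frame CNN encoder followed by a GRU over the frame sequence;
    a stand-in for the pretrained audio CNN + RNN combination."""
    def __init__(self, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())           # -> (num_frames, 32)
        self.rnn = nn.GRU(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)                     # cough vs. non-cough

    def forward(self, mel_frames):                           # (num_frames, 64, T)
        feats = self.cnn(mel_frames.unsqueeze(1))            # per-frame embeddings
        _, h = self.rnn(feats.unsqueeze(0))                  # model frame dependencies
        return torch.sigmoid(self.head(h[-1])).squeeze()     # recording-level score

# Usage with a dummy 5-second recording:
wave = torch.randn(5 * 16000)
score = FrameSequenceClassifier()(to_mel_frames(wave))
```

In this sketch the GRU's final hidden state summarizes the whole frame sequence before the recording-level decision; thresholding the score would flag recordings without usable cough sounds.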