“You don't sound well, you should take the day off”: Automatic detection of upper respiratory tract infections from speech using time-frequency domain deep convolutional neural network

Pankaj Warule, Siba Prasad Mishra, Suman Deb, Jarek Krajewski

doi:10.1016/j.apacoust.2024.109980

Abstract

The acoustic-prosodic qualities of a speech signal are influenced by various health-related factors, owing to their complex and intricate nature. In the speech and health domain, machine learning research is active and expanding, with a focus on devising paradigms to objectively extract and measure such effects. Biomedical engineering holds great promise for developing non-invasive diagnostic procedures that use the voice as a means of assessment. Screening speech signals for upper respiratory tract infections (URTI), such as the common cold, may help limit their transmission. In this study, we propose a novel time-frequency domain deep convolutional neural network for URTI detection from speech. The time-frequency representation of the speech signal is obtained using the Chirplet transform, and a deep convolutional neural network then classifies the time-frequency representations of healthy and URTI speech signals. The effectiveness of the proposed approach is assessed on the URTIC database. We achieve an unweighted average recall (UAR) of 68.97% and 67.34% on the development and test sets of the URTIC database, respectively.
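The reported metric, unweighted average recall (UAR), is the mean of the per-class recalls, so each class counts equally regardless of how many recordings it contains. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name, the label strings, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not from the URTIC database): an imbalanced set
# where a classifier predicts "healthy" for every recording.
y_true = ["healthy"] * 8 + ["urti"] * 2
y_pred = ["healthy"] * 10

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case the recall is 1.0 for the healthy class and 0.0 for the URTI class, so the UAR is 0.5 even though plain accuracy is 0.8. This is why UAR is a common choice when one class is much rarer than the other.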

Full Text