Respiratory diseases are among the leading causes of death worldwide and severely affect patients' quality of life. Auscultation is an essential, low-cost, and convenient method for diagnosing respiratory diseases. However, it requires highly experienced experts, and medical trainees are prone to misdiagnosis. To address this issue, a novel machine learning model is proposed, consisting of an upsampling convolutional neural network (CNN), a long short-term memory (LSTM) network, and a fully connected neural network (FCNN) with embedding layers, which classifies respiratory sounds into seven categories: Normal (N), Rhonchi (R), Wheeze (W), Stridor (S), Coarse Crackle (CC), Fine Crackle (FC), and Wheeze & Crackle (WC). The model is trained and evaluated on the SPRSound dataset; on the test set it achieves a sensitivity of 0.5716, a specificity of 0.7882, an average score of 0.6799, a harmonic score of 0.6626, and a total score of 0.6756.
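The reported average and harmonic scores can be related to the sensitivity and specificity with a short calculation. The sketch below assumes that the average score is the arithmetic mean and the harmonic score is the harmonic mean of sensitivity and specificity; the abstract does not state these definitions, and the function names are illustrative. Under that assumption the reported values are reproduced to within rounding. The total score is not recomputed here because its definition is not given in the abstract.

```python
def average_score(sensitivity: float, specificity: float) -> float:
    """Arithmetic mean of sensitivity and specificity (assumed definition)."""
    return (sensitivity + specificity) / 2


def harmonic_score(sensitivity: float, specificity: float) -> float:
    """Harmonic mean of sensitivity and specificity (assumed definition)."""
    return 2 * sensitivity * specificity / (sensitivity + specificity)


# Values reported in the abstract for the test set.
se, sp = 0.5716, 0.7882

avg = average_score(se, sp)
har = harmonic_score(se, sp)

print(f"average score:  {avg:.4f}")
print(f"harmonic score: {har:.4f}")
```

Because the harmonic mean penalizes imbalance between its two inputs, the harmonic score falls below the average score whenever sensitivity and specificity differ, as they do here.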