Purpose
Tuberculosis (TB) is a widespread infectious disease that requires early detection for effective treatment and control. This study aims to improve TB detection from cough audio by comparing the performance of capsule networks with that of other deep learning models.

Methods
We used 9772 cough audio recordings from 1105 individuals with a new or worsening cough lasting at least two weeks. The recordings were converted into spectral images, from which histogram of oriented gradients (HOG) features were extracted. Several models, including Capsule Networks + FCNN, CNN, VGG16, and ResNet50, were trained and evaluated.

Results
Capsule Networks + FCNN achieved the best performance, with an accuracy of 0.97, sensitivity of 0.98, specificity of 0.96, F1 score of 0.97, and precision of 0.97, outperforming the other models. We attribute this to the model’s ability to learn complex features from spectral images.

Conclusions
Capsule Networks are more effective than typical CNN-based models at detecting TB from cough audio. This suggests that advanced deep learning frameworks could significantly improve TB screening accuracy, especially in resource-limited areas.
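
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, specificity, precision, and F1 score are conventionally derived from the four cells of a binary confusion matrix. The function name `screening_metrics` and the counts in the usage lines are hypothetical and purely illustrative; they are not the study's data, and the abstract does not state how the authors computed their figures.

```python
def screening_metrics(tp, fn, tn, fp):
    """Compute standard binary screening metrics from confusion-matrix counts.

    tp: TB-positive recordings classified as positive (true positives)
    fn: TB-positive recordings classified as negative (false negatives)
    tn: TB-negative recordings classified as negative (true negatives)
    fp: TB-negative recordings classified as positive (false positives)
    """
    total = tp + fn + tn + fp
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0 or (tp + fp) == 0:
        raise ValueError("each metric needs a non-zero denominator")

    sensitivity = tp / (tp + fn)   # share of true TB cases that are detected
    specificity = tn / (tn + fp)   # share of non-TB cases correctly cleared
    precision = tp / (tp + fp)     # share of positive calls that are correct
    accuracy = (tp + tn) / total   # share of all calls that are correct

    if precision + sensitivity == 0:
        raise ValueError("F1 is undefined when there are no true positives")
    # F1 is the harmonic mean of precision and sensitivity (recall).
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    for name, value in screening_metrics(tp=45, fn=5, tn=40, fp=10).items():
        print(f"{name}: {value:.3f}")
```

In a screening setting, sensitivity and specificity are usually read together: the first reflects how many TB cases would be missed, while the second reflects how many people without TB would be referred for further testing unnecessarily.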