Abstract

This study introduces a multimodal deep learning model for Covid-19 detection that combines chest X-ray images with cough sound features. An attention layer weights the contributions of the two feature streams to enhance detection. Integrating time- and frequency-domain cough features with chest X-ray images represents a significant advance in Covid-19 detection methodology, offering promising results and potential applications in the medical field. The proposed model modifies VGG16 and a faster region-based convolutional neural network (Faster R-CNN): the modified VGG16 extracts features from both datasets, while the modified Faster R-CNN performs detection. Using VGG16 for feature extraction and Faster R-CNN for Covid-19 detection yields a robust and efficient approach. We evaluated the proposed methodology on a dataset of 2059 Covid-19-positive chest X-ray images and a diverse collection of 25,000 cough recordings, including 716 positive, 7693 negative, and 1591 symptomatic cases. The datasets were split 80% for training and 20% for testing. The model achieves 99.80% accuracy, an F1-score of 99.70%, and a validation loss of 0.0132. We further performed an extensive ablation study on the features. Compared against existing networks on the same chest X-ray and cough datasets, the proposed model outperforms them. Experiments also show that the trained model is robust to variations in training data quantity and quality and can effectively detect Covid-19 from chest X-rays and coughs.
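The abstract describes an attention layer that weights the feature streams from the two modalities before detection. The paper does not give the exact formulation, so the following is only a minimal sketch of one plausible attention-based fusion: each modality's (stand-in) feature vector is scored against a learned query vector, the scores are softmax-normalized into modality weights, and the fused representation is their weighted sum. All names (`attention_fuse`, `w_q`) and the use of random vectors in place of real VGG16 and cough features are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_fuse(xray_feat, cough_feat, w_q):
    """Hypothetical modality-level attention fusion.

    Scores each modality's feature vector against a learned query
    w_q, softmax-normalizes the scores into modality weights, and
    returns the weighted sum of the two feature vectors.
    """
    feats = np.stack([xray_feat, cough_feat])  # shape (2, d)
    scores = feats @ w_q                       # one score per modality
    weights = softmax(scores)                  # modality attention weights
    return weights @ feats                     # fused feature, shape (d,)

# Stand-ins for features a modified VGG16 might produce (assumed d=512).
rng = np.random.default_rng(0)
d = 512
xray = rng.normal(size=d)   # placeholder X-ray features
cough = rng.normal(size=d)  # placeholder time/frequency cough features
w_q = rng.normal(size=d)    # placeholder learned query vector

fused = attention_fuse(xray, cough, w_q)
print(fused.shape)  # (512,)
```

In a full pipeline, the fused vector would feed the detection head; here the weights are data-dependent, so an uninformative modality can be down-weighted per example rather than fixed by a hand-tuned ratio.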
