Abstract

Lung diseases often cause severe damage to the respiratory tract and carry a high risk of mortality within a short period of time. Deep learning (DL) models based on the Vision Transformer (ViT) are considered to have promising advantages over convolutional neural network (CNN) architectures in computational efficiency and accuracy when trained on large datasets such as ImageNet. In this study, we present a new DL approach that combines a CNN with a ViT to improve the efficiency of pneumonia diagnosis from medical images. In the first stage, raw images are passed through a local filter that captures local relations in the inputs. The local filter block consists of two convolutional layers with 3 × 3 kernels. This local filtering aims to enrich the features before they are fed into the patching layer of the ViT block. The proposed method is evaluated on a benchmark chest X-ray dataset and compared with several well-known models: ViT, VGG19, ResNet50, and DenseNet201. Experimental results show that the proposed CNN and ViT approach achieves about 1% higher accuracy than the standard ViT model and about 2% higher accuracy than VGG19, ResNet50, and DenseNet201, while using a smaller model architecture.
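
The data flow described above (two 3 × 3 convolutions, then patching) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the function names, the ReLU activation, the zero padding, the single channel, the averaging kernels, and the patch size of 4 are all assumptions chosen only to make the shapes concrete. A real model would use learned multi-channel filters followed by a full ViT encoder.

```python
from typing import List

Image = List[List[float]]


def conv3x3_same(image: Image, kernel: Image) -> Image:
    """Apply one 3x3 filter with zero padding so the output keeps the input size."""
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total = 0.0
            for d_row in (-1, 0, 1):
                for d_col in (-1, 0, 1):
                    r, c = row + d_row, col + d_col
                    # Pixels outside the image count as zero (zero padding).
                    if 0 <= r < height and 0 <= c < width:
                        total += image[r][c] * kernel[d_row + 1][d_col + 1]
            # ReLU is an assumed activation; the abstract does not name one.
            output[row][col] = max(total, 0.0)
    return output


def local_filter(image: Image, kernel_a: Image, kernel_b: Image) -> Image:
    """Two stacked 3x3 convolutions, mirroring the two-layer local filter block."""
    return conv3x3_same(conv3x3_same(image, kernel_a), kernel_b)


def to_patches(feature_map: Image, patch_size: int) -> List[List[float]]:
    """Split a feature map into non-overlapping, flattened patches (ViT-style tokens)."""
    height, width = len(feature_map), len(feature_map[0])
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = [feature_map[top + i][left + j]
                     for i in range(patch_size) for j in range(patch_size)]
            patches.append(patch)
    return patches


# Toy 8x8 "image" and placeholder kernels (a trained model would learn these).
toy_image = [[float((r * 8 + c) % 5) for c in range(8)] for r in range(8)]
mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

features = local_filter(toy_image, mean_kernel, mean_kernel)
tokens = to_patches(features, patch_size=4)
print(len(tokens), len(tokens[0]))  # prints: 4 16
```

Because the local filter preserves the spatial size, the patching step sees a feature map with the same height and width as the raw image; here the 8 × 8 input yields four 4 × 4 patches, each flattened to 16 values before it would be embedded by the ViT block.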

