Abstract

In this research, we developed a two-stage deep learning (DL) model based on the Vision Transformer (ViT) to detect COVID-19 and assess its severity from thoracic CT images. In the first stage, we used a pre-trained ViT model (ViT_B/32) and a custom CNN model to classify CT images as COVID-19 or non-COVID-19. The ViT model achieved superior performance, with a fivefold cross-validated accuracy of 99.7%, compared with the custom CNN's 98%. In the second stage, we employed a ViT-based U-Net model (Vision Transformer for Biomedical Image Segmentation, VITBIS) to segment lung and infection regions in COVID-19-positive CT images and thereby determine infection severity. This model uses transformers with attention mechanisms in both the encoder and the decoder. The lung segmentation network achieved an Intersection over Union (IoU) of 95.8% and a sensitivity of 99.67%, while the lesion segmentation network attained an IoU of 94% and a sensitivity of 98.3%.
