Abstract
Brain tumor classification from Magnetic Resonance Imaging (MRI) scans is an important task in medical imaging, as it informs the choice of treatment strategy and can improve patient outcomes. Brain tumors, including gliomas, meningiomas, and glioblastomas, are among the most lethal forms of cancer. This study explores the potential of replacing convolutional neural networks (CNNs) with Vision Transformers (ViTs) for classifying brain tumors from MRI images. It focuses on the pretrained ViT-B16 model and compares it with traditional CNNs, including VGG16, ResNet-50, and EfficientNet-B0. ViT-B16, pretrained on ImageNet-21k, achieves improved accuracy, precision, recall, and F1-score after being fine-tuned on the brain tumor dataset with data augmentation. The self-attention mechanism lets ViTs capture long-range dependencies and global context in the images, which contributes significantly to this performance. The results show that ViTs can handle complex datasets efficiently and are a useful tool for medical image classification. This paper emphasizes the potential of Vision Transformers to improve classification accuracy in the diagnosis of brain tumors. Future work can focus on exploring better ViT architectures and data augmentation techniques.
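To illustrate how self-attention relates distant image regions, the following is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not the paper's implementation: the patch embeddings are made-up values, and the queries, keys, and values are assumed to be the embeddings themselves (identity projections), whereas an actual ViT learns separate projection matrices and uses multiple heads.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: queries, keys, and values are the
    token embeddings themselves (no learned projections).
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in tokens:
        # Compare this token with every token in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Output is the attention-weighted average of all tokens.
        outputs.append([
            sum(w * value[d] for w, value in zip(row, tokens))
            for d in range(dim)
        ])
    return outputs, weights


# Hypothetical 2-D embeddings for four image patches.
# Patches 0 and 2 are identical but not adjacent in the sequence.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
outputs, weights = self_attention(patches)
```

In this toy input, patch 0 attends to patch 2 exactly as strongly as to itself, regardless of their positions in the sequence. This is the sense in which attention captures global context: each patch is compared with every other patch in a single step, rather than only with neighbors inside a local receptive field.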