Kidney CT Scan Image Classification Using Modified Vision Transformer

Roshan Subedi, Suresh Timilsina, Smita Adhikari

doi:10.3126/jes2.v2i1.60381


Open Access

https://doi.org/10.3126/jes2.v2i1.60381


Abstract

With the rising number of kidney-related health issues, early and precise diagnosis is crucial. This study aims to create a reliable method for categorizing kidney CT scan images into four groups: Cyst, Normal, Tumor, and Stone. Traditional approaches usually rely on classical Machine Learning (ML) and Convolutional Neural Networks (CNNs). In this research, the potential of the Vision Transformer (ViT) is explored instead. ViT adapts the Transformer architecture, which was initially designed for Natural Language Processing (NLP) tasks, to images, and it shows promise for medical image classification. ViT's capabilities are enhanced by coupling it with Fully Connected Networks (FCN). This combination merges the feature extraction capability of the ViT with the classification ability of the FCN to address the challenge of detecting kidney-related conditions.
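The pipeline the abstract describes (image patches, a feature extractor, then a fully connected classifier over four classes) can be illustrated with a minimal NumPy sketch. This is not the authors' model: the encoder here is only a linear patch embedding followed by mean pooling, standing in for the ViT's self-attention blocks, and all names and sizes (`patchify`, `classify`, `params`, a 64x64 input, 16x16 patches, layer widths) are assumptions chosen for illustration. The weights are random, so the prediction is meaningless; only the data flow is of interest.

```python
import numpy as np

CLASSES = ["Cyst", "Normal", "Tumor", "Stone"]  # the four groups named in the abstract


def patchify(image, patch):
    """Split an (H, W) image into flattened, non-overlapping patch x patch tiles."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image size must be divisible by patch size"
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch * patch)  # (num_patches, patch * patch)


def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def classify(image, params, patch=16):
    # Stand-in for the ViT encoder (assumption): linear patch embedding, then mean pooling.
    # A real ViT would apply self-attention blocks between these two steps.
    tokens = patchify(image, patch) @ params["embed"]  # (num_patches, dim)
    features = tokens.mean(axis=0)                     # (dim,)
    # Fully connected head: one hidden ReLU layer, then four class logits.
    hidden = np.maximum(0.0, features @ params["w1"] + params["b1"])
    logits = hidden @ params["w2"] + params["b2"]
    return softmax(logits)


rng = np.random.default_rng(0)
dim, hidden_dim, patch = 32, 16, 16  # hypothetical sizes, not taken from the paper
params = {
    "embed": rng.normal(0, 0.02, (patch * patch, dim)),
    "w1": rng.normal(0, 0.1, (dim, hidden_dim)),
    "b1": np.zeros(hidden_dim),
    "w2": rng.normal(0, 0.1, (hidden_dim, len(CLASSES))),
    "b2": np.zeros(len(CLASSES)),
}
image = rng.random((64, 64))  # placeholder for a grayscale CT slice
probs = classify(image, params, patch)
predicted = CLASSES[int(np.argmax(probs))]
```

In a trained system the encoder and head weights would be learned from labeled CT images, and `probs` would hold the model's confidence for each of the four classes.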
