Kidney Tumor Classification on CT images using Self-supervised Learning

Erdal Özbay, Feyza Altunbey Özbay, Farhad Soleimanian Gharehchopogh

doi:10.1016/j.compbiomed.2024.108554

Abstract

Kidney tumors are among the most common diseases worldwide. The risk of kidney disease increases with factors such as the consumption of ready-made food and unhealthy habits. Early diagnosis of kidney tumors is essential for effective treatment, fewer side effects, and lower mortality. As computer-aided diagnostic methods develop, the need for accurate renal tumor classification is also growing. Traditional methods based on manual detection are time-consuming, tedious, and costly, whereas deep learning (DL) methods can perform high-accuracy kidney tumor detection (KTD) faster and at lower cost. Current challenges in artificial intelligence-assisted KTD include obtaining more precise information and classifying with high accuracy, both of which are vital to clinical decision-making and to current treatment. This motivates us to propose a more effective DL model that can assist specialist physicians in diagnosing kidney tumors. Such a model can reduce the workload of radiologists and prevent errors in clinical diagnosis that arise from the complex structure of the kidney. Training the developed methods requires a large amount of data. Although various studies have used feature selection techniques to reduce the amount of data needed, these techniques yield little improvement in classification accuracy. In this paper, a masked autoencoder (MAE) is proposed for KTD that produces effective results on datasets containing few samples and can be pre-trained and fine-tuned directly. Self-supervised learning (SSL) is achieved through self-distillation (SD), which is reintroduced into the configuration loss calculation using masked patches. In the proposed method, SSLSD-KTD, the SD loss is calculated on the latent representations of the encoder and decoder outputs.
The encoder obtains local attention, while the decoder transfers its global attention to calculate losses. The SSLSD-KTD method reached 98.04% classification accuracy on the KAUH-kidney dataset, which contains 8400 samples, and 82.14% on the CT-kidney dataset, which contains 840 samples. Adding external information to SSLSD-KTD through transfer learning raised accuracy to 99.82% and 95.24% on the same datasets, respectively. The experimental results show that SSLSD-KTD can effectively extract kidney tumor features from limited data and can serve as an aid, or even an alternative, for radiologists in diagnostic decision-making.
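
The loss structure described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not give the exact form of the SD loss, so the cosine-distance form below, the mean-squared-error term over masked patches (the standard MAE reconstruction term), the 0.75 mask ratio, the `sd_weight` parameter, and all function names are assumptions made for illustration only.

```python
import numpy as np

def random_patch_mask(num_patches, mask_ratio, rng):
    """Return a boolean array in which True marks a masked patch."""
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask

def reconstruction_loss(pred, target, mask):
    """Mean squared error computed over the masked patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=-1)
    return per_patch[mask].mean()

def self_distillation_loss(encoder_latent, decoder_latent):
    """One minus the mean cosine similarity of the two latents (assumed form)."""
    e = encoder_latent / np.linalg.norm(encoder_latent, axis=-1, keepdims=True)
    d = decoder_latent / np.linalg.norm(decoder_latent, axis=-1, keepdims=True)
    return 1.0 - (e * d).sum(axis=-1).mean()

def total_loss(pred, target, mask, encoder_latent, decoder_latent, sd_weight=1.0):
    """Masked reconstruction term plus a weighted SD term (hypothetical weighting)."""
    return (reconstruction_loss(pred, target, mask)
            + sd_weight * self_distillation_loss(encoder_latent, decoder_latent))

# Toy usage: 16 patches with 8 features each, random stand-ins for model outputs.
rng = np.random.default_rng(0)
mask = random_patch_mask(num_patches=16, mask_ratio=0.75, rng=rng)
target = rng.normal(size=(16, 8))
pred = target + 0.1 * rng.normal(size=(16, 8))
enc_lat = rng.normal(size=(16, 4))
dec_lat = enc_lat + 0.1 * rng.normal(size=(16, 4))
print(total_loss(pred, target, mask, enc_lat, dec_lat))
```

In a real pipeline the predictions and latents would come from a vision transformer encoder and decoder operating on CT image patches; here random arrays stand in for them so that the loss arithmetic can be inspected in isolation.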

Full Text