SRT: Swin-residual transformer for benign and malignant nodules classification in thyroid ultrasound images

Long Huang, Yanran Xu, Shuhuan Wang, Liang Sang, He Ma

doi:10.1016/j.medengphy.2024.104101

Abstract

With the advancement of deep learning, computer-aided diagnosis (CAD) is playing an increasing role in medical diagnosis. In particular, the emergence of Transformer-based models has widened the application of computer vision in medical image processing. In thyroid disease, the diagnosis of benign and malignant thyroid nodules based on the TI-RADS classification is strongly influenced by the subjective judgment of ultrasonographers, and it also imposes an extremely heavy workload on them. To address this, we propose the Swin-Residual Transformer (SRT), which incorporates residual blocks and triplet loss into the Swin Transformer (SwinT). It improves sensitivity to both global and local features of thyroid nodules and better distinguishes small feature differences. In our exploratory experiments, the SRT model achieves an accuracy of 0.8832 and an AUC of 0.8660, outperforming state-of-the-art convolutional neural network (CNN) and Transformer models. Ablation experiments further demonstrate that introducing residual blocks and triplet loss improves performance on the thyroid nodule classification task. These results validate the potential of the proposed SRT model to improve the diagnosis of thyroid nodules from ultrasound images. They also offer a feasible basis for avoiding excessive puncture sampling of thyroid nodules in future clinical diagnosis.
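
To make the role of the triplet loss concrete, the following is a minimal sketch of the standard margin-based formulation, which penalizes an anchor embedding that lies closer to a sample of the other class than to a sample of its own class. The abstract does not specify the exact variant used in SRT, so the Euclidean distance, the margin value, and the function names `euclidean` and `triplet_loss` are illustrative assumptions rather than the authors' implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (assumed form, not confirmed by the paper).

    anchor, positive: embeddings of two samples from the same class
    negative: embedding of a sample from the other class
    margin: hypothetical margin; the value used in SRT is not given in the abstract
    """
    d_ap = euclidean(anchor, positive)  # distance to the same-class sample
    d_an = euclidean(anchor, negative)  # distance to the other-class sample
    # The loss is zero once the negative is at least `margin` farther than the positive.
    return max(d_ap - d_an + margin, 0.0)


# A well-separated triplet contributes no loss; a hard one does.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5])
```

In this sketch, only triplets whose negative sample is not yet sufficiently far from the anchor produce a gradient signal, which is the mechanism by which such a loss encourages a model to separate classes with small feature differences.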

Full Text