Classification for thyroid nodule using ViT with contrastive learning in ultrasound images

Jiawei Sun, Bobo Wu, Tong Zhao, Liugang Gao, Kai Xie, Tao Lin, Jianfeng Sui, Xiaoqin Li, Xiaojin Wu, Xinye Ni

doi:10.1016/j.compbiomed.2022.106444

Abstract

The lack of discriminative features separating benign nodules, especially those at level 3 of the Thyroid Imaging Reporting and Data System (TI-RADS), from malignant nodules limits diagnostic accuracy, leading to inconsistent interpretation, overdiagnosis, and unnecessary biopsies. We propose TC-ViT, a Vision Transformer (ViT)-based thyroid nodule classification model that uses contrastive learning to improve diagnostic accuracy and the specificity of biopsy recommendations. ViT captures the global features of thyroid nodules well. Nodule images are used as regions of interest (ROIs) to enhance the local features available to the ViT. Contrastive learning minimizes the representation distance between nodules of the same category, enhances the representation consistency of global and local features, and enables accurate diagnosis of TI-RADS 3 or malignant nodules. In testing, TC-ViT achieves an accuracy of 86.9%. The evaluation metrics show that the network outperforms other classical deep learning-based networks in classification performance. TC-ViT can automatically classify TI-RADS 3 and malignant nodules on ultrasound images. It can also serve as a key step in computer-aided diagnosis for comprehensive analysis and accurate diagnosis. The code will be available at https://github.com/Jiawei217/TC-ViT.
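
To make the contrastive objective concrete, the following is a minimal sketch of a generic supervised contrastive loss, in which embeddings that share a class label are pulled together and all others are pushed apart. The abstract does not specify the loss used in TC-ViT, so the formulation, the function names `l2_normalize` and `supervised_contrastive_loss`, and the `temperature` value are illustrative assumptions rather than the authors' implementation.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (an assumed form, not taken from the paper).

    embeddings: list of feature vectors, e.g. from whole images and nodule ROIs.
    labels: one class label per embedding (e.g. 0 = TI-RADS 3, 1 = malignant).
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives are all other samples that share the anchor's label.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_sim = max(sims.values())
        log_denominator = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        # Average negative log-probability of picking a positive for this anchor.
        total -= sum(sims[p] - log_denominator for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy 2-D embeddings: the first two point one way, the last two another.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy case the loss is small when the labels agree with the geometry of the embeddings and large when they do not, which is the pressure that draws same-category nodules together. Under the same assumption, giving the global-image embedding and the ROI embedding of one nodule the same label would also encourage the global and local representations to agree.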

Full Text