Thyroid cancer, the only cancer for which age is used as a specific predictor of survival, is increasing in incidence yet has a low mortality rate, a combination that can lead to overdiagnosis and overtreatment. We developed an age-stratified deep learning (DL) model (hereafter, ASMCNet) for classifying thyroid nodules. We aimed to investigate the effect of age stratification on the accuracy of a DL model and to explore how ASMCNet can help radiologists improve diagnostic performance and avoid unnecessary biopsies. In this retrospective study, we used ultrasound images from three hospitals; a total of 10,391 images from 5934 patients were used for training, validation, and testing. On the test data set, the performance of ASMCNet was compared with that of a model trained without age stratification and with that of radiologists with different experience levels, using the DeLong method. The area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity of ASMCNet were 0.906, 86.1%, and 85.1%, respectively, which exceeded those of the model trained without age stratification (0.867, 83.2%, and 75.5%, respectively; p < 0.001) and were higher than those of all of the radiologists (p < 0.001). Reader studies showed that radiologists' performance improved when they were assisted by the explanatory heatmaps (p < 0.001). Our study demonstrates that age stratification can further improve the performance of DL-based thyroid tumor classification models, which also suggests that age is an important factor in the diagnosis of thyroid tumors. The ASMCNet model shows promising clinical applicability and can assist radiologists in improving diagnostic accuracy.
Question Age is crucial for the prognosis of differentiated thyroid carcinoma (DTC), yet its impact on diagnosis has been little studied.
Findings Adding age stratification to DL models can further improve the accuracy of thyroid nodule diagnosis.
Clinical relevance The age-stratified multimodal classification network is a reliable tool for helping radiologists diagnose thyroid nodules, and integrating it into clinical practice can improve diagnostic accuracy and reduce unnecessary biopsies or treatments.
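For readers less familiar with the reported metrics, the following is a minimal sketch of how AUROC, sensitivity, and specificity can be computed from a classifier's malignancy scores. It is not the study's evaluation code and does not implement the DeLong test. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def auroc(labels, scores):
    """AUROC as the probability that a malignant case scores higher
    than a benign one; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called malignant.
    The default threshold of 0.5 is an illustrative assumption."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = malignant, 0 = benign (not from the study).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores)
print(f"AUROC={auroc(labels, scores):.3f}, "
      f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The pairwise formulation is used here because it needs no external library and makes the meaning of AUROC explicit; on a real test set a library routine would normally be preferred.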