Abstract

Background: US-based diagnosis of thyroid nodules is subjective and influenced by radiologists' experience levels.

Purpose: To develop an artificial intelligence model based on American College of Radiology Thyroid Imaging Reporting and Data System characteristics for diagnosing thyroid nodules and identifying nodule characteristics (hereafter, MTI-RADS) and to compare the performance of MTI-RADS, radiologists, and a model trained on benign and malignant status based on surgical histopathologic analysis (hereafter, MDiag).

Materials and Methods: In this retrospective study, 1588 surgically proven nodules from 636 consecutive patients (mean age, 49 years ± 14 [SD]; 485 women) were included. MTI-RADS and MDiag were trained on US images of 1345 nodules (January 2018 to December 2019). The performance of MTI-RADS was compared with that of MDiag and radiologists with different experience levels on the test data set (243 nodules, January 2019 to December 2019) with the DeLong method and McNemar test.

Results: The area under the receiver operating characteristic curve (AUC) and sensitivity of MTI-RADS were 0.91 and 83% (55 of 66 nodules), respectively, which were not significantly different from those of experienced radiologists (0.93 [P = .45] and 92% [61 of 66 nodules; P = .07]) and exceeded those of junior radiologists (0.78 [P < .001] and 70% [46 of 66 nodules; P = .04]). The specificity of MTI-RADS (87% [154 of 177 nodules]) was higher than that of both experienced and junior radiologists (80% [141 of 177 nodules; P = .02] and 75% [133 of 177 nodules; P = .001], respectively). The AUC of MTI-RADS was higher than that of MDiag (0.91 vs 0.84, respectively; P = .001). In the test set of 243 nodules, the consistency rates between MTI-RADS and the experienced group were higher than those between MTI-RADS and the junior group for composition (79% [n = 193] vs 73% [n = 178], respectively; P = .02), echogenicity (75% [n = 183] vs 68% [n = 166]; P = .04), shape (93% [n = 227] vs 88% [n = 215]; P = .04), and smooth or ill-defined margin (72% [n = 174] vs 63% [n = 152]; P = .002).

Conclusion: The area under the receiver operating characteristic curve (AUC) of an artificial intelligence model based on the American College of Radiology Thyroid Imaging Reporting and Data System (TI-RADS) was higher than that of a model trained on benign and malignant status based on surgical histopathologic analysis. The AUC and sensitivity of the model based on TI-RADS exceeded those of junior radiologists; the specificity of the model was higher than that of both experienced and junior radiologists.

© RSNA, 2022.
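The reported sensitivity and specificity follow directly from the nodule counts given in the Results, and the paired comparisons with radiologists rest on the McNemar test. The Python sketch below reproduces the two percentages for MTI-RADS from the published counts and shows one common form of the test, the exact (binomial) McNemar test. The abstract does not say which variant the authors used, and it does not report the discordant-pair counts, so the function name `mcnemar_exact` and the values passed to it are illustrative assumptions, not study data.

```python
from math import comb

# Counts reported in the abstract for MTI-RADS on the 243-nodule test set.
MALIGNANT = 66         # nodules malignant at histopathology
BENIGN = 177           # nodules benign at histopathology
TRUE_POSITIVES = 55    # malignant nodules the model called malignant
TRUE_NEGATIVES = 154   # benign nodules the model called benign

sensitivity = TRUE_POSITIVES / MALIGNANT
specificity = TRUE_NEGATIVES / BENIGN


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P value.

    b and c are the discordant-pair counts: cases reader A classified
    correctly and reader B did not, and the reverse. Under the null
    hypothesis the smaller count follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    print(f"sensitivity = {sensitivity:.1%}")
    print(f"specificity = {specificity:.1%}")
    # Hypothetical discordant counts, for illustration only.
    print(f"McNemar P = {mcnemar_exact(3, 12):.3f}")
```

The test uses only the cases on which the two readers disagree, which is why paired per-nodule readings, not the marginal percentages above, are needed to reproduce the P values in the abstract.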
