Abstract
The ACR Thyroid Imaging, Reporting, and Data System (TI-RADS) uses a score based on ultrasound (US) imaging to stratify the risk of nodule malignancy and recommend appropriate follow-up. This study analyzes US reports and explores how Natural Language Processing (NLP) with Transformer models can classify the ACR TI-RADS category from free-text reports using the description of thyroid nodule features. This retrospective study evaluated 16,847 free-text thyroid reports from our institution. An automated system, followed by manual review by a radiologist, established baseline annotations by assigning ACR TI-RADS categories from 1 to 5. Two types of systems were evaluated and compared on the dataset: the first performed multiclass classification to predict the associated ACR TI-RADS category, and the second extracted thyroid nodule features from the textual reports and incorporated them into the classifier. Models enhanced with these features consistently outperformed those without. In particular, the BERTIN model with the additional features achieved the highest accuracy, 0.8426. Moreover, we found a correlation between the presence of punctate echogenic foci, a feature often linked to malignant thyroid lesions, and higher ACR TI-RADS scores. The features of the thyroid nodules described in thyroid US reports, such as composition, echogenicity, shape, margin, and echogenic foci, help the NLP classifier predict the associated ACR TI-RADS category more accurately.
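The second system type, in which nodule features are extracted from the report and passed to the classifier, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the extraction method, the report language, or how features are combined with the text. The keyword patterns, the `[FEATURES]` marker, and the function names `extract_features` and `augment_report` are hypothetical, and English keywords are used only for readability.

```python
import re

# Hypothetical keyword patterns, one per feature group named in the abstract.
# A real system would need a far richer vocabulary and negation handling.
FEATURE_PATTERNS = {
    "composition": r"\b(spongiform|cystic|solid)\b",
    "echogenicity": r"\b(anechoic|hyperechoic|isoechoic|hypoechoic)\b",
    "shape": r"\b(taller-than-wide|wider-than-tall)\b",
    "margin": r"\b(smooth|lobulated|irregular)\b",
    "echogenic_foci": r"\b(punctate echogenic foci|macrocalcifications)\b",
}


def extract_features(report):
    """Return the first keyword found for each feature group (crude heuristic)."""
    text = report.lower()
    found = {}
    for name, pattern in FEATURE_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            found[name] = match.group(1)
    return found


def augment_report(report):
    """Append the extracted features to the report text as explicit tags.

    The augmented string is what a text classifier would receive in place
    of the raw report (an assumed way of 'incorporating' the features).
    """
    features = extract_features(report)
    if not features:
        return report
    tags = " ".join(
        f"{name}={value.replace(' ', '_')}" for name, value in features.items()
    )
    return f"{report} [FEATURES] {tags}"


if __name__ == "__main__":
    example = (
        "Solid hypoechoic nodule with irregular margin "
        "and punctate echogenic foci."
    )
    print(augment_report(example))
```

In this sketch the first system type corresponds to feeding the raw report to the classifier, and the second to feeding the output of `augment_report`; comparing the two on the same labels mirrors the comparison described above.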