Abstract

Background: Critical thyroid nodule features are contained in unstructured ultrasound (US) reports. The Thyroid Imaging, Reporting, and Data System (TI-RADS) uses five key features to risk-stratify nodules and recommend appropriate intervention. This study aims to analyze the quality of US reporting and the potential benefit of Natural Language Processing (NLP) systems in efficiently capturing TI-RADS features from text reports.

Materials and Method: This retrospective study used free-text thyroid US reports from an academic center (A) and a community hospital (B). Physicians created “gold standard” annotations by manually extracting TI-RADS features and clinical recommendations from reports to determine how often they were included. Similar annotations were created using an automated NLP system and compared with the gold standard.

Results: Two hundred eighty-two reports contained 409 nodules at least 1 cm in maximum diameter. The gold standard identified three nodules (0.7%) that contained enough information to calculate a complete TI-RADS score. Shape was described most often (92.7% of nodules), whereas margins were described least often (11%). A median of two TI-RADS features was reported per nodule. The NLP system was significantly less accurate than the gold standard in capturing echogenicity (27.5%) and margins (58.9%). One hundred eight nodule reports (26.4%) included clinical management recommendations, which were included more often at site A than at site B (33.9% versus 17%, P < 0.05).

Conclusions: These results suggest a gap between current US reporting styles and those needed to implement TI-RADS and achieve NLP accuracy. Synoptic reporting should prompt more complete thyroid US reporting, improved recommendations for intervention, and better NLP performance.

