Abstract

Background: Single-center trials have demonstrated an acceptable level of inter-observer agreement in the evaluation of the ultrasound (US) features of thyroid nodules, but limited data are available on the consistency of US assessment among different thyroid centers.

Aim of the study: To assess the inter-observer agreement between different thyroid centers and among different specialists in the evaluation of the main US features of thyroid nodules.

Materials and methods: A blinded retrospective analysis of 100 electronically recorded US images was conducted in three large-volume thyroid centers by seven qualified thyroid imaging experts (two radiologists and five endocrinologists). The following US features were evaluated: composition (solid, predominantly solid, predominantly cystic, and cystic); echogenicity (hyperechoic, isoechoic, mildly and deeply hypoechoic); margins (well-defined, ill-defined, microlobulated, and spiculated); calcifications (absent, microscopic, macroscopic, eggshell); hyperechoic foci of uncertain significance; comet-tail artifacts; and vascularity (no vascular signals, perinodular and/or slight intranodular flow, and marked intranodular flow). Thyroid nodules were also classified according to four major US classification systems: AACE/ACE/AME, EU-TIRADS, ATA and ACR. The inter-observer agreement was calculated by cross-tabulation and expressed as Cohen's kappa. Kappa values were interpreted, according to Landis and Koch, as follows: 0-0.20 poor, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial, and 0.81-1.0 almost perfect agreement. A sub-analysis assessed how often, and by how many operators, the US features of each nodule were rated as suspicious for malignancy.

Results: The inter-observer agreement, expressed as kappa, was 34.5%, 44.0%, 42.3% and 38.8% for the ATA, AACE/ACE/AME, ACR, and EU-TIRADS classification systems, respectively. The inter-observer agreement for the main thyroid nodule US findings was as follows: composition 53.2%; echogenicity 46.9%; margins 33.2%; comet-tail 10.6%; microcalcifications 46.8%; macrocalcifications 37.7%; eggshell calcifications 64.9%; intranodular vascularity 45.9%.

Conclusion: The level of agreement among different thyroid centers in the description of suspicious US features in thyroid nodules ranged from fair to moderate, with the lowest level of consistency for the characteristics of margins and the presence of comet-tails. A similar range of variability was also demonstrated for the four US classification systems. A universally accepted lexicon of thyroid US features and dedicated training in the definition of thyroid US findings are needed to improve the inter-observer agreement and the predictive value of US classification systems in real-world practice.
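As an illustration of the agreement statistic named in the methods, the following is a minimal Python sketch of Cohen's kappa for two raters, together with the interpretation bands quoted in the abstract. The function names `cohen_kappa` and `agreement_band` and the toy ratings are hypothetical and are not taken from the study. The abstract does not state how the pairwise values from the seven observers were combined, so the sketch covers a single pair of raters only.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (illustrative sketch)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Map a kappa value (0-1 scale) to the bands quoted in the abstract.

    Assumption: values below 0 are also reported as "poor", because the
    quoted scale starts at 0.
    """
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Toy example: two raters describing the composition of four nodules.
rater_a = ["solid", "solid", "cystic", "cystic"]
rater_b = ["solid", "cystic", "cystic", "cystic"]
kappa = cohen_kappa(rater_a, rater_b)
print(kappa, agreement_band(kappa))
```

In the toy example the raters agree on three of four nodules (observed agreement 0.75) while chance agreement is 0.5, so kappa is 0.5, which falls in the moderate band. A kappa reported as 34.5% corresponds to 0.345 on this scale.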
