Ultrasound grading of thyroid nodules using the BTA U-scoring guidelines - Is there evidence of intra- and interobserver variability?

Michael Couzins, Elizabeth E Rutherford, Indu Mitra, Stuart Forbes, Ganesh Vigneswaran

doi:10.1177/1742271x20971323

Open Access

Abstract

The British Thyroid Association (BTA) U-score ultrasound classification (graded U1-U5) is widely used to grade thyroid nodules based on benign and malignant sonographic features. It is well established that ultrasound is an operator-dependent imaging modality, so imaging-based scoring systems are susceptible to subjective variation between operators. We aimed to assess whether there is any intra- or interobserver variability when U-scoring thyroid nodules and whether previous thyroid ultrasound experience affects this variability. A total of 14 ultrasound operators were identified (five experienced thyroid operators, five with intermediate experience and four with no experience) and were asked to U-score images from 20 thyroid cases, each shown as a single projection with and without Doppler flow. The same 14 operators rescored the cases after six weeks. The first- and second-round U-scores for the three operator groups were then analysed using Fleiss' kappa to assess interobserver variability and Cochran's Q test to assess intraobserver variability. On combined assessment of all operators, we found no significant interobserver variability, with fair agreement in round 1 (Fleiss' kappa = 0.30, p < 0.0001) and slight agreement in round 2 (Fleiss' kappa = 0.19, p < 0.0001). Cochran's Q test revealed no significant intraobserver variability between round 1 and round 2 for any of the 14 operators (all p > 0.05). We found no statistically significant inter- or intraobserver variability in the U-scoring of thyroid nodules across all participants, which reinforces the validity of this scoring method in clinical practice and allays concerns regarding potential subjective biases in reporting.
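For readers unfamiliar with the interobserver statistic, the following is a minimal sketch of how Fleiss' kappa can be computed from a table of U-scores (one row per case, one column per operator). It is an illustration only, not the authors' analysis code: the function names `fleiss_kappa` and `landis_koch_label` and the example ratings are hypothetical, and the "slight"/"fair" labels are assumed here to follow the commonly used Landis and Koch bands, which the abstract does not name explicitly. The sketch does not reproduce the reported p-values.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a table of ratings.

    ratings: list of rows, one row per case; each row holds the category
    (e.g. U-score 1-5) assigned by each rater. Every case must be rated
    by the same number of raters (at least two).
    """
    if not ratings:
        raise ValueError("at least one case is required")
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every case needs the same number of raters")

    n_cases = len(ratings)
    category_totals = Counter()
    per_case_agreement = []

    for row in ratings:
        counts = Counter(row)  # how many raters chose each category
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this case.
        sum_sq = sum(c * c for c in counts.values())
        per_case_agreement.append(
            (sum_sq - n_raters) / (n_raters * (n_raters - 1))
        )

    # Mean observed agreement across cases.
    p_observed = sum(per_case_agreement) / n_cases
    # Agreement expected by chance from the overall category proportions.
    total = n_cases * n_raters
    p_expected = sum((c / total) ** 2 for c in category_totals.values())

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


def landis_koch_label(kappa):
    """Descriptive label, assuming the Landis and Koch bands."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Made-up U-scores for 4 cases rated by 3 operators (illustration only).
    example = [
        [2, 2, 3],
        [3, 3, 3],
        [4, 3, 5],
        [2, 2, 2],
    ]
    k = fleiss_kappa(example)
    print(f"Fleiss' kappa = {k:.2f} ({landis_koch_label(k)} agreement)")
```

Under the assumed bands, the reported values of 0.30 and 0.19 map to "fair" and "slight" respectively, matching the wording of the abstract.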
