Abstract

The quality and bias of annotations made by annotators (e.g., radiologists) affect how the performance of machine learning-based computer-aided detection (CAD) software changes. We hypothesized that differences in the years of image-interpretation experience among radiologists contribute to annotation variability. In this study, we focused on how the performance of CAD software changes when it is retrained with cases annotated by radiologists of varying experience. We used two types of CAD software: one for lung nodule detection in chest computed tomography images and one for cerebral aneurysm detection in magnetic resonance angiography images. Twelve radiologists with different years of experience independently annotated the lesions, and we investigated the performance changes by retraining the CAD software twice, each time adding the cases annotated by each radiologist. Additionally, we investigated the effects of retraining with integrated annotations from multiple radiologists. The performance of the CAD software after retraining differed among the annotating radiologists, and in some cases it was degraded compared with that of the initial software. Retraining with integrated annotations showed different performance trends depending on the target CAD software; notably, for cerebral aneurysm detection, performance decreased compared with using annotations from a single radiologist. Although the performance of the CAD software after retraining varied among the annotating radiologists, no direct correlation with their years of experience was found. When integrated annotations from multiple radiologists were used, the performance trends differed according to the type of CAD software.
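To illustrate the study design described above, the following Python sketch outlines a retraining-and-evaluation loop: two retraining rounds per radiologist, plus a comparison run using annotations integrated across all radiologists. The function and variable names (retrain, evaluate, merge_annotations, the annotation sets) are hypothetical placeholders introduced for illustration; they are not the actual CAD vendors' APIs or the authors' implementation.

# Hypothetical sketch of the evaluation protocol summarized in the abstract.
# retrain(), evaluate(), and merge_annotations() are assumed helper functions,
# standing in for whatever training and scoring interface the CAD software exposes.

def run_experiment(initial_model, annotation_sets, test_set, rounds=2):
    """Retrain the CAD model `rounds` times per annotator and record performance."""
    results = {}
    for radiologist_id, annotations in annotation_sets.items():
        model = initial_model
        scores = [evaluate(model, test_set)]           # baseline (initial software)
        for _ in range(rounds):                        # two retraining iterations
            model = retrain(model, annotations)        # add this radiologist's annotated cases
            scores.append(evaluate(model, test_set))
        results[radiologist_id] = scores

    # Retraining with annotations integrated from multiple radiologists
    merged = merge_annotations(list(annotation_sets.values()))
    model = initial_model
    integrated_scores = [evaluate(model, test_set)]
    for _ in range(rounds):
        model = retrain(model, merged)
        integrated_scores.append(evaluate(model, test_set))
    results["integrated"] = integrated_scores
    return results

Under these assumptions, comparing results[radiologist_id] across annotators and against results["integrated"] corresponds to the per-radiologist and integrated-annotation comparisons reported in the study.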
