Cross-site Validation of AI Segmentation and Harmonization in Breast MRI.

Yu Huang, Nicholas J Leotta, Lukas Hirsch, Roberto Lo Gullo, Mary Hughes, Jeffrey Reiner, Nicole B Saphier, Kelly S Myers, Babita Panigrahi, Emily Ambinder, Philip Di Carlo, Lars J Grimm, Dorothy Lowell, Sora Yoon, Sujata V Ghate, Lucas C Parra, Elizabeth J Sutton

doi:10.1007/s10278-024-01266-9

Abstract

This work aims to perform a cross-site validation of automated segmentation for breast cancers in MRI and to compare the performance to radiologists. A three-dimensional (3D) U-Net was trained to segment cancers in dynamic contrast-enhanced axial MRIs using a large dataset from Site 1 (n = 15,266; 449 malignant and 14,817 benign). Performance was validated on site-specific test data from this and two additional sites, and common publicly available testing data. Four radiologists from each of the three clinical sites provided two-dimensional (2D) segmentations as ground truth. Segmentation performance did not differ between the network and radiologists on the test data from Sites 1 and 2 or the common public data (median Dice score Site 1, network 0.86 vs. radiologist 0.85, n = 114; Site 2, 0.91 vs. 0.91, n = 50; common: 0.93 vs. 0.90). For Site 3, an affine input layer was fine-tuned using segmentation labels, resulting in comparable performance between the network and radiologist (0.88 vs. 0.89, n = 42). Radiologist performance differed on the common test data, and the network numerically outperformed 11 of the 12 radiologists (median Dice: 0.85-0.94, n = 20). In conclusion, a deep network with a novel supervised harmonization technique matches radiologists' performance in MRI tumor segmentation across clinical sites. We make code and weights publicly available to promote reproducible AI in radiology.
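The Dice score reported above measures the overlap between a predicted mask and a reference mask: twice the number of shared voxels divided by the total number of voxels in the two masks. The following is a minimal sketch of that metric and of the per-site median summary, not the authors' released code. The function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice overlap between two binary masks of the same shape.

    Assumption: two empty masks count as perfect agreement (1.0).
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0
    overlap = np.logical_and(pred, truth).sum()
    return float(2.0 * overlap / denom)


# Toy 2D example (hypothetical masks, not study data).
truth = np.zeros((8, 8), dtype=bool)
truth[2:6, 2:6] = True   # 16 pixels
pred = np.zeros((8, 8), dtype=bool)
pred[3:7, 2:6] = True    # 16 pixels, 12 of them shared with truth

score = dice_score(pred, truth)  # 2 * 12 / (16 + 16) = 0.75

# Per-case scores are then summarized by their median, as in the abstract.
case_scores = [score, dice_score(truth, truth), dice_score(pred, pred)]
median_dice = float(np.median(case_scores))
```

Because the radiologists' reference segmentations are 2D, a comparison of this kind would be evaluated on the annotated slice rather than on the full 3D volume; how that slice is selected is not described in the abstract.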
