Abstract
Hepatocellular carcinoma (HCC) is often diagnosed using gadoxetate disodium-enhanced magnetic resonance imaging (EOB-MRI). Standardized reporting according to the Liver Imaging Reporting and Data System (LI-RADS) can improve Gd-MRI interpretation but is rather complex and time-consuming. These limitations could potentially be alleviated using recent deep learning-based segmentation and classification methods such as nnU-Net. This study aimed to create and evaluate an automatic segmentation model for HCC risk assessment according to LI-RADS v2018 using nnU-Net. This single-center retrospective study included 602 patients at risk for HCC who underwent dynamic EOB-MRI examinations between 05/2005 and 09/2022 that contained at least one ≥ LR-3 lesion. Manual lesion segmentations, labeled as LR-3, LR-4, LR-5, or LR-M in semantic segmentation masks, served as ground truth. A set of U-Net models with 14 input channels was trained using the nnU-Net framework for automatic segmentation. Lesion detection, LI-RADS classification, and instance segmentation metrics were calculated by post-processing the semantic segmentation outputs of the final model ensemble. For the external evaluation, a modified version of the LiverHccSeg dataset was used. The final training/internal test/external test cohorts included 383/219/16 patients. In these three cohorts (reported in the same order), LI-RADS lesions (≥ LR-3 and LR-M) ≥ 10 mm were detected with sensitivities of 0.41-0.85/0.40-0.90/0.83 (LR-5: 0.85/0.90/0.83) and positive predictive values of 0.70-0.94/0.67-0.88/0.90 (LR-5: 0.94/0.88/0.90). F1 scores for LI-RADS classification of detected lesions were 0.48-0.69/0.47-0.74/0.84 (LR-5: 0.69/0.74/0.84). Median per-lesion Sørensen-Dice coefficients were 0.61-0.74/0.52-0.77/0.84 (LR-5: 0.74/0.77/0.84). Deep learning-based HCC risk assessment according to LI-RADS can be implemented as automatically generated tumor risk maps using out-of-the-box image segmentation tools, with high detection performance for LR-5 lesions. Before translation into clinical practice, further improvements in automatic LI-RADS classification, for example through large multi-center studies, would be desirable.
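For readers less familiar with the reported metrics, the following is a minimal sketch of how sensitivity, positive predictive value, F1 score, and the Sørensen-Dice coefficient are conventionally defined. It is not the study's evaluation code: the function names, the toy voxel sets, and the example counts are hypothetical, and the criteria used in the study to match predicted lesions to reference lesions are not stated in the abstract.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]  # (z, y, x) index of one voxel; illustrative type only


def dice(pred: Set[Voxel], truth: Set[Voxel]) -> float:
    """Sørensen-Dice coefficient: 2 * |A ∩ B| / (|A| + |B|)."""
    if not pred and not truth:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def detection_metrics(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Sensitivity, positive predictive value (PPV), and F1 from lesion counts.

    tp: reference lesions that were detected
    fp: predicted lesions without a matching reference lesion
    fn: reference lesions that were missed
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of sensitivity and PPV.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"sensitivity": sensitivity, "ppv": ppv, "f1": f1}


# Hypothetical toy lesion: 4 reference voxels, 3 predicted voxels, 2 shared.
truth_voxels = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
pred_voxels = {(0, 0, 1), (0, 1, 1), (0, 1, 2)}
lesion_dice = dice(pred_voxels, truth_voxels)  # 2 * 2 / (3 + 4) = 4/7

# Hypothetical counts: 9 detected, 1 false positive, 2 missed lesions.
metrics = detection_metrics(tp=9, fp=1, fn=2)

print(f"Dice: {lesion_dice:.3f}")
print({name: round(value, 3) for name, value in metrics.items()})
```

In this sketch the per-lesion Dice coefficient is computed on the voxels of a single lesion, whereas the detection metrics are computed on lesion counts; the abstract reports both kinds of metric separately for each LI-RADS category.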