Purpose
To conduct a head-to-head, cross-institutional comparison of deep learning (DL) and radiomics models for predicting microvascular invasion (MVI) in hepatocellular carcinoma (HCC), and to investigate model robustness and generalizability through rigorous internal and external validation.

Methods
This retrospective study included 2304 preoperative images of 576 HCC lesions from two centers, with MVI status determined by postoperative histopathology. We developed DL and radiomics models for predicting the presence of MVI using B-mode ultrasound, contrast-enhanced ultrasound (CEUS) at the arterial, portal, and delayed phases, and a combined modality (B + CEUS). For radiomics, we constructed models with enlarged vs. original regions of interest (ROIs). Cross-validation was performed by training models on one center's dataset and validating them on the other's, and vice versa. This allowed assessment of both the validity of the different ultrasound modalities and the cross-center robustness of the models. The optimal model combined with alpha-fetoprotein (AFP) was also validated. The head-to-head comparison was based on the area under the receiver operating characteristic curve (AUC).

Results
Thirteen DL models and 25 radiomics models using different ultrasound modalities were constructed and compared. B + CEUS was the optimal modality for both DL and radiomics models. The DL model achieved AUCs of 0.802–0.818 internally and 0.667–0.688 externally across the two centers, whereas the radiomics model achieved AUCs of 0.749–0.869 internally and 0.646–0.697 externally. The radiomics models showed overall improvement with enlarged ROIs (P < 0.05 for both CEUS and B + CEUS modalities). The DL models showed good cross-institutional robustness (P > 0.05 for all modalities, 1.6–2.1% differences in AUC for the optimal modality), whereas the radiomics models had relatively limited robustness across the two centers (12% drop-off in AUC for the optimal modality). Adding AFP improved the DL models (P < 0.05 externally) while preserving their robustness, but did not benefit the radiomics model (P > 0.05).

Conclusion
Cross-institutional validation indicated that DL was more robust than radiomics for preoperative MVI prediction in patients with HCC, making it a promising solution to non-standardized ultrasound examination procedures.
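
The cross-center design described in the Methods (train on one center, validate on the other, then swap) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the names `auc`, `cross_center_auc`, and `fit` are hypothetical, the AUC is computed with the standard pairwise (Mann-Whitney) definition, and the "internal" figure here is scored on the training center itself, which is optimistic. The abstract does not state how the internal validation split was made.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Dataset = Tuple[List[List[float]], List[int]]  # (features, MVI labels 0/1)
Scorer = Callable[[List[List[float]]], List[float]]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def cross_center_auc(
    fit: Callable[[List[List[float]], List[int]], Scorer],
    center_a: Dataset,
    center_b: Dataset,
) -> Dict[str, Dict[str, float]]:
    """Train on one center, score on both, then swap the roles.

    `fit` is a hypothetical training function that returns a scorer.
    """
    results: Dict[str, Dict[str, float]] = {}
    directions = {"A_to_B": (center_a, center_b), "B_to_A": (center_b, center_a)}
    for name, (train, test) in directions.items():
        x_train, y_train = train
        x_test, y_test = test
        predict = fit(x_train, y_train)
        results[name] = {
            # Assumption: resubstitution score stands in for internal validation.
            "internal": auc(y_train, predict(x_train)),
            "external": auc(y_test, predict(x_test)),
        }
    return results
```

Comparing the `internal` and `external` values in each direction gives the kind of cross-center drop-off in AUC that the Results report. In practice `fit` would wrap a DL or radiomics model, and the internal score would come from a held-out split rather than the training data.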