Background & Aims: Hepatocellular carcinoma (HCC) is characterized by a high mortality rate. The Liver Imaging Reporting and Data System (LI-RADS) yields a considerable proportion of indeterminate observations, making accurate diagnosis difficult.

Methods: We developed four deep learning models for diagnosing HCC on computed tomography (CT) using a training-validation-testing approach. Thin-slice triphasic CT liver images and relevant clinical information were collected and processed for deep learning. HCC was diagnosed and verified against a 12-month clinical composite reference standard. CT observations among at-risk patients were annotated using LI-RADS. Diagnostic performance was assessed by internal validation and independent external testing. We also conducted sensitivity analyses of different subgroups, an evaluation of deep learning explainability, and a misclassification analysis.

Results: From 2,832 patients and 4,305 CT observations, the best-performing model was the Spatio-Temporal 3D Convolution Network (ST3DCN). During internal validation it achieved areas under the curve (AUCs) of 0.919 (95% CI 0.903-0.935) at the observation level (n=1,077) and 0.901 (95% CI 0.879-0.924) at the patient level (n=685), compared with 0.839 (95% CI 0.814-0.864) and 0.822 (95% CI 0.790-0.853) respectively for standard-of-care radiological interpretation. ST3DCN’s corresponding negative predictive values were 0.966 (95% CI 0.954-0.979) and 0.951 (95% CI 0.931-0.971). ST3DCN’s observation-level AUCs among at-risk patients, for 2-5 cm observations, and for analysis of the porto-venous phase alone were 0.899 (95% CI 0.874-0.924), 0.872 (95% CI 0.838-0.909) and 0.912 (95% CI 0.895-0.929) respectively. In external testing (551 patients, 717 observations), ST3DCN’s AUC was 0.901 (95% CI 0.877-0.924), non-inferior to radiological interpretation (AUC 0.900, 95% CI 0.877-0.923).

Conclusions: ST3DCN achieved strong, robust performance for accurate HCC diagnosis on CT. Deep learning can expedite and improve the diagnostic process of HCC.

Impact and implications: The clinical applicability of deep learning in HCC diagnosis is potentially substantial, especially given the expected increase in the incidence and mortality of HCC in Eastern Asia and worldwide. Earlier diagnosis through deep learning can lead to earlier definitive management, particularly for at-risk patients. The model can be broadly deployed for patients undergoing a triphasic contrast CT scan of the liver, with the aim of reducing the currently high mortality rate of HCC.
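
For readers less familiar with the two headline metrics in this abstract, the sketch below shows how an AUC and a negative predictive value (NPV) can be computed from model scores and reference labels. It is an illustration only, not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example, and the abstract does not state how the reported confidence intervals were obtained, so none are computed here.

```python
from typing import Sequence


def auc_mann_whitney(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def negative_predictive_value(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> float:
    """NPV = TN / (TN + FN), where a case is called negative
    when its score falls below the threshold."""
    tn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    if tn + fn == 0:
        raise ValueError("no negative predictions at this threshold")
    return tn / (tn + fn)


# Toy data invented for illustration (1 = HCC, 0 = not HCC); not study data.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

toy_auc = auc_mann_whitney(toy_labels, toy_scores)
toy_npv = negative_predictive_value(toy_labels, toy_scores, threshold=0.5)
print(f"AUC = {toy_auc:.4f}, NPV = {toy_npv:.2f}")
```

The AUC does not depend on any threshold, whereas the NPV does, and the NPV also shifts with the prevalence of HCC in the population being tested. This is worth keeping in mind when comparing NPVs across cohorts.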