Purpose

Liver Imaging Reporting and Data System (LI-RADS) classification, especially the identification of LR-3 to LR-5 lesions, which indicate the probability of hepatocellular carcinoma (HCC), is important for determining treatment strategy. We aimed to develop a semi-automatic LI-RADS grading system for multiphase gadoxetic acid-enhanced MRI using deep convolutional neural networks (CNNs).

Patients and Methods

An internal data set of 439 patients and an external data set of 71 patients with suspected HCC who underwent gadoxetic acid-enhanced MRI were included. The expert-guided LI-RADS grading system consisted of four deep 3D CNN models: a tumor segmentation model for automatic diameter estimation and three classification models for the LI-RADS major features, namely arterial phase hyper-enhancement (APHE), washout and enhancing capsule. An end-to-end learning system comprising a single deep CNN model that directly classified the LI-RADS grade was developed for comparison.

Results

On the internal testing set, the segmentation model reached a mean Dice coefficient of 0.84, and the accuracy of the mapped diameter intervals was 82.7% (95% CI: 74.4%, 91.7%). The areas under the curve (AUCs) were 0.941 (95% CI: 0.914, 0.961), 0.859 (95% CI: 0.823, 0.890) and 0.712 (95% CI: 0.668, 0.754) for APHE, washout and capsule, respectively. The expert-guided system significantly outperformed the end-to-end system, with a LI-RADS grading accuracy of 68.3% (95% CI: 60.8%, 76.5%) vs 55.6% (95% CI: 48.8%, 63.0%) (P < 0.0001). On the external testing set, the accuracy of the mapped diameter intervals was 91.5% (95% CI: 81.9%, 100.0%). The AUCs were 0.792 (95% CI: 0.745, 0.833), 0.654 (95% CI: 0.602, 0.703) and 0.658 (95% CI: 0.606, 0.707) for APHE, washout and capsule, respectively.
The expert-guided system achieved an overall grading accuracy of 66.2% (95% CI: 58.0%, 75.2%), significantly higher than the 50.1% (95% CI: 43.1%, 58.1%) achieved by the end-to-end system (P < 0.0001).

Conclusion

We developed a semi-automatic, step-by-step, expert-guided LI-RADS grading system for LR-3 to LR-5 lesions that outperformed the conventional end-to-end learning system. This deep learning-based system may improve workflow efficiency for HCC diagnosis in clinical practice.
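
To illustrate the step-by-step idea, the following is a minimal sketch of how an estimated diameter and three binary major-feature predictions could be combined into an LR-3 to LR-5 grade. It is not the authors' implementation. It assumes the CT/MRI LI-RADS v2018 diagnostic table, assumes the size bins are <10 mm, 10-19 mm and ≥20 mm, treats APHE as nonrim, and ignores threshold growth, which the abstract does not mention. The names `MajorFeatures`, `diameter_interval` and `lirads_grade` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MajorFeatures:
    """Binary outputs of the three feature classifiers (illustrative names)."""
    aphe: bool      # nonrim arterial phase hyper-enhancement
    washout: bool   # nonperipheral washout
    capsule: bool   # enhancing capsule


def diameter_interval(diameter_mm: float) -> str:
    """Map an estimated diameter to an assumed LI-RADS v2018 size bin."""
    if diameter_mm < 10:
        return "<10"
    if diameter_mm < 20:
        return "10-19"
    return ">=20"


def lirads_grade(diameter_mm: float, f: MajorFeatures) -> str:
    """Return LR-3, LR-4 or LR-5 under the assumptions stated above."""
    interval = diameter_interval(diameter_mm)
    # Additional major features counted here: washout and capsule only.
    n_additional = int(f.washout) + int(f.capsule)

    if not f.aphe:
        if interval == ">=20":
            return "LR-4" if n_additional >= 1 else "LR-3"
        # Below 20 mm, two or more additional features are needed for LR-4.
        return "LR-4" if n_additional >= 2 else "LR-3"

    # Nonrim APHE present.
    if interval == "<10":
        return "LR-4" if n_additional >= 1 else "LR-3"
    if interval == "10-19":
        if f.washout:
            return "LR-5"   # washout alone, or washout plus capsule
        if f.capsule:
            return "LR-4"   # capsule as the only additional feature
        return "LR-3"
    # 20 mm or larger with APHE.
    return "LR-5" if n_additional >= 1 else "LR-4"
```

For example, `lirads_grade(15, MajorFeatures(aphe=True, washout=False, capsule=True))` falls in the 10-19 mm bin with capsule as the only additional feature. Because the grade is a deterministic function of the four intermediate predictions, an error in any single model can be traced to a specific step, which is one plausible reading of why a step-by-step design is easier to inspect than an end-to-end classifier.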