Background/Introduction: To evaluate the performance of pre-trained deep learning schemes (DLS) in grading hepatic steatosis (HS) in patients with Non-Alcoholic Fatty Liver Disease (NAFLD), using as input B-mode ultrasound (US) images containing the right kidney (RK) cortex and liver parenchyma (LP) areas indicated by an expert radiologist.

Methods: A total of 112 consecutively enrolled, biopsy-validated NAFLD patients underwent a regular abdominal B-mode US examination. For each patient, a radiologist obtained a B-mode US image containing the RK cortex and LP and marked a point between the RK and LP, around which a window was automatically cropped. The cropped image dataset was augmented using up-sampling, and the augmented and non-augmented datasets were sorted by HS grade. Each dataset was split into training (70 %) and testing (30 %) sets and fed separately as input to the pre-trained InceptionV3, MobileNetV2, ResNet50, DenseNet201, and NASNetMobile DLS. For comparison with the DLS, a receiver operating characteristic (ROC) analysis was performed on hepatorenal index (HRI) measurements made by the radiologist on the same cropped images.

Results: On the test data with augmentation, the DLS reached 89.15 %–93.75 % accuracy for S0-S1 vs. S2-S3 and 79.69 %–91.21 % accuracy for S0 vs. S1 vs. S2 vs. S3. Without augmentation, they reached 80.45 %–82.73 % accuracy for S0-S1 vs. S2-S3 and 59.54 %–63.64 % accuracy for S0 vs. S1 vs. S2 vs. S3. The performance of the radiologist's HRI measurements after ROC analysis was 82 %, 91.56 %, and 96.19 % for thresholds of S ≥ S1, S ≥ S2, and S = S3, respectively.

Conclusion: All networks achieved high performance in HS assessment. DenseNet201 with augmented data seems to be the most efficient supplementary tool for NAFLD diagnosis and grading.
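The 70 %/30 % split described in the Methods can be illustrated with a minimal sketch. The abstract does not say whether the split was stratified by HS grade, which random seed was used, or how many images each grade contained, so the per-grade stratification, the `stratified_split` helper, the seed, and the toy labels below are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test sets, grade by grade.

    Hypothetical helper: stratifying by grade is an assumption,
    not something the abstract states.
    """
    rng = random.Random(seed)

    # Group sample indices by their HS grade (S0..S3).
    by_grade = defaultdict(list)
    for index, grade in enumerate(labels):
        by_grade[grade].append(index)

    train, test = [], []
    for grade in sorted(by_grade):
        indices = by_grade[grade]
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])   # first ~70 % of this grade
        test.extend(indices[cut:])    # remaining ~30 % of this grade
    return sorted(train), sorted(test)


# Toy labels (made up): 10 images per grade.
labels = ["S0"] * 10 + ["S1"] * 10 + ["S2"] * 10 + ["S3"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each grade keeps the grade proportions the same in the training and testing sets, which matters when some grades have few images.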