Various factors underlie the onset of non-B, non-C hepatitis virus-related hepatocellular carcinoma (NBNC-HCC). We aimed to investigate the independent risk factors and profiles associated with NBNC-HCC using a data-mining technique. We conducted a case-control study and enrolled 223 NBNC-HCC patients and 669 controls from a health checkup database (n = 176 886). Multivariate analysis, random forest analysis and a decision-tree algorithm were employed, respectively, to examine the independent risk factors, to identify the factors distinguishing the case and control groups, and to identify profiles associated with the incidence of NBNC-HCC. In multivariate analysis, besides γ-glutamyltransferase (GGT) levels and the Brinkman index, albumin level was an independent negative risk factor for the incidence of NBNC-HCC (odds ratio = 0.67; 95% confidence interval = 0.60-0.70; P < 0.0001). In random forest analysis, serum albumin level was the highest-ranked variable for distinguishing between the case and control groups (variable importance, 98). A decision tree was built from albumin and GGT levels, the aspartate aminotransferase-to-platelet ratio index (APRI) and the Brinkman index. The serum albumin level was selected as the initial split variable, and 82.5% of the subjects with albumin levels of less than 4.01 g/dL were found to have NBNC-HCC. Data-mining analysis revealed that serum albumin level is an independent risk factor and the factor that best distinguishes cases from controls with respect to the incidence of NBNC-HCC. Furthermore, we created an NBNC-HCC profile consisting of albumin and GGT levels, the APRI and the Brinkman index. This profile could be used in the screening strategy for NBNC-HCC.
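
Two quantities in the profile above can be computed directly from routine laboratory values, and the following minimal Python sketch illustrates them. It uses the commonly cited APRI definition, (AST / upper limit of normal) × 100 / platelet count in 10^9/L, and the 4.01 g/dL albumin threshold reported for the initial split. The default upper limit of normal (40 IU/L) and the function names are assumptions made for illustration; the abstract does not state the reference value used in the study, nor the thresholds of the later splits on GGT, APRI and the Brinkman index, so those are not modeled here.

```python
# Threshold of the initial decision-tree split, as reported in the abstract.
ALBUMIN_SPLIT_G_DL = 4.01


def apri(ast_iu_l: float, platelets_10e9_l: float, ast_uln_iu_l: float = 40.0) -> float:
    """AST-to-platelet ratio index: (AST / ULN) * 100 / platelets (10^9/L).

    The default ULN of 40 IU/L is an assumed reference value, not one
    taken from the study.
    """
    if ast_uln_iu_l <= 0 or platelets_10e9_l <= 0:
        raise ValueError("ULN and platelet count must be positive")
    return (ast_iu_l / ast_uln_iu_l) * 100.0 / platelets_10e9_l


def below_albumin_split(albumin_g_dl: float) -> bool:
    """True if serum albumin falls below the reported first-split threshold."""
    return albumin_g_dl < ALBUMIN_SPLIT_G_DL


if __name__ == "__main__":
    # Hypothetical subject: AST 80 IU/L, platelets 100 x 10^9/L, albumin 3.9 g/dL.
    print(apri(80, 100))             # APRI under the assumed ULN
    print(below_albumin_split(3.9))  # which side of the first split
```

The sketch only reproduces the first branch of the tree. Subjects falling below the threshold belong to the subgroup in which 82.5% were found to have NBNC-HCC in this case-control sample, a proportion that reflects the 1:3 case-to-control enrollment and not population prevalence.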