Post-sustained virologic response (SVR) screening based on clinical guidelines does not account for individual risk of hepatocellular carcinoma (HCC). Our aim was to enable tailored screening by using machine learning to predict HCC incidence after SVR. Using clinical data from 1,028 SVR patients, we developed an HCC prediction model based on a random survival forest (RSF). Model performance was assessed using Harrell's c-index and validated in an independent cohort of 737 SVR patients. Shapley additive explanations (SHAP) were used to quantify feature contributions, and optimal cutoffs were determined using maximally selected rank statistics. We used Kaplan-Meier analysis to compare cumulative HCC incidence between risk groups. The model, which used platelet count, gamma-glutamyl transpeptidase, sex, age, and alanine aminotransferase (ALT), achieved c-indices (95% CI) of 0.90 (0.85 to 0.94) in the derivation cohort and 0.80 (0.74 to 0.85) in the validation cohort. Stratification yielded four risk groups: low, intermediate, high, and very high. The 5-year cumulative HCC incidence rates (95% CI) for these groups were 0% (0 to 0), 3.8% (0.6 to 6.8), 26.2% (17.2 to 34.3), and 54.2% (20.2 to 73.7), respectively, in the derivation cohort and 0.7% (0 to 1.6), 7.1% (2.7 to 11.3), 5.2% (0 to 10.8), and 28.6% (0 to 55.3), respectively, in the validation cohort. The integration of RSF and SHAP enabled accurate HCC risk classification after SVR, which may facilitate individualized HCC screening strategies and more cost-effective care.
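For readers unfamiliar with the performance metric reported above, the following is a minimal sketch of how Harrell's c-index can be computed from follow-up times, event indicators, and predicted risk scores. It is not the study's code: the function name `harrell_c_index` and its argument names are hypothetical, the example data are toy values rather than patient data, and pairs with tied follow-up times are simply skipped, which is a simplification relative to common library implementations.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Illustrative Harrell's c-index (hypothetical helper, not the authors' code).

    times       : follow-up time per patient
    events      : 1 if HCC was observed, 0 if the patient was censored
    risk_scores : model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # a is the patient with the shorter follow-up, b the longer one
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # a was censored first, so the ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data for illustration only (not from the study).
    toy_times = [2, 4, 6, 8]
    toy_events = [1, 1, 0, 1]
    toy_risk = [0.9, 0.4, 0.7, 0.1]
    print(harrell_c_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking of patients by time to event, which is the scale on which the reported 0.90 and 0.80 should be read.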