e13594 Background: Patients with hepatocellular carcinoma (HCC) typically require routine surveillance and repeated treatment. Our previous work (1) developed a machine learning methodology called "Survival Path Mapping", which converts time-series data into a cascading survival map to facilitate dynamic prognosis prediction and treatment planning. However, in that work, the exponential growth in the number of paths risked exhausting a limited cohort and over-fitting. A novel algorithm that improves data utilization in Survival Path Mapping was urgently needed. Methods: A node fusion algorithm was designed. First, a feature selection algorithm bifurcated each parent node into temporary child nodes, which together constituted the temporary child-node set. Then, clustering was performed within the temporary child-node set according to the key features of each child node. Finally, within each cluster, multiple child nodes were merged into one if their survival did not differ significantly. For nodes with bifurcation, a treatment recommendation algorithm was developed. A total of 19,446 patients with newly diagnosed primary HCC at Sun Yat-sen University Cancer Center (SYSUCC) between 2008 and 2023 were retrospectively included as the internal cohort (divided into internal training and holdout validation cohorts); in addition, 1,135 patients from five hospitals were included as the multicenter external validation cohort. A dedicated natural language processing program was developed to extract structured features from medical imaging reports. Patients' time-series data were converted into time slices at 3-month intervals. Results: The original survival path mapping (raw-SP) consisted of 463 distinct nodes, whereas the node fusion survival path mapping (fusion-SP) consisted of 84 nodes (81.8% decrease from raw-SP).
The 3-year c-index of fusion-SP in the internal holdout validation was 0.745 (95% CI 0.717-0.772), significantly better than that of raw-SP (0.725, 95% CI 0.696-0.755; p < 0.001). Both raw-SP and fusion-SP significantly (p < 0.001) outperformed conventional Barcelona Clinic Liver Cancer (BCLC) and China Liver Cancer (CNLC) staging. In the multicenter external validation, fusion-SP also performed better than raw-SP and the BCLC staging system. For patients receiving second-line or subsequent treatment, the therapy recommended by the fusion-SP system had superior or non-inferior efficacy compared with that recommended by the BCLC guideline. Conclusions: The proposed node fusion algorithm addressed the exponential growth of paths in the previous survival path algorithm, improving the accuracy and external validity of dynamic prognosis prediction for HCC patients. The fusion-SP model also retained its transparency and could be used to guide dynamic treatment planning; its effectiveness in treatment guidance remains to be evaluated in future prospective trials. 1. Nat Commun 9, 2230; 2018.
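The merge step of the node fusion algorithm described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation; it rests on several assumptions that the abstract does not specify. "Clustering" is reduced to grouping nodes with an identical, hypothetical `key` label; the survival comparison is assumed to be a two-group log-rank test; the threshold `alpha=0.05` and the greedy merge order are illustrative choices; and the names `logrank_p`, `fuse_nodes`, and `example_nodes` are made up for this example.

```python
import math

def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test p-value (chi-square, 1 degree of freedom)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp, var = 0.0, 0.0
    for t in event_times:
        # Patients still at risk in each group just before time t.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / var
    return math.erfc(math.sqrt(chi2 / 2))  # chi-square (1 df) tail probability

def fuse_nodes(nodes, alpha=0.05):
    """Greedily merge temporary child nodes that share a cluster key
    (assumption: exact key match stands in for clustering) and whose
    survival does not differ significantly (log-rank p >= alpha)."""
    fused = []
    for node in nodes:
        for target in fused:
            same_cluster = target["key"] == node["key"]
            if same_cluster and logrank_p(target["times"], target["events"],
                                          node["times"], node["events"]) >= alpha:
                target["members"].append(node["name"])
                target["times"] += node["times"]
                target["events"] += node["events"]
                break
        else:
            # No compatible node found: keep this one as a separate node.
            fused.append({"key": node["key"], "members": [node["name"]],
                          "times": list(node["times"]),
                          "events": list(node["events"])})
    return fused

# Hypothetical toy data: survival times in months, event 1 = death, 0 = censored.
example_nodes = [
    {"name": "A", "key": "k1", "times": [10, 12, 14, 16, 18, 20], "events": [1] * 6},
    {"name": "B", "key": "k1", "times": [10, 12, 14, 16, 18, 20], "events": [1] * 6},
    {"name": "C", "key": "k1", "times": [1, 2, 2, 3, 3, 4], "events": [1] * 6},
    {"name": "D", "key": "k2", "times": [10, 12, 14, 16, 18, 20], "events": [1] * 6},
]
fused = fuse_nodes(example_nodes)
print([n["members"] for n in fused])
```

In this toy data, nodes A and B share a key and have identical survival, so they are candidates for merging, while C (same key, much shorter survival) and D (different key) are not. A real pipeline would more likely use an established survival analysis library and a proper clustering method over the key features.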