Abstract
Background. In hormone receptor (HoR)-positive/HER2-negative early breast cancer (BC), multiple efforts have been made to predict disease recurrence and survival. Machine learning (ML) techniques have been used, but few studies have examined their applicability in predicting survival on the basis of clinico-pathological characteristics. The aim of this study was to evaluate a random survival forest (RSF) model for predicting the prognosis of this specific BC subtype.
Methods. In this multicenter, retrospective study, patients were included if they fulfilled the following criteria: (1) pathologically confirmed HoR-positive/HER2-negative invasive BC; (2) early or locally advanced disease (stage I-III); (3) neoadjuvant anthracycline- and/or taxane-based chemotherapy, given concurrently or sequentially; (4) surgery for primary BC. Survival endpoints were disease-free survival (DFS) and overall survival (OS). An RSF algorithm was used to develop the predictive model. Ten-fold cross-validation was performed. The C-index and the continuous ranked probability score (CRPS) were used to evaluate the discrimination of the predictive model, and a ROC curve was used to evaluate model precision. A cut-point analysis based on maximally selected rank statistics was conducted to identify the cut-off in out-of-bag (OOB) mortality that best separated patients with respect to DFS and OS. Variable importance was assessed using Breiman-Cutler permutation importance.
Results. Overall, 572 patients with HoR-positive/HER2-negative early BC were included. At univariate analysis, age, T stage, N stage, grade, ER, PgR and Ki67 were significantly associated with DFS; ER, PgR, HER2, pCR, ypT and ypN retained statistical significance at multivariate analysis. ER, pCR, and pathological T and N stage were significantly associated with OS at both univariate and multivariate analysis.
The following variables were included in the final model: menopausal status, age, histology, grade, clinical T/N stage, ER/PgR, Ki67, HER2, pathological complete response (pCR), ypT and ypN. For DFS, the cross-validated C-index was 0.68 (95% CI 0.63-0.73) and the OOB CRPS was 0.15, with an OOB performance error of 0.31. The AUC at 60 months was 0.91 (95% CI 0.88-0.94). Out-of-bag RSF-based risk scores were calculated for individual patients and an optimal cut-off of 22.58 was identified. The HR between the high-risk and low-risk groups was 3.08 (95% CI 2.16-4.39, p<0.001). For OS, the cross-validated C-index was 0.66 (95% CI 0.60-0.71) and the OOB CRPS was 0.17, with an OOB performance error of 0.35. The AUC at 60 months was 0.91 (95% CI 0.88-0.95). An optimal cut-off of 21.54 was identified and the HR between the low-risk and high-risk groups was 2.50 (95% CI 1.73-3.61, p<0.001). For DFS, the most important variables were cN (9.00), ypN (7.35), ER (7.21), ypT (5.38), PgR (4.11), pCR (4.01) and age (2.82), while for OS the most important variables were cN (7.88), ER (5.37), PgR (3.61), age (3.37), pCR (2.80), ypT (2.64) and ypN (2.60).
Discussion. We analyzed the performance of RSF in predicting DFS and OS based on clinico-pathological features commonly available at baseline, before neoadjuvant chemotherapy (NCT), and after surgery. By selecting patients with HoR-positive/HER2-negative disease, we were able to show which clinico-pathological features have greater predictive importance within an ML model. The model could be integrated with image-based tumor and radiology features to improve its predictive accuracy. This retrospective multicenter study suggests that combining easily accessible clinico-pathological features within an ML model may reliably predict DFS and OS in HoR-positive/HER2-negative BC.
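The C-index used above to measure discrimination can be illustrated with a short sketch. What follows is a minimal pure-Python version of Harrell's concordance index for right-censored data. It is not the authors' code (the abstract does not state which software was used), and the function name, follow-up times, event flags and risk scores are all hypothetical. Pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[bool],
                    risk_scores: Sequence[float]) -> float:
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    A pair (i, j) is comparable when patient i had an observed event and a
    strictly shorter follow-up time than patient j. The pair is concordant
    when patient i also has the higher predicted risk; tied risk scores
    count as half a concordant pair.
    """
    if not (len(times) == len(events) == len(risk_scores)):
        raise ValueError("times, events and risk_scores must have equal length")

    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot be the earlier member of a pair.
            continue
        for j in range(n):
            if times[i] < times[j]:  # tied times are skipped (simplification)
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, event flag, predicted risk score.
toy_times = [12, 30, 45, 60, 60]
toy_events = [True, True, False, True, False]
toy_risk = [30.0, 11.0, 10.0, 12.0, 5.0]

toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(f"{toy_c:.3f}")  # 6 of 7 comparable pairs are concordant -> 0.857
```

A value of 0.5 corresponds to random ordering of patients and 1.0 to perfect ordering. Some RSF implementations report performance error as 1 minus the C-index; whether that convention applies to the OOB performance errors reported above is an assumption, since the abstract does not define the term.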
Citation Format: Luca Mastrantoni, Giovanna Garufi, Noemi Maliziola, Elena Di Monte, Giorgia Arcuri, Valentina Frescura, Angelachiara Rotondi, Giulia Giordano, Luisa Carbognin, Alessandra Fabi, Ida Paris, Gianluca Franceschini, Armando Orlandi, Antonella Palazzo, Giovanni Scambia, Giampaolo Tortora, Emilio Bria. Development of Artificial Intelligence-based Machine Learning Models for Predicting Survival In Hormone-Receptor-Positive/HER2-Negative Early Breast Cancer undergoing Neoadjuvant Chemotherapy [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO2-01-14.