Abstract

Background: The Oncotype DX (ODX) test is a 21-gene expression assay widely used to predict recurrence risk in early-stage breast cancer, but it may be possible to identify patients who can forgo testing using only clinicopathologic variables. In 2018, the National Cancer Database (NCDB) began reporting quantitative histologic parameters for estrogen receptor (ER), progesterone receptor (PR), and Ki-67 expression in breast cancer patients. Including these variables may improve the development of nationally applicable models that predict ODX results from clinicopathologic variables alone.

Methods: Using a cohort of NCDB patients diagnosed from 2018–2020 with hormone receptor (HR)-positive, HER2-negative, stage I–III breast cancer, we trained machine learning models to predict a high-risk ODX score (26–100). A subset comprising 80% of patients was used for model training, while the remaining data were set aside for internal validation. An external validation cohort was selected from the University of Chicago Medical Center (UCMC), including patients diagnosed from 2009–2021. Feature selection, model architecture selection, and hyperparameter tuning were performed using 10-fold cross-validation within the NCDB training set. To reflect the data likely to be available across a variety of practice patterns, we compared three models: one with quantitative ER, PR, and Ki-67; one with only quantitative ER and PR; and one without quantitative immunohistochemistry. The primary endpoint was the area under the receiver operating characteristic curve (AUROC) for prediction of high-risk ODX results in the UCMC validation cohort. Models were also evaluated as rule-out tests to identify low-risk patients who did not require further ODX testing, using a high-sensitivity (90%) threshold fit in the NCDB training dataset.
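The rule-out threshold described in the Methods can be illustrated with a minimal sketch. It assumes each model outputs a continuous score where higher means more likely high-risk ODX, and that the cutoff is the largest score still capturing at least 90% of true high-risk cases in the training data. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math

def fit_rule_out_threshold(scores, is_high_risk, target_sensitivity=0.90):
    """Return the largest cutoff t such that calling `score >= t` high risk
    captures at least `target_sensitivity` of the true high-risk cases."""
    positives = sorted(s for s, y in zip(scores, is_high_risk) if y)
    if not positives:
        raise ValueError("need at least one high-risk case to fit a threshold")
    # Number of high-risk cases that must sit at or above the cutoff.
    # The small epsilon guards against floating-point overshoot in the product.
    k = max(1, math.ceil(target_sensitivity * len(positives) - 1e-9))
    # The k-th highest positive score is the largest admissible cutoff.
    return positives[len(positives) - k]

def sensitivity(scores, is_high_risk, threshold):
    """Fraction of true high-risk cases with score >= threshold."""
    positives = [s for s, y in zip(scores, is_high_risk) if y]
    return sum(s >= threshold for s in positives) / len(positives)

# Toy training data (illustrative only): 10 high-risk and 3 low-risk cases.
train_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 0.05, 0.15, 0.25]
train_labels = [1] * 10 + [0] * 3

threshold = fit_rule_out_threshold(train_scores, train_labels)
achieved = sensitivity(train_scores, train_labels, threshold)
```

Patients scoring below the fitted cutoff would be the ones "ruled out" as unlikely to have a high ODX result. Fixing the cutoff in the training data and then applying it unchanged to a validation cohort keeps the validation estimate free of threshold tuning.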
Results: We identified 53,346 patients from the NCDB cohort meeting the inclusion criteria; 7% had a high-risk ODX score, and median follow-up was 28 months. The UCMC validation cohort included 896 patients and was more diverse, with 30% non-Hispanic Black patients (versus 8% in NCDB), more high-risk patients (18% with high ODX), and a longer median follow-up of 55 months. In the NCDB validation cohort, models incorporating quantitative ER/PR (AUROC 0.78, 95% CI 0.77–0.80) and quantitative ER/PR/Ki-67 (AUROC 0.81, 95% CI 0.80–0.83) both performed better than the non-quantitative model (AUROC 0.70, 95% CI 0.68–0.72). These results were preserved in the external UCMC cohort, where the ER/PR model (AUROC 0.86, 95% CI 0.80–0.92, p = 0.032) and the ER/PR/Ki-67 model (AUROC 0.87, 95% CI 0.81–0.93) outperformed the non-quantitative model (AUROC 0.80, 95% CI 0.73–0.87, p = 0.009). At the high-sensitivity rule-out threshold, the ER/PR model predicted that 30% of patients in the UCMC cohort would have low ODX, and the ER/PR/Ki-67 model predicted 44% as low risk; negative predictive value for high ODX was over 96%. Of the patients predicted to be low risk by the quantitative models, none had a documented high ODX score, and recurrence was < 3% at 5 years. The hazard ratio for recurrence-free interval, adjusted for age and comorbidity score, of patients predicted to be high risk by this threshold was 2.96 (95% CI 1.02–8.58) for the ER/PR model and 3.84 (95% CI 1.48–9.97) for the ER/PR/Ki-67 model.

Conclusions: We present externally validated and nationally applicable models that identify approximately half of HR-positive/HER2-negative breast cancer patients who are unlikely to have high ODX results using widely available quantitative clinicopathologic variables. Patients identified as low risk by these models have excellent long-term outcomes and may be able to forgo adjuvant chemotherapy without further genomic testing.
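The rule-out metrics reported in the Results (the share of patients predicted low risk and the negative predictive value) can be computed as in the sketch below. It assumes a continuous model score and a fixed cutoff carried over from training; the function name, the cutoff of 0.5, and the six-patient toy cohort are hypothetical and do not reproduce the study data.

```python
def rule_out_summary(scores, is_high_risk, threshold):
    """Summarise a rule-out test that calls `score < threshold` low risk.

    Returns the fraction of patients ruled out, the negative predictive
    value (share of predicted-low patients who are truly not high risk),
    and the sensitivity for high-risk cases.
    """
    pairs = list(zip(scores, is_high_risk))
    predicted_low = [bool(y) for s, y in pairs if s < threshold]
    positives = [s for s, y in pairs if y]

    fraction_ruled_out = len(predicted_low) / len(pairs)
    # NPV is undefined when nobody is ruled out.
    npv = (sum(not y for y in predicted_low) / len(predicted_low)
           if predicted_low else float("nan"))
    # Sensitivity is undefined when there are no high-risk cases.
    sens = (sum(s >= threshold for s in positives) / len(positives)
            if positives else float("nan"))
    return {"fraction_ruled_out": fraction_ruled_out,
            "npv": npv,
            "sensitivity": sens}

# Toy validation cohort (illustrative only).
val_scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
val_labels = [0, 0, 1, 0, 1, 1]

summary = rule_out_summary(val_scores, val_labels, threshold=0.5)
```

A high negative predictive value together with a large ruled-out fraction is what makes such a test useful in practice: many patients avoid further testing, and few of them would have had a high-risk result.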
Citation Format: Asim Dhungana, Augustin Vannier, Fangyuan Zhao, Jincong Freeman, Poornima Saha, Megan Sullivan, Katharine Yao, Elbio Flores, Olufunmilayo Olopade, Dezheng Huo, Alexander Pearson, Frederick Howard. Development and Validation of a Breast Cancer Recurrence Model Demonstrates Accurate Identification of Patients with Favorable Long-Term Outcomes [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO1-01-11.