Breast cancer is the most common malignancy affecting women worldwide and is notable for its morphologic and biologic diversity, with varying risks of recurrence following treatment. The Oncotype DX Breast Recurrence Score test is an important predictive and prognostic genomic assay for estrogen receptor-positive/HER2-negative (ER+/HER2-) breast cancer that guides therapeutic strategies; however, such tests can be expensive, can delay care, and are not widely available. The aim of this study was to develop a multi-model approach that integrates the analysis of whole-slide images and clinicopathologic data to predict patients' breast cancer recurrence risk and to categorize them into two groups according to the predicted score: low-risk and high-risk. The proposed methodology uses convolutional neural networks for feature extraction and vision transformers for contextual aggregation, complemented by a logistic regression model that analyzes clinicopathologic data for classification into the two risk categories. The method was trained and tested on 956 hematoxylin and eosin-stained whole-slide images from 950 ER+/HER2- breast cancer patients who had prior Oncotype DX testing and corresponding clinicopathologic features. The model's performance was evaluated on an internal test set of 192 patients from Dartmouth Health and an external test set of 405 patients from the University of Chicago. The multi-model approach achieved an AUC of 0.91 (95% CI: 0.87-0.95) on the internal test set and an AUC of 0.84 (95% CI: 0.78-0.89) on the external test set for distinguishing low-risk from high-risk recurrence categories as defined by the Oncotype DX recurrence score. With further validation, the proposed methodology could provide an alternative that assists clinicians in personalizing treatment for breast cancer patients and potentially improves their outcomes.
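For readers unfamiliar with how a result such as "AUC of 0.91 (95% CI: 0.87-0.95)" can be obtained, the following is a minimal sketch of an AUC estimate with a percentile-bootstrap confidence interval. The abstract does not state how the intervals were computed, so the bootstrap is an assumption here, and the function names (`auc`, `bootstrap_auc_ci`) and the synthetic labels and scores are purely illustrative, not the study's data or code.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney identity: the fraction of
    (high-risk, low-risk) pairs ranked correctly, ties counting half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Synthetic example: 1 = high-risk, 0 = low-risk; scores are made up.
    rng = random.Random(42)
    labels = [1 if rng.random() < 0.3 else 0 for _ in range(200)]
    scores = [rng.gauss(0.65 if y else 0.40, 0.15) for y in labels]
    low, high = bootstrap_auc_ci(labels, scores)
    print(f"AUC = {auc(labels, scores):.2f} (95% CI: {low:.2f}-{high:.2f})")
```

The pairwise AUC computation is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; a rank-based implementation would scale better for larger sets.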