Abstract

Background: Adjuvant chemotherapy is a common treatment for breast cancer patients with aggressive tumors, but it imposes severe side effects. Although lymph node involvement is associated with a higher likelihood of mortality, a subset of lymph node-positive (LN+) patients can survive without relapse, even without adjuvant chemotherapy. SWOG-8814 (S8814) was a randomized clinical trial in post-menopausal women with hormone receptor-positive, LN+ breast cancer (pathologic stage T1-3N1-2) that compared patient outcomes among three arms: endocrine therapy alone (tamoxifen), chemotherapy (cyclophosphamide, doxorubicin, and 5-fluorouracil) followed by 5 years of tamoxifen, and chemotherapy with concurrent tamoxifen. In this work, we evaluated the ability of a machine learning approach called multiple instance learning (MIL) to predict long-term overall survival (OS) and disease-free survival (DFS) in patients from the S8814 clinical trial via analysis of digitized hematoxylin and eosin (H&E) tissue slides.

Methods: The training set (St, n=121) consisted of digitized H&E whole slide images (WSIs) of estrogen receptor-positive (ER+), LN+ patients from ECOG-2197, a clinical trial comparing patient outcomes under chemotherapy with doxorubicin/docetaxel vs. doxorubicin/cyclophosphamide. For each WSI, the nuclei in ten tumor patches (2000 × 2000 pixels) were automatically segmented by a pretrained convolutional neural network (CNN). The CNN segmentations were used to obtain ten corresponding feature vectors, each comprising 3963 features relating to nuclear morphology and spatial arrangement. The five most discriminative features (Haralick texture, Delaunay triangulation, orientation entropy, perimeter kurtosis, architecture) were identified by ReliefF-MI feature filtering and forward selection on St. A normalized set kernel support vector machine was trained on St to construct a MIL binary classifier (M+), which generates a continuous survival score to predict which patients would have >10 years of DFS and OS.
The score cutoff that maximized the F1 score on St was used to generate binary predictions of patient outcome. Blinded validation of the risk scores on S8814 was performed by SWOG.

Results: M+ was significantly prognostic of OS in univariate analysis on S8814 (HR=0.72, p=0.020, 95% CI 0.55-0.95). M+ stratification remained statistically significant in multivariable analysis controlling for treatment, number of positive lymph nodes, and tumor size (HR=0.71, p=0.017, 95% CI 0.54-0.94). M+ predictions did not achieve statistical significance for DFS in either univariate analysis (HR=0.81, p=0.081, 95% CI 0.64-1.03) or multivariable analysis (HR=0.81, p=0.080, 95% CI 0.63-1.03). M+ was not predictive of chemotherapy benefit for OS (p=0.88) or DFS (p=0.65). The continuous score showed no correlation with Recurrence Score (r=-0.05), which was previously shown to be a predictive factor for chemotherapy benefit.

Conclusion: Our findings indicate that nuclear morphology carries prognostic information for overall survival in ER+ LN+ breast cancer patients. Future work will involve combining the MIL-based morphologic predictor with other clinicopathologic factors to improve DFS risk stratification.

Funding: NIH/NCI grants U10CA180888, U10CA180819, U24CA196175; and in part by Genomic Health, Inc. (now Exact Sciences Corp.).

Citation Format: Daniel Shao, William E. Barlow, Haojia Li, Cheng Lu, Kathy S. Albain, James Rae, Daniel F. Hayes, Andrew K. Godwin, Alastair M. Thompson, Anant Madabhushi, Lajos Pusztai. Computer analysis of nuclear morphology with Multiple Instance Learning Predicts Overall Survival for Node Positive Breast Cancer Patients from SWOG S8814: A Blinded Validation Study [abstract]. In: Proceedings of the 2022 San Antonio Breast Cancer Symposium; 2022 Dec 6-10; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2023;83(5 Suppl):Abstract nr P2-11-11.
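
Illustrative note: two steps named in the Methods, the normalized set kernel over per-slide bags of patch feature vectors and the F1-maximizing score cutoff, can be sketched roughly as follows. This is a minimal sketch and not the authors' implementation. The abstract does not state the instance-level kernel or the normalization used, so the RBF base kernel, the `gamma` value, the feature-space normalization, and all function names below are assumptions made for illustration only.

```python
import math


def rbf(x, y, gamma=0.5):
    """Instance-level RBF kernel between two patch feature vectors.
    The choice of an RBF base kernel is an assumption."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def set_kernel(bag_a, bag_b, gamma=0.5):
    """Set kernel: sum of instance kernels over all cross-bag pairs."""
    return sum(rbf(x, y, gamma) for x in bag_a for y in bag_b)


def normalized_set_kernel(bag_a, bag_b, gamma=0.5):
    """Set kernel divided by sqrt(k(A, A) * k(B, B)), so k(A, A) == 1.
    This normalization is one common choice, assumed here."""
    norm = math.sqrt(set_kernel(bag_a, bag_a, gamma) *
                     set_kernel(bag_b, bag_b, gamma))
    return set_kernel(bag_a, bag_b, gamma) / norm


def f1_score(labels, preds):
    """F1 for binary labels, with 1 as the positive class."""
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def best_f1_cutoff(scores, labels):
    """Try each observed score as a cutoff (score >= cutoff -> positive)
    and return the (cutoff, F1) pair with the highest training-set F1."""
    best_cut, best_f1 = None, -1.0
    for cut in sorted(set(scores)):
        preds = [1 if s >= cut else 0 for s in scores]
        f1 = f1_score(labels, preds)
        if f1 > best_f1:
            best_cut, best_f1 = cut, f1
    return best_cut, best_f1
```

In a workflow like the one described, the kernel values between all pairs of training slides would form a precomputed Gram matrix for a support vector machine, and the cutoff would be chosen on the training set only before being applied unchanged to the validation cohort.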