Abstract

Background: Machine learning (ML) in translational medicine has enabled prediction of clinical outcomes and identification of new biomarkers. We employ ML to predict pathologic complete response (pCR) in high-risk breast cancer patients in the neoadjuvant I-SPY2 TRIAL, where not all novel agents have strong predictive biomarkers. Leveraging an ML approach with progressively expanded candidate gene sets, we explore the limitations of using only known mechanisms of action to predict pCR, and the extent to which biology outside known drug action improves response prediction in the first 10 arms of the trial.

Methods: Random forest ML models were developed in I-SPY2 patients (n=982) with pre-treatment gene expression and pCR data across 10 treatment arms (PMID: 35623341), including inhibitors of HER2: neratinib (N), pertuzumab (P), T-DM1/P; AKT (MK-2206); IGF1R (ganitumab); HSP90 (ganetespib); PARP/DNA repair (veliparib/carboplatin, VC); ANG1/2 (trebananib, T); immune checkpoints (PD1-inh); and Control (Ctr). Each HR/HER2 receptor subtype and treatment arm subset (m=27) was evaluated independently. We employed a three-pronged feature-selection approach using (1) genes restricted to the known mechanism of action of individual I-SPY2 agents (k=10 to 88 genes); (2) genes expanded to include targeted pathways for all 10 agents/combinations (k=282); and (3) an unbiased whole genome approach (k=17,990). Samples were partitioned with 75% used for training and cross-validation, and 25% held out as test sets. Predictive ML models were defined as those with performance ≥ 0.90 based on different performance metrics (e.g., AUC, sensitivity, specificity).

Results: For each of the 27 subtype-treatment subsets, at least one high-performing model was identified.
In 6 subtype-treatment subsets, mechanism of action genes were sufficient to predict pCR: AKT/PI3K/HER genes in HR+HER2- N and HR-HER2+ P; DNA repair genes in HR+HER2- VC; angiogenesis-associated genes in HR+HER2+ T; and immune-associated genes in both HR+HER2- and HR-HER2- PD1-inh subsets. Expanded targeted pathway models were required to identify predictive models in 8 additional subtype-treatment pairs from the N, T-DM1/P, MK-2206, VC, T, and HER2+ Ctr arms, with significant contribution of DNA repair, immune, and HSP90 genes for multiple arms. A genome-wide approach was required for the remaining 13 subtype-treatment pairs, from the N, P, MK-2206, ganitumab, ganetespib, T, and HER2- Ctr arms, for which neither of the two narrower gene sets had yielded a predictive model. Even for subtype-treatment pairs where mechanism of action gene sets were sufficient for reasonable models, expanded gene sets resulted in improved performance. For instance, metabolism genes improved model performance for HR-HER2+ in N and Ctr, and for HR+HER2- in the PD1-inh arm; and mitochondrial and protein folding dysfunction genes improved response prediction in HR-HER2- in the ganetespib arm.

Conclusion: Our study identifies mechanism of action biomarkers associated with response to each drug and elucidates possible off-target effects contributing to observed drug sensitivity and resistance.

Citation Format: Rosalyn W. Sayaman, Denise M. Wolf, Christina Yau, Julia Wulfkhule, Emanuel F. Petricoin, Lamorna Brown-Swigart, Tam Binh Bui, Gillian L. Hirst, Diane Heditsian, W. Fraser Symmans, Angela DeMichele, Mark LaBarge, Laura J. Esserman, Laura van ‘t Veer. Machine learning elucidates biology of response within and outside the mechanisms of action of therapeutic agents in the I-SPY2 breast cancer TRIAL [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Advances in Breast Cancer Research; 2023 Oct 19-22; San Diego, California. Philadelphia (PA): AACR; Cancer Res 2024;84(3 Suppl_1):Abstract nr A066.
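
The tiered logic in the Methods and Results (score a model on held-out data, call it predictive at ≥ 0.90, and move to a larger gene set only when the narrower one falls short) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' pipeline: the tier labels, function names, and the `require_all` switch are hypothetical, the abstract does not say whether one metric or every metric must reach 0.90, and the random forest training step itself is omitted. The metric values in the usage lines are made-up toy numbers, not study results.

```python
from typing import Dict, Optional, Sequence

THRESHOLD = 0.90  # cutoff stated in the abstract
# Hypothetical labels for the three gene sets, ordered narrowest first.
TIERS = ("mechanism_of_action", "targeted_pathways", "whole_genome")


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true pCR cases (label 1) that were predicted as pCR."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)  # assumes at least one positive case


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true non-pCR cases (label 0) that were predicted as non-pCR."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)  # assumes at least one negative case


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))  # assumes both classes are present


def is_predictive(metrics: Dict[str, float], require_all: bool = False) -> bool:
    """Apply the 0.90 cutoff to one model's held-out metrics.

    Assumption: by default a single metric at or above the cutoff is enough;
    set require_all=True for the stricter reading.
    """
    passed = [value >= THRESHOLD for value in metrics.values()]
    return all(passed) if require_all else any(passed)


def first_sufficient_tier(
    tier_metrics: Dict[str, Dict[str, float]], require_all: bool = False
) -> Optional[str]:
    """Return the narrowest gene-set tier with a predictive model, else None."""
    for tier in TIERS:
        if tier in tier_metrics and is_predictive(tier_metrics[tier], require_all):
            return tier
    return None


# Toy held-out metrics for one hypothetical subtype-treatment subset.
toy = {
    "mechanism_of_action": {"auc": 0.81, "sensitivity": 0.75, "specificity": 0.80},
    "targeted_pathways": {"auc": 0.93, "sensitivity": 0.88, "specificity": 0.91},
    "whole_genome": {"auc": 0.95, "sensitivity": 0.92, "specificity": 0.90},
}
print(first_sufficient_tier(toy))
```

Under the default reading, the toy subset above resolves to the targeted-pathway tier, mirroring the 8 subsets described as needing the expanded pathway gene set. Passing `require_all=True` pushes the same subset to the whole-genome tier, which shows how much the tier counts depend on how the 0.90 criterion is interpreted.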