Abstract
Higher standards for precisely distinguishing and classifying the four breast cancer subtypes, i.e., Luminal A, Luminal B, HER2-enriched, and Basal, are essential to delivering customized treatment in precision oncology. The effective integration of readily accessible multi-omics datasets from the same patient may improve the accuracy of subtype prediction algorithms and help identify reliable molecular features for the prediction and prognosis of breast cancer. Previous models for predicting breast cancer subtypes relied mostly on concatenation and autoencoder methods and rarely used graph-based integration to leverage the biological associations among omics data types. To overcome these limitations, we developed GAIN-BRCA, a graph-based multi-omics integration method that uses features from mRNA, DNA methylation, and miRNA data together with features synthesized from their interactions. GAIN-BRCA computes weight scores from miRNA-mRNA and DNA methylation-mRNA interactions to derive a new transformed feature vector without compromising biological information. A neural network was designed and trained on the transformed features to classify breast cancer subtypes. Neural network prediction accuracies were compared using concatenated, autoencoder-based, and GAIN-BRCA-based integrated datasets. With an accuracy of 0.91, GAIN-BRCA outperforms the existing similar models MOGONET (Acc: 0.72) and moBRCA-net (Acc: 0.86). The transformed features were then processed using SHAP, and the top 357 features were selected for functional characterization and pathway analysis. We identified specific pathways and genes related to each breast cancer subtype separately.
The most enriched pathways in each subtype are the ribonucleotide reductase signaling and glucocorticoid signaling pathways in luminal A; the PFKB4 signaling pathway in luminal B; adipogenesis and DNA methylation and transcriptional signaling in HER2+; and the role of OCT4 in mammalian embryonic stem cell pluripotency in basal. Similarly, the most discriminative gene features are BRAF, ESR1, KRAS, MAPK1, SMARCE1, PPP3CA, and TGFBR2 in luminal A; TGFB3 and NCOA3 in luminal B; HNF4A, SIN3A, and SOX9 in HER2+; and POU5F1 and FOXA2 in basal. The GAIN-BRCA framework, combined with SHAP, weighs independent omics as cross-disciplinary rather than multidisciplinary and presents a distinctive picture for each subtype. As a result, the framework presents the importance of each feature for the corresponding subtype, which might lead to a greater understanding of the molecular basis of breast cancer subtypes.

Citation Format: Jai Chand Patel, Sushil Kumar Shakyawar, Sahil Sethi, Chittibabu Guda. GAIN-BRCA: A graphical explainable AI-net framework for breast cancer subtype classification using multiomics data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 3526.