Abstract Introduction: Patients with basal-like breast cancer (BLBC), predominantly represented by triple-negative breast cancer, have a high recurrence rate and a poor prognosis. There is an urgent need to uncover reliable prognostic biomarkers that can help in the clinical management of such patients and identify additional therapeutic targets. The objective of this study was to create a large-scale, comprehensive transcriptomic database and leverage it to identify and prioritize cancer-related genes associated with BLBC patients' outcomes. Methods: We identified breast cancer cohorts from public repositories that contained transcriptome-level gene expression data along with clinical follow-up information. BLBC samples were identified using the PAM50 signature. All samples were processed with a standard array normalization, scaled so that the mean expression across all genes was 1000 in each sample, and incorporated into a unified database. Redundant samples were removed. For each gene, univariate Cox survival analysis was conducted. To account for multiple hypothesis testing, the false discovery rate was computed, and a cutoff of 1% was used to define statistical significance. Associations with relapse-free survival (RFS) and overall survival (OS) were assessed. Multivariate analysis including clinical and pathological variables was performed for selected genes. To uncover higher-level functions related to altered RFS, gene ontology analysis was performed using the enrichGO function in the TNM plotter (http://www.tnmplot.com). Results: The complete integrated database comprises 1,899 samples from 52 breast cancer datasets. Altogether, 2,342 genes were correlated with RFS, and 1,149 genes were correlated with OS. A total of 619 genes were statistically significant for both RFS and OS.
The most significant genes were ANGPTL4 (p=4.25E-08, HR=2.02), NHP2 (p=5.98E-10, HR=1.93), STK3 (p=4.86E-10, HR=1.93), GBE1 (p=2.77E-09, HR=1.86), and PMVK (p=3.65E-09, HR=1.85) for RFS, and PINK1 (p=1.64E-05, HR=3.31), CAMK2N1 (p=1.06E-07, HR=2.93), CACFD1 (p=4.79E-04, HR=2.61), SCAP (p=3.29E-04, HR=2.6), and SDC1 (p=2.81E-04, HR=2.57) for OS. The most significant gene ontology biological processes upregulated in tumors with a worse prognosis include GO:0000184, nuclear-transcribed mRNA catabolic process, nonsense-mediated decay (p=6.64E-18); GO:0045047, protein targeting to ER (p=6.64E-18); GO:0006614, SRP-dependent cotranslational protein targeting to membrane (p=9.08E-18); GO:0072599, establishment of protein localization to endoplasmic reticulum (p=1.30E-17); and GO:0006613, cotranslational protein targeting to membrane (p=1.90E-17). Conclusions: Our results help to prioritize promising genes and to deprioritize those most likely to fail in studies aiming to establish new clinically useful biomarkers and therapeutic targets in BLBC. Citation Format: Balazs Gyorffy, Libero Santarpia. Uncovering Novel Potential Prognostic Biomarkers in Basal-Like Breast Cancer using Transcriptomic Data of 1,899 Patients [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO2-03-08.
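Illustrative note on the Methods: two of the steps described there, per-sample scaling to a mean expression of 1000 and false discovery rate control at 1%, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg method is assumed here, and the function names `scale_to_mean` and `benjamini_hochberg` are hypothetical. The p-values in the usage lines are made-up illustration values, not study results.

```python
def scale_to_mean(values, target=1000.0):
    """Rescale one sample's expression values so their mean equals `target`."""
    mean = sum(values) / len(values)
    if mean == 0:
        raise ValueError("cannot scale a sample whose mean expression is zero")
    factor = target / mean
    return [v * factor for v in values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the abstract only states that a false discovery rate was
    computed; Benjamini-Hochberg is a common choice, used here for illustration.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Usage with made-up numbers (one sample, then four hypothetical gene p-values).
scaled = scale_to_mean([1.0, 2.0, 3.0])
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q <= 0.01 for q in fdr]  # 1% FDR cutoff, as in the Methods
```

In a real analysis the p-values would come from one univariate Cox model per gene, and the 1% cutoff would be applied to the adjusted values across all genes tested.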