Abstract Breast cancers (BCs) can be divided into subtypes based on their intrinsic molecular features. When evaluating HER2 status, clinicians often rely on IHC, which is plagued by reproducibility issues. The widely used PAM50 transcriptomic system also does not include a HER2-low subtype. Addressing these limitations is particularly important given the recent approval of targeted therapy for HER2-low BCs. Here, we developed a transcriptomic classification system that can identify HER2-low BCs. Accurate identification of HER2-low BCs can help physicians navigate treatment selection and, in turn, allow patients to benefit from the new targeted therapy.
We analyzed 6,361 BC samples with clinical and pathological annotation, including FISH and IHC data, from three public datasets (TCGA BRCA and SCAN-B [total n = 4,457]; METABRIC [n = 1,904]) for gene expression, differential expression, gene enrichment, copy number alterations, and mutations. Samples from TCGA and SCAN-B were categorized into subtypes by an algorithm that customized a gene set, applied UMAP dimensionality reduction, and clustered the samples with HDBSCAN. Each resulting cluster that was associated with a specific gene set was named and excluded from the next analysis. After four iterations, five subtypes had been identified sequentially: Basal, HER2-high, HER2-low, Luminal B (LumB), and Luminal A (LumA). A LightGBM-based hierarchical classifier was trained on the TCGA and SCAN-B cohorts and validated on the METABRIC dataset. Samples with high ERBB2 expression and copy number amplification were classified as HER2-high, and samples with similar PAM50 expression profiles but lacking ERBB2 expression were considered HER2-low. Luminal samples were grouped into two subtypes: LumA, characterized by low proliferation rates, and LumB, characterized by high proliferation rates.
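The "name a cluster, then exclude it from the next analysis" procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the UMAP and HDBSCAN steps are replaced by a simple stand-in rule (mean expression over a gene set above a threshold), and the function name `peel_subtypes`, the feature names `a` to `d`, the thresholds, and the toy values are all hypothetical. Only the sequential order of the five subtype labels is taken from the abstract.

```python
from statistics import mean


def peel_subtypes(samples, steps, residual_label):
    """Label samples by repeatedly splitting off one group and excluding it.

    samples: dict of sample id -> dict of feature name -> expression value
    steps: ordered list of (label, gene_set, threshold), one per iteration
    residual_label: label given to samples left after the last iteration
    """
    remaining = dict(samples)
    labels = {}
    for label, gene_set, threshold in steps:
        # Stand-in for "reduce with UMAP, cluster with HDBSCAN, name the
        # cluster tied to the gene set": a sample joins the group when its
        # mean expression over the gene set exceeds a fixed threshold.
        picked = [
            sid for sid, expr in remaining.items()
            if mean(expr[g] for g in gene_set) > threshold
        ]
        for sid in picked:
            labels[sid] = label
            del remaining[sid]  # excluded from every later iteration
    for sid in remaining:
        labels[sid] = residual_label
    return labels


# Toy data with made-up feature names and values (not from the study).
toy = {
    "s1": {"a": 9.0, "b": 1.0, "c": 1.0, "d": 1.0},
    "s2": {"a": 1.0, "b": 8.0, "c": 7.0, "d": 2.0},
    "s3": {"a": 1.0, "b": 1.0, "c": 7.0, "d": 2.0},
    "s4": {"a": 1.0, "b": 1.0, "c": 1.0, "d": 9.0},
    "s5": {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0},
}
# Four iterations yield five labels: the fifth is the residual group.
steps = [
    ("Basal", ["a"], 5.0),
    ("HER2-high", ["b"], 5.0),
    ("HER2-low", ["c"], 5.0),
    ("LumB", ["d"], 5.0),
]
result = peel_subtypes(toy, steps, residual_label="LumA")
```

In this toy run, sample `s2` would also pass the third rule, but it has already been assigned in the second iteration; that is the effect of excluding each named cluster before the next analysis.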
HER2-high, Basal, LumA, and LumB subtypes exhibited expected mutational and gene expression profiles. We observed ERBB2 copy number amplification and high gene expression in HER2-high samples from the TCGA and METABRIC cohorts, as confirmed by FISH and IHC in 97% and 93% of samples, respectively. Conversely, we did not see high ERBB2 expression or copy number amplification in the HER2-low samples, which was concordant with an absence of HER2 3+ expression by IHC. HER2-low samples had high mutational frequencies in PTEN, KMT2C, and ERBB2 (12%, 14%, and 8%, respectively; adj. p < 0.005, chi-square test). HER2-low samples also showed increased EGFR (logFC = 2.8, adj. p < 0.001) and CLDN8 (logFC = 3.4, adj. p < 0.001) expression levels. The novel classifier identified 55% of the HER2-low samples as the luminal androgen receptor (LAR) Burstein subtype, as a result of higher AR expression than in HER2-high samples (logFC = 2.9, p = 0.04). Classification of SCAN-B luminal samples into the LumA and LumB subtypes also predicted Ki-67 positivity (≥ 20% of Ki-67+ malignant cells by IHC) better than the previously reported PAM50 classification (F1-score = 0.77 vs. 0.62).
Our refined classifier effectively defined five breast cancer subtypes, including the HER2-low subtype, based on their molecular profiles. By determining HER2 status with transcriptomic data, our classifier may help minimize IHC-related reproducibility issues and, in turn, facilitate more precise treatment decision-making. Given its concordance with copy number amplification, IHC, and FISH data, our classification system can distinguish between HER2-high and HER2-low BCs. Since HER2-low BCs may represent a unique group that requires different treatment approaches, further optimization of our classifier may assist in identifying effective therapies for patients with HER2-low BCs.
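The Ki-67 comparison above is reported as an F1-score, the harmonic mean of precision and recall. The sketch below shows how such a score could be computed for a luminal cohort. It assumes that a LumB call counts as a predicted Ki-67-positive sample, which the abstract does not state explicitly. The function name `f1_score` and the toy values are illustrative only and are not data from the study; the 20% cutoff is the one given in the abstract.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for boolean labels; 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy luminal samples (made-up values, not from the study).
ki67_percent = [35, 25, 22, 10, 5, 15]
subtype = ["LumB", "LumB", "LumA", "LumB", "LumA", "LumA"]

# Ground truth: Ki-67 positive when at least 20% of malignant cells stain.
y_true = [pct >= 20 for pct in ki67_percent]
# Assumption: a LumB call is treated as predicting Ki-67 positivity.
y_pred = [s == "LumB" for s in subtype]

score = f1_score(y_true, y_pred)
```

Running the same function on two sets of subtype calls for the same samples, for example the new classifier and PAM50, gives directly comparable scores.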
Citation Format: Polina Turova, Vladimir Kushnarev, Oleg Baranov, Anna Butusova, Sofia Menshikova, Sheila Yong, Anna Love, Konstantin Chernyshov, Nikita Kotlov. Novel Transcriptomic Classification System Unveils HER2-Low Breast Tumors [abstract]. In: Proceedings of the 2023 San Antonio Breast Cancer Symposium; 2023 Dec 5-9; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2024;84(9 Suppl):Abstract nr PO2-15-12.