Abstract

Background: RNA-Seq is a powerful technology for accurately quantifying the transcriptome of human cancers. The human breast cancer section of The Cancer Genome Atlas (TCGA) has generated RNA-Seq data for over 950 samples from primary tumors and adjacent normal tissues. Analysis of these data has provided insights into transcript isoforms and gene fusions related to the etiology of breast cancer. These rich data likely contain further insights into the expression of cancer genes and how it relates to clinical outcomes.

Methods: TCGA RNA-Seq data were downloaded from the TCGA data portal. A total of 712 samples were available when this project started, including data from 607 primary breast tumors, of which 108 were Basal-like, 316 Luminal A, 139 Luminal B, and 54 HER2-enriched. Data from 65 adjacent normal tissues were included as controls. We examined exon-level expression for a group of 137 strongly cancer-related genes that can be grouped into 12 pathways (Vogelstein et al. 2013). We focused on exons whose expression variation differs significantly (p < 0.05) from that of the other exons of the same gene (more than a 2-fold change in standard deviation, based on bootstrap resampling). We refer to these exons as HVEEs (highly variably expressed exons). Using both unsupervised and supervised algorithms (e.g. non-negative matrix factorization), we classified the samples from the various subtypes based on the expression patterns of the HVEEs and looked for patterns associated with clinical outcomes.

Results: For many cancer genes, such as BRCA2 and PTEN, the distribution of within-gene exon expression is similar across all tumor samples and the control group. Subtype-specific amplification of oncogenes (e.g. ERBB2 in the HER2 subtype) and lower expression of cancer genes (e.g. AR in the basal subtype) were also observed. In general, exon expression varies little across genes and across samples.
Out of the 2,153 exons from the 137 cancer genes, we identified only 41 genes with HVEEs. Most of these genes have only one HVEE, but a small number have multiple HVEEs (48 in total). Interestingly, the HVEEs in 31 of these genes are consistent across the four breast cancer subtypes as well as the normal group. However, the median expression of these HVEEs across the normal samples is in general significantly lower than across the tumor samples of any tumor subtype. Further, using the exon expression data of the HVEEs alone, we identified distinct clusters within the basal-like subtype. The 108 basal-like samples can be classified into two clusters of 41 and 67 samples, based on the expression level of the HVEEs from the above-mentioned 31 genes. Overall survival for patients in these two clusters shows a trend toward a difference that does not reach significance at the 0.05 level (log-rank test, p = 0.058).

Discussion: The consistency of HVEEs across breast cancer and normal samples, and the ability of HVEE expression levels to stratify basal-like breast cancer in a clinically meaningful way, are intriguing. We are extending the study to a larger dataset for validation.

The views expressed in this abstract are those of the authors and do not reflect the official policy of the Department of Defense or the US Government.

Citation Information: Cancer Res 2013;73(24 Suppl): Abstract nr P4-04-16.
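
The HVEE criterion in the Methods can be illustrated with a minimal sketch. The abstract does not specify the exact bootstrap procedure, so the details below are assumptions rather than the authors' implementation: samples are resampled with replacement, each exon's standard deviation is compared with the median standard deviation of the other exons in the same gene, and a one-sided bootstrap p-value is taken as the fraction of replicates in which the 2-fold threshold is not exceeded. The function name `find_hvees`, its parameters, and the simulated data are hypothetical.

```python
import random
import statistics


def find_hvees(expr, fold=2.0, alpha=0.05, n_boot=1000, seed=0):
    """Flag candidate HVEEs for one gene.

    expr: list of rows, one row per exon, one value per sample.
    Returns (indices of flagged exons, per-exon bootstrap p-values).
    Assumed criterion (not taken from the abstract): an exon is flagged
    when its SD exceeds `fold` times the median SD of the gene's other
    exons in all but a fraction `alpha` of bootstrap replicates.
    """
    n_exons = len(expr)
    n_samples = len(expr[0])
    if n_exons < 2:
        raise ValueError("need at least two exons per gene")
    rng = random.Random(seed)
    not_exceeded = [0] * n_exons
    for _ in range(n_boot):
        # Resample the samples (columns) with replacement.
        cols = [rng.randrange(n_samples) for _ in range(n_samples)]
        sds = [statistics.stdev([row[c] for c in cols]) for row in expr]
        for i in range(n_exons):
            # Reference variability: median SD of the other exons.
            reference = statistics.median(sds[:i] + sds[i + 1:])
            if sds[i] <= fold * reference:
                not_exceeded[i] += 1
    # Add-one correction keeps the estimated p-value above zero.
    p_values = [(k + 1) / (n_boot + 1) for k in not_exceeded]
    flagged = [i for i, p in enumerate(p_values) if p < alpha]
    return flagged, p_values


# Simulated gene with 6 exons and 100 samples; exon 3 is generated
# with a five-fold larger standard deviation than the others.
demo_rng = random.Random(42)
demo = [
    [demo_rng.gauss(10.0, 5.0 if exon == 3 else 1.0) for _ in range(100)]
    for exon in range(6)
]
flagged, p_values = find_hvees(demo, n_boot=500)
print(flagged, [round(p, 3) for p in p_values])
```

Using the median of the other exons as the reference is one design choice among several; it keeps a single highly variable exon from inflating the baseline against which the remaining exons are judged.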
