Abstract

Molecular classification of breast cancer into clinically relevant subtypes helps improve prognostic assessment and adjuvant-treatment decisions. The aim of this study is to better characterize the molecular subtypes by providing a comprehensive landscape of subtype-specific isoforms, including coding, long non-coding RNA and microRNA transcripts. Isoform-level expression of all coding and non-coding RNAs is estimated from RNA-sequencing data of 1168 breast samples obtained from The Cancer Genome Atlas (TCGA) project. We then systematically search the whole transcriptome for subtype-specific isoforms using a novel algorithm based on a robust quasi-Poisson model. We discover 5451 isoforms specific to single subtypes. A total of 27% of the subtype-specific isoforms classify the intrinsic subtypes more accurately than their corresponding genes. We find three subtype-specific miRNAs and 707 subtype-specific long non-coding RNAs. The isoforms from long non-coding RNAs also separate the Luminal A and Luminal B subtypes well, with an AUC of 0.97 in the discovery set and 0.90 in the validation set. In addition, we discover 1500 isoforms preferentially co-expressed in two subtypes, including 369 isoforms co-expressed in both the Normal-like and Basal subtypes, which are commonly considered to have distinct ER status. Finally, analyses at the protein level reveal four subtype-specific proteins and two subtype co-expression proteins that successfully validate results from the isoform level.

Highlights

  • An isoform is said to be specific to a single subtype if it satisfies these two conditions: (i) it is significantly over-expressed in that subtype compared to all the other subtypes, and (ii) the other subtypes cannot be separated based on that isoform

  • From the top 12 isoforms, we further investigate those belonging to the genes lactate dehydrogenase B (LDHB) and follistatin (FST), as both have previously been reported at the gene level
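
The two-condition definition of a subtype-specific isoform given above can be illustrated with a small sketch. This is not the study's quasi-Poisson algorithm: as a stand-in it uses a rank-based AUC, and the function names, the input layout and the thresholds `hi` and `tol` are all assumptions made for illustration only.

```python
from itertools import combinations


def auc(pos, neg):
    """Probability that a value drawn from pos exceeds one from neg (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def is_subtype_specific(expr, target, hi=0.9, tol=0.2):
    """Heuristic check of the two conditions for one isoform.

    expr   : dict mapping subtype name -> list of expression values
    target : the subtype being tested
    hi, tol: illustrative thresholds (assumptions, not values from the study)
    """
    others = [s for s in expr if s != target]
    # (i) the target subtype is over-expressed relative to every other subtype
    over = all(auc(expr[target], expr[s]) >= hi for s in others)
    # (ii) the remaining subtypes cannot be separated from one another
    flat = all(
        abs(auc(expr[a], expr[b]) - 0.5) <= tol
        for a, b in combinations(others, 2)
    )
    return over and flat


# Toy expression values, invented purely for this example
example = {
    "Basal": [50, 60, 55, 70],
    "LumA": [5, 7, 6, 8],
    "LumB": [6, 5, 8, 7],
    "Her2": [7, 6, 5, 8],
}
print(is_subtype_specific(example, "Basal"))
```

A significance test on a fitted count model, as the study describes, would replace both AUC thresholds; the sketch only shows how the two conditions combine.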

Introduction

One in eight women will develop an invasive breast cancer during their lifetime and, despite the implementation of screening and prevention programs [1], more than 131,000 women died of breast cancer in Europe in 2012 [2] and approximately 40,000 deaths are expected in the United States in 2016 [3]. In these two regions, breast cancer is, respectively, the first and second most common cause of cancer-related death among women. One reason is that breast cancer is so complex and heterogeneous in terms of molecular alterations and clinical outcomes [4] that it should be considered not as a single disease but rather as a group of molecularly distinct neoplasms [5]. In the last decade, many studies have investigated the distinct breast-cancer subtypes through their characteristic molecular profiles and their clinical correlation with prognosis and response to therapy. These molecularly defined subtypes differ in the expression of well-known and therapeutically important receptors: estrogen receptor (ER) and human epidermal growth factor receptor 2 (HER2). Based on gene expression signatures, at least five independent intrinsic molecular subtypes have been defined: Normal-like, Luminal A and B (mostly ER+), Basal (mostly ER- and HER2-) and Her2/ERBB2
