Abstract

Background: Breast cancer is intrinsically heterogeneous and is commonly classified into four main subtypes associated with distinct biological features and clinical outcomes. However, currently available data resources and methods are largely limited to molecular subtyping based on protein-coding genes, and little is known about the roles of long non-coding RNAs (lncRNAs), even though non-coding regions occupy 98% of the whole genome. lncRNAs may also play important roles in subgrouping cancer patients and may be associated with clinical phenotypes. Methods: The purpose of this project was to identify lncRNA gene signatures that are associated with breast cancer subtypes and clinical outcomes. We identified lncRNA gene signatures associated with breast cancer subtypes from The Cancer Genome Atlas (TCGA) RNA-seq data using an optimized 1-norm support vector machine (SVM) feature selection algorithm. We evaluated the prognostic performance of these gene signatures with a semi-supervised principal component (superPC) method. Results: Although lncRNAs can independently predict breast cancer subtypes with satisfactory accuracy, a combined gene signature including both coding and non-coding genes gives the best clinically relevant prediction performance. We highlighted eight potential biomarkers (three from coding genes and five from non-coding genes) that are significantly associated with survival outcomes. Conclusion: Our proposed methods are a novel means of identifying subtype-specific coding and non-coding potential biomarkers that are both clinically relevant and biologically significant.
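
To illustrate why a 1-norm penalty can act as a feature selector, the following is a minimal sketch rather than the authors' optimized algorithm. It assumes a binary problem (the study itself distinguishes four subtypes), synthetic data in place of TCGA expression values, and a simple proximal subgradient solver; the names fit_l1_svm and soft_threshold and the values of lam, eta, and n_iter are hypothetical choices made only for this example.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal map of the L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_svm(X, y, lam=0.4, eta=0.1, n_iter=200):
    """Minimise mean hinge loss + lam * ||w||_1 by proximal subgradient descent.

    X is a list of feature rows and y holds labels in {-1, +1}.
    No bias term is fitted: the toy features below are roughly centred.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1.0:  # only margin violators contribute to the subgradient
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        # Gradient step on the hinge loss, then shrinkage for the L1 term.
        w = [soft_threshold(wj - eta * gj, eta * lam) for wj, gj in zip(w, grad)]
    return w


# Synthetic stand-in for an expression matrix (an assumption, not TCGA data):
# "genes" 0 and 1 shift with the class label, the other eight are pure noise.
rng = random.Random(0)
n_samples, n_genes, informative = 200, 10, (0, 1)
y = [1 if i % 2 == 0 else -1 for i in range(n_samples)]
X = [
    [(2.0 * yi if j in informative else 0.0) + rng.gauss(0.0, 1.0) for j in range(n_genes)]
    for yi in y
]

weights = fit_l1_svm(X, y)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("selected gene indices:", selected)
```

Because the 1-norm penalty tends to set the weights of uninformative features to exactly zero, the indices with non-zero weight can be read as a candidate signature. In practice an established solver would likely be preferable, for example scikit-learn's LinearSVC with penalty="l1" and dual=False.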

Highlights

  • Targeted therapies contribute significantly to personalized approaches for the treatment of breast cancer, one of the most aggressive and prevalent diseases in women [1]. In 2017, an estimated 252,710 new cases of invasive breast cancer were expected to be diagnosed in women in the U.S., along with 63,410 new cases of non-invasive breast cancer [2].

  • The prediction accuracies for distinguishing subtypes were evaluated by 10-fold cross-validation using a 2-norm support vector machine (SVM) classifier.

  • Since the PAM50 gene signature, which was initially derived from a microarray assay, has been widely used in clinical practice for breast cancer subtype prediction [20], we first tested our method on the PAM50 gene signature from The Cancer Genome Atlas (TCGA) microarray dataset.
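
The evaluation described in the highlights, 10-fold cross-validation with a 2-norm SVM classifier, can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the data are synthetic, the task is reduced to two classes, the folds are stratified by class (the source does not say whether stratification was used), and the helper names fit_l2_svm, stratified_folds, and cross_validate are hypothetical.

```python
import random


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wj * xj for wj, xj in zip(w, x))


def fit_l2_svm(X, y, lam=0.01, eta=0.1, n_iter=100):
    """Minimise mean hinge loss + (lam / 2) * ||w||_2^2 by subgradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        grad = [lam * wj for wj in w]  # gradient of the 2-norm penalty
        for xi, yi in zip(X, y):
            if yi * dot(w, xi) < 1.0:  # hinge term is active for this sample
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - eta * gj for wj, gj in zip(w, grad)]
    return w


def stratified_folds(y, k, rng):
    """Split sample indices into k folds with roughly equal class proportions."""
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        members = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(members)
        for position, i in enumerate(members):
            folds[position % k].append(i)
    return folds


def cross_validate(X, y, k=10, seed=0):
    """Return the held-out accuracy of each of the k folds."""
    folds = stratified_folds(y, k, random.Random(seed))
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        w = fit_l2_svm([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(1 for i in test_idx if (1 if dot(w, X[i]) >= 0 else -1) == y[i])
        accuracies.append(hits / len(test_idx))
    return accuracies


# Synthetic two-class data (an assumption): features 0 and 1 carry the signal.
rng = random.Random(1)
y = [1 if i % 2 == 0 else -1 for i in range(100)]
X = [[(2.0 * yi if j < 2 else 0.0) + rng.gauss(0.0, 1.0) for j in range(5)] for yi in y]

fold_accuracies = cross_validate(X, y)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

Each sample is held out exactly once, so the mean of the ten fold accuracies estimates how well a signature generalizes to unseen patients. A four-subtype version would need a multi-class scheme such as one-vs-rest on top of this binary classifier.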

Summary

Introduction

Targeted therapies contribute significantly to personalized approaches for the treatment of breast cancer, one of the most aggressive and prevalent diseases in women [1]. In 2017, an estimated 252,710 new cases of invasive breast cancer were expected to be diagnosed in women in the U.S., along with 63,410 new cases of non-invasive (in situ) breast cancer [2]. Current classification methods for breast cancer subtypes are limited to protein-coding genes (PCGs), even though non-coding regions occupy 98% of the whole genome and play a regulatory role for PCGs [8,9,10]. Currently available data resources and methods are likewise largely restricted to PCG-based molecular subtyping, and little is known about the roles of long non-coding RNAs (lncRNAs). The purpose of this project was to identify lncRNA gene signatures that are associated with breast cancer subtypes and clinical outcomes. Although lncRNAs can independently predict breast cancer subtypes with satisfactory accuracy, a combined gene signature including both coding and non-coding genes gives the best clinically relevant prediction performance. Our proposed methods are a novel means of identifying subtype-specific coding and non-coding potential biomarkers that are both clinically relevant and biologically significant.
