Abstract

Advances in next-generation sequencing (NGS) technologies have made many types of high-throughput genomics data available. A major challenge in studying cancer progression is to distinguish driver genes from the other variant genes by analyzing and integrating multiple types of genomics data. Breast cancer is a heterogeneous disease, and identifying subtype-specific driver genes is critical for guiding its diagnosis, prognostic assessment, and treatment. We developed an integrated framework based on gene expression profiles and copy number variation (CNV) data to identify breast cancer subtype-specific driver genes. In this framework, we employed a statistical machine-learning method to select gene subsets and a module-network analysis method to identify candidate driver genes. The final subtype-specific driver genes were obtained by pairwise comparison between subtypes. To validate the specificity of the driver genes, we used their expression data to classify patient samples with 10-fold cross-validation and performed enrichment analysis on them. The experimental results show that the proposed integrative method can identify potential driver genes, and that a classifier built on these genes performs better than classifiers built on genes identified by other methods.

Highlights

  • Breast cancer is one of the most common malignant tumors in women

  • Performance is compared with the information gain, chi-squared, and Lemon-Tree methods; (3) analyzing the biological significance of the obtained driver genes, including topology-based pathway analysis, Gene Ontology (GO) functional enrichment, KEGG

  • We introduced a module-based framework by integrating transcriptome and genomic data to identify significant driver genes in breast cancer subtypes


Summary

Introduction

Breast cancer is one of the most common malignant tumors in women and is usually associated with genetic alterations [1]. It has been categorized into five subtypes: luminal A (LumA), luminal B (LumB), HER2-enriched (HER2), basal-like (Basal), and normal-like (Normal). Previous studies have shown that each cancer subtype has its own gene imprint and tumor markers, and that genetic variation increases the risk of cancer. However, not all aberrations have the same impact on tumor progression. To understand the mechanism of cancer, identifying driver genes among genomic aberrations has become a focus of research. Gene expression profiles play an important role in understanding the pathogenesis of disease. They provide information about the activity level of genes
