Abstract

Background: One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development. Many valuable studies lack genomic data to detect CNV; therefore, methods that are able to infer CNVs from gene expression data would help maximize the value of these studies.

Results: We developed a framework for identifying recurrent regions of CNV and distinguishing the cancer driver genes from the passenger genes in the regions. By inferring CNV regions across many datasets, we were able to identify 109 recurrent amplified/deleted CNV regions. Many of these regions are enriched for genes involved in important processes associated with tumorigenesis and cancer progression. Genes in these recurrent CNV regions were then examined in the context of gene regulatory networks to prioritize putative cancer driver genes. The cancer driver genes uncovered by the framework include not only well-known oncogenes but also a number of novel cancer susceptibility genes validated via siRNA experiments.

Conclusions: To our knowledge, this is the first effort to systematically identify and validate drivers for expression-based CNV regions in breast cancer. The framework, which couples wavelet analysis of copy number alteration based on expression with gene regulatory network analysis, provides a blueprint for leveraging genomic data to identify key regulatory components and gene targets. This integrative approach can be applied to many other large-scale gene expression studies and to other novel types of cancer data, such as next-generation sequencing based expression (RNA-Seq) and CNV data.

Highlights

  • One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development

  • Performance comparison of the wavelet-based ACE algorithm (WACE) and GACE: The original ACE approach for identifying amplified or deleted chromosome regions used a simple Gaussian transform to smooth the data and identified significantly abnormal regions comprising over- or under-expressed genes via a permutation test [13]. Although this approach, named GACE, was able to narrow down genes whose expression might be affected by the local CNV, it often systematically overestimated the size of the identified regions, which were typically rearrangements of small sequences

  • We summarize the findings: (i) WACE uncovered almost three times as many expression-inferred CNV (ICNV) regions overlapping with the aCGH ICNV regions as GACE did, and (ii) these two sets of regions were better correlated with each other when identified by WACE than when identified by GACE
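The GACE-style procedure described in the highlights (Gaussian smoothing of expression along the chromosome, followed by a permutation test) can be illustrated with a minimal sketch. This is not the authors' implementation: the per-gene scores, the kernel width `sigma`, the maximum-statistic null distribution, the threshold `alpha`, and all function names are assumptions chosen only for illustration.

```python
import math
import random


def gaussian_smooth(scores, sigma=2.0):
    """Smooth per-gene scores (ordered by chromosomal position) with a Gaussian kernel."""
    radius = int(3 * sigma)
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    n = len(scores)
    smoothed = []
    for i in range(n):
        total, norm = 0.0, 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < n:  # renormalize the kernel at the chromosome ends
                total += w * scores[j]
                norm += w
        smoothed.append(total / norm)
    return smoothed


def permutation_pvalues(scores, sigma=2.0, n_perm=200, seed=0):
    """Empirical p-value per gene against a null built by shuffling gene order.

    Assumption: the null statistic is the maximum absolute smoothed score
    of each shuffled profile (a conservative, hypothetical choice).
    """
    rng = random.Random(seed)
    observed = gaussian_smooth(scores, sigma)
    shuffled = list(scores)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_max.append(max(abs(v) for v in gaussian_smooth(shuffled, sigma)))
    pvals = [
        (1 + sum(m >= abs(o) for m in null_max)) / (n_perm + 1)
        for o in observed
    ]
    return observed, pvals


def call_regions(observed, pvals, alpha=0.05):
    """Merge adjacent significant genes of the same sign into (start, end, label) regions."""
    regions = []
    start, label = None, None
    for i, (o, p) in enumerate(zip(observed, pvals)):
        current = None
        if p <= alpha:
            current = "amplified" if o > 0 else "deleted"
        if current != label:
            if label is not None:
                regions.append((start, i - 1, label))
            start, label = i, current
    if label is not None:
        regions.append((start, len(observed) - 1, label))
    return regions


# Toy profile (synthetic, not real data): 60 genes, genes 20-29 over-expressed.
rng = random.Random(1)
scores = [rng.gauss(0, 0.1) + (2.0 if 20 <= i < 30 else 0.0) for i in range(60)]
observed, pvals = permutation_pvalues(scores)
regions = call_regions(observed, pvals)
print(regions)
```

Because the Gaussian kernel spreads signal into neighboring genes, the width of the called region depends on `sigma`, which is one intuition for why a fixed-width smoother can misjudge the size of small altered regions.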


Summary

Introduction

One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development. Although extensive gene expression studies have been conducted to identify tumor signature genes associated with poor outcome [5,6], the reproducibility of these signatures is low [7,8]. Genome-wide DNA CNV has been increasingly used for identifying biomarkers and targets in cancer research [9,10,11]. The Analysis of Copy number alteration by Expression (ACE) algorithm was developed to identify amplified or deleted chromosome regions based on gene expression data [13]. While this approach demonstrated the utility of leveraging expression data to identify candidate CNV regions and genes whose expression might be affected by the candidate CNV, the identified regions were often large and harbored many genes. No objective mechanism was employed to distinguish cancer driver genes from passenger genes within a putative CNV region.

