Abstract

Cancer genomes frequently harbor copy number aberration (CNA) events, some of which are believed to be tumorigenic. While copy numbers can be detected by a number of technologies, e.g., SNP arrays, their relationship with gene expression is not well understood. Here, we describe an approach to visualize the global relationship between copy numbers and gene expression using expression microarrays. We mapped the gene expression signals detected by microarray probesets onto a reference human genome, the RefSeq, based on their annotated physical positions, resulting in a landscape that we call an expressogram. To study expressograms under various conditions and their relations with cytogenetic events such as CNAs, we obtained three classes of array samples: samples of a cancer (e.g., liver cancer), normal samples of the same tissue, and normal samples of other tissues. We developed a Bayesian-based algorithm to estimate a background signal for the cancer samples from the latter two sources. By subtracting the estimated background from the raw signals of the cancer samples and subjecting the differences to a kernel-based smoothing scheme, we produced an expressogram that shows strong consistency with the copy numbers. This indicates that copy numbers are, on average, positively correlated with gene expression and have a strong impact on it. To further explore the applicability of these findings, we submitted the expressograms to GISTIC, an algorithm for detecting significant CNAs. The results strongly indicate that expressograms can also be used to infer copy number events and significant regions of CNA-affected dysregulation.
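The background subtraction and kernel-based smoothing steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian kernel, the `bandwidth` default, the function name `smooth_expressogram`, and the restriction to probes on a single chromosome are all assumptions made for illustration. The background is taken as a given input, because the Bayesian estimator is not specified in this summary.

```python
import math


def smooth_expressogram(positions, cancer, background, bandwidth=1e6):
    """Subtract a background signal and smooth the differences along the genome.

    positions  : probe positions (bp) on ONE chromosome (assumption)
    cancer     : raw expression signal of the cancer sample at each probe
    background : estimated background signal at each probe (given as input;
                 the Bayesian estimation step is not reproduced here)
    bandwidth  : Gaussian kernel width in bp (hypothetical default)
    """
    if not (len(positions) == len(cancer) == len(background)):
        raise ValueError("positions, cancer and background must have equal length")

    # Step 1: background-subtracted signal at each probe.
    diffs = [c - b for c, b in zip(cancer, background)]

    # Step 2: Nadaraya-Watson smoothing with a Gaussian kernel (assumed kernel).
    smoothed = []
    for x in positions:
        weights = [math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions]
        total = sum(weights)  # always >= 1, since the probe weights itself by 1
        smoothed.append(sum(w * d for w, d in zip(weights, diffs)) / total)
    return smoothed


# Illustrative, made-up values: three nearby probes and one distant probe.
positions = [1_000_000, 1_500_000, 2_000_000, 50_000_000]
cancer = [9.0, 9.4, 9.2, 7.0]
background = [8.0, 8.0, 8.0, 8.0]
expressogram = smooth_expressogram(positions, cancer, background)
```

In this sketch, neighboring probes pull each other toward a local average, while an isolated probe keeps its own background-subtracted value. A larger `bandwidth` would emphasize broad, segment-level shifts of the kind expected from CNAs, at the cost of blurring short events.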

Highlights

  • The copy number of each gene in normal somatic chromosomes is assumed to be two, i.e., one copy from the father and the other from the mother

  • The results strongly indicate that expressograms can be used to infer copy number events and significant regions of copy number aberration (CNA)-affected dysregulation

  • We have described a novel visualization of gene expression in cancer genomes


Summary

Introduction

The copy number of each gene in normal somatic chromosomes is assumed to be two, i.e., one copy from the father and the other from the mother. While an individual copy number change may cause a gene to be either up- or down-regulated [6], some studies [7] suggest that copy numbers do positively affect gene expression. If the latter holds in general, we may be able to visualize the gene expression landscape of a sample or a group of samples, which we call the expressogram, with respect to its cytogenetic profile, i.e., the genome-wide copy number measurements. Conventional CNA inference is mostly based on array CGH, SNP arrays, etc., but some of these methods suffer from errors [9]. This visualization technique may serve as an independent source of measurements to help confirm that certain regions are real CNAs. Furthermore, instead of relying on SNP arrays to detect recurrent CNAs in the search for potential cancer-causing events, the expressogram signals may be used to search for genes that are directly and recurrently affected by copy numbers. The following sections discuss the algorithms and results of this approach.

Algorithm
Background Estimation
Subtracting Background and Smoothing the Signals
Results
Conclusions
