Abstract

Background: Elucidating the exact relationship between gene copy number and expression would enable identification of regulatory mechanisms of abnormal gene expression and biological pathways of regulation. Most current approaches depend either on linear correlation or on nonparametric tests of association that are insensitive to the exact shape of the relationship. Based on knowledge of enzyme kinetics and gene regulation, we would expect the functional shape of the relationship to be gene dependent and to be related to the gene regulatory mechanisms involved. Here, we propose a statistical approach to investigate and distinguish between linear and nonlinear dependences between DNA copy number alteration and mRNA expression.

Results: We applied the proposed method to DNA copy numbers derived from Illumina 109K SNP-CGH arrays (using the log R values) and expression data from Agilent 44K mRNA arrays, focusing on commonly aberrant genomic loci in a collection of 102 breast tumors. Regression analysis was used to identify the type of relationship (linear or nonlinear), and subsequent pathway analysis revealed that genes displaying a linear relationship were overall associated with substantially different biological processes than genes displaying a nonlinear relationship. In the group of genes with a linear relationship, we found significant association with canonical pathways, including purine and pyrimidine metabolism (for both deletions and amplifications) as well as estrogen metabolism (linear amplification) and BRCA-related response to damage (linear deletion). In the group of genes displaying a nonlinear relationship, the top canonical pathways were specific pathways like PTEN and PI3K/AKT (nonlinear amplification) and Wnt(B) and IL-2 signalling (nonlinear deletion). Both amplifications and deletions pointed to the same affected pathways and identified cancer as the top significant disease and cell cycle, cell signaling and cellular development as significant networks.

Conclusions: This paper presents a novel approach to assessing the validity of the dependence of expression data on copy number data, and this approach may help in identifying the drivers of carcinogenesis.

Highlights

  • The number of significantly correlated pairs was 15,654 at an FDR of 10%, 13,085 at an FDR of 5% (Figure 1), and 9,294 at an FDR of 1%. The vast majority were positive correlations, as expected, since a deletion would most likely lead to decreased mRNA expression and an amplification to increased mRNA expression

  • At an FDR of 5%, a significant linear term according to model (1) was observed for 2,004 loci among the amplified genes (Amp) and 1,350 loci among the deleted genes (Del)
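The counts above shrink as the FDR threshold tightens. The excerpt does not say which FDR procedure was used, so the sketch below assumes the Benjamini-Hochberg step-up procedure purely for illustration. The function name `benjamini_hochberg` and the p-values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return booleans marking which hypotheses are rejected at the given FDR.

    Assumption: Benjamini-Hochberg step-up procedure (not confirmed by the source).
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose ordered p-value satisfies p_(k) <= (k / m) * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Made-up p-values standing in for per-gene correlation tests.
p_values = [0.0005, 0.008, 0.039, 0.041, 0.042, 0.061, 0.074, 0.205, 0.212, 0.216]

# Number of "significant" tests at each FDR level.
counts = {fdr: sum(benjamini_hochberg(p_values, fdr)) for fdr in (0.10, 0.05, 0.01)}
print(counts)
```

With these toy values, fewer tests pass at the stricter levels, which mirrors the direction of the trend reported across the 10%, 5% and 1% thresholds.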

Summary

Introduction

Elucidating the exact relationship between gene copy number and expression would enable identification of regulatory mechanisms of abnormal gene expression and biological pathways of regulation. We propose a statistical approach to investigate and distinguish between linear and nonlinear dependences between DNA copy number alteration and mRNA expression. An early study by Hyman et al. [6] on breast cancer cell lines using aCGH reported that 44% of the highly amplified genes were over-expressed and 10.5% of the highly over-expressed genes were amplified. These genes include known oncogenes and potential therapeutic targets. Another early aCGH study on breast cancers [7] found that 62% of the highly amplified genes showed moderately or highly elevated expression. These attempts to quantify the amount of RNA that may result from an aberrant DNA locus applied linear regression to model the dependence of the expression data on the copy number data.
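One generic way to ask whether a purely linear model is adequate for a single gene is to compare a straight-line fit with a fit that adds a curvature term. The paper's own model (1) is not reproduced in this excerpt, so the sketch below is only an assumption: it uses a quadratic term as a stand-in for "nonlinear" and a nested-model F statistic. All function names and the synthetic data are hypothetical.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def poly_rss(x, y, degree):
    """Least-squares polynomial fit; returns (coefficients, residual sum of squares)."""
    k = degree + 1
    # Normal equations (X^T X) beta = X^T y, with design matrix X[i][j] = x_i ** j.
    xtx = [[sum(xi ** (r + c) for xi in x) for c in range(k)] for r in range(k)]
    xty = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(k)]
    beta = solve(xtx, xty)
    rss = sum(
        (yi - sum(b * xi ** j for j, b in enumerate(beta))) ** 2
        for xi, yi in zip(x, y)
    )
    return beta, rss


def quadratic_f_stat(x, y):
    """F statistic for adding a quadratic term to a linear fit (1 and n - 3 degrees of freedom)."""
    _, rss_lin = poly_rss(x, y, 1)
    _, rss_quad = poly_rss(x, y, 2)
    return (rss_lin - rss_quad) / (rss_quad / (len(x) - 3))


# Synthetic stand-ins: copy number (log R-like values) and a fixed noise pattern.
copy_number = [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
noise = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.02, 0.04, -0.05]

# Gene A: expression tracks copy number linearly.
expr_linear = [0.5 + 2.0 * c + e for c, e in zip(copy_number, noise)]
# Gene B: expression flattens at high copy number (concave curve).
expr_curved = [0.5 + 2.0 * c - 1.5 * c ** 2 + e for c, e in zip(copy_number, noise)]

f_linear = quadratic_f_stat(copy_number, expr_linear)
f_curved = quadratic_f_stat(copy_number, expr_curved)
print(f_linear, f_curved)
```

A large F statistic suggests that the curvature term explains variation the straight line misses. In practice the statistic would be compared against an F distribution with 1 and n - 3 degrees of freedom to obtain a p-value, which would then feed into a multiple-testing correction across genes.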
