Abstract

Background

Discovering genetic associations between genetic markers and gene expression levels can provide insight into gene regulation and, potentially, mechanisms of disease. Such analyses typically involve a linkage or association analysis in which expression data are used as phenotypes. This approach leads to a large number of multiple comparisons and may therefore lack power. We assess the potential of applying canonical correlation analysis to partitioned genomewide data as a method for discovering regulatory variants.

Methodology/Principal Findings

Simulations suggest that canonical correlation analysis has higher power than standard pairwise univariate regression to detect single nucleotide polymorphisms when the expression trait has low heritability. The increase in power is even greater under the recessive model. We demonstrate this approach using the Childhood Asthma Management Program data.

Conclusions/Significance

Our approach reduces multiple comparisons and may provide insight into the complex relationships between genotype and gene expression.

Highlights

  • The usefulness of examining associations between genetic markers and gene expression is due to the immediate and direct relationship between the gene expression phenotype and DNA sequence variation

  • Canonical correlation analysis (CCA) cannot be applied to all single nucleotide polymorphisms (SNPs) and expression probes in a genomewide association study since the number of variables is greater than the number of subjects

  • Under the alternative hypothesis of association, the power to detect a significant correlation with Bartlett’s test is compared with the power to detect the simulated association by regressing the expression quantitative trait locus of interest on the number of copies of the risk allele using Bonferroni correction to adjust for 60 pairwise tests (Table 2)
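
The pairwise comparison in the last highlight can be illustrated with a minimal sketch. It assumes a hypothetical partition of 6 SNPs and 10 expression probes (the source states only that there are 60 pairwise tests, not how they are laid out), simulated additive genotypes, and a Fisher z normal approximation in place of the exact regression t-test. The names `pearson_r`, `approx_p_value` and `bonferroni_hits` are illustrative and do not come from the paper.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Sample Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value from the Fisher z-transform (normal approximation).

    Testing the slope of a simple linear regression is equivalent to testing
    this correlation; the approximation is an assumption of the sketch.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def bonferroni_hits(snps, probes, alpha=0.05):
    """Test every SNP-probe pair against the threshold alpha / number of tests."""
    n_tests = len(snps) * len(probes)
    threshold = alpha / n_tests
    hits = []
    for i, genotype in enumerate(snps):
        for j, expression in enumerate(probes):
            p = approx_p_value(pearson_r(genotype, expression), len(genotype))
            if p < threshold:
                hits.append((i, j, p))
    return threshold, hits


# Simulated data: allele counts 0/1/2 with an assumed allele frequency of 0.3.
rng = random.Random(1)
n_subjects = 300
snps = [[(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n_subjects)]
        for _ in range(6)]
probes = [[rng.gauss(0.0, 1.0) for _ in range(n_subjects)] for _ in range(10)]
# Plant one additive association: probe 0 depends on SNP 0.
probes[0] = [e + 1.0 * g for e, g in zip(probes[0], snps[0])]

threshold, hits = bonferroni_hits(snps, probes)
print(f"per-test threshold: {threshold:.6f}")
print("significant (SNP, probe) pairs:", [(i, j) for i, j, _ in hits])
```

With 60 tests and a nominal level of 0.05, each pairwise test has to reach roughly 0.00083, which is why weak effects can be missed.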


Summary

Introduction

The usefulness of examining associations between genetic markers and gene expression is due to the immediate and direct relationship between the gene expression phenotype and DNA sequence variation. CCA finds a linear combination of the genotypes and a linear combination of the expression levels such that the correlation between the two is maximized. In its standard form, CCA cannot be applied to all SNPs and expression probes in a genomewide association study, since the number of variables is greater than the number of subjects. Two modifications of CCA have recently been proposed for use with genetic marker and gene expression data: penalized CCA [3] and sparse CCA [4]. These methods are computationally intensive and are sometimes sensitive to starting parameters. Discovering associations between genetic markers and gene expression levels can provide insight into gene regulation and, potentially, mechanisms of disease. Such analyses typically involve a linkage or association analysis in which expression data are used as phenotypes. We assess the potential of applying canonical correlation analysis to partitioned genomewide data as a method for discovering regulatory variants.
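
As a rough illustration of CCA on one small partition, the sketch below computes the canonical correlations with a QR decomposition followed by a singular value decomposition, then forms Bartlett's chi-square statistic for the hypothesis that all canonical correlations are zero. The partition size (3 SNPs, 4 probes), the sample size, the simulated effect and the function names are assumptions made for the example, not details taken from the paper, and this is not necessarily how the authors implemented their analysis.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and Y.

    Assumes both centered matrices have full column rank and that the
    number of rows exceeds the number of columns in each block.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)  # orthonormal basis for the genotype block
    Qy, _ = np.linalg.qr(Yc)  # orthonormal basis for the expression block
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against rounding slightly above 1


def bartlett_statistic(r, n, p, q):
    """Bartlett's approximate chi-square statistic and its degrees of freedom."""
    stat = -(n - 1 - (p + q + 1) / 2.0) * np.sum(np.log(1.0 - r ** 2))
    return float(stat), p * q


# Simulated partition: allele counts 0/1/2 and standard normal expression.
rng = np.random.default_rng(0)
n, p, q = 200, 3, 4
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
Y = rng.normal(size=(n, q))
Y[:, 0] += 2.0 * X[:, 0]  # planted additive effect of SNP 0 on probe 0

r = canonical_correlations(X, Y)
stat, df = bartlett_statistic(r, n, p, q)
print("canonical correlations:", np.round(r, 3))
print(f"Bartlett statistic: {stat:.1f} on {df} degrees of freedom")
```

The statistic would be compared with a chi-square distribution with `p * q` degrees of freedom, so a single test covers the whole partition instead of one test per SNP-probe pair.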

