Abstract

Inter-individual variation in gene expression levels can arise as an effect of variation in DNA markers. When associating multiple gene expression variables with multiple DNA marker variables, multivariate techniques, such as canonical correlation analysis, should be used to deal with the effect of co-regulating genes. We adapted the elastic net, a penalized approach proposed for variable selection in a regression context, to canonical correlation analysis. The number of variables within each canonical component could be greatly reduced without too much loss of information, so the canonical components become easier to interpret. Another advantage is that the method groups co-regulating genes, so that they end up in the same canonical components. Furthermore, our adaptation works well in situations where the number of variables greatly exceeds the number of subjects.

Highlights

  • Inter-individual variation in gene expression is due to differences in experimental, environmental, and biological factors

  • In this paper we describe the use of a newly developed canonical correlation analysis (CCA) to estimate the association between gene expression variables and DNA marker variables, in which we employed the elastic net [1] to simplify interpretation of the CCA components

  • Interset correlations between single gene expression variables and the dummy variables of single single-nucleotide polymorphisms (SNPs) ranged from -0.56 to 0.46, so the dummy variables of single SNPs did show some effect on the gene expression levels
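
As an illustration of the last highlight, the sketch below shows one way SNP dummy variables and interset correlations could be computed. It is a minimal example under stated assumptions, not the coding used in the paper: genotypes are assumed to be stored as minor-allele counts (0, 1, 2), each SNP is assumed to be expanded into two indicator columns, and the function names `snp_dummies` and `interset_correlations` are hypothetical.

```python
import numpy as np

def snp_dummies(genotypes):
    """Expand one SNP, coded 0/1/2 (assumed coding), into two indicator columns.

    Column 0 flags heterozygous subjects (1); column 1 flags subjects that are
    homozygous for the minor allele (2). Genotype 0 is the reference category.
    """
    g = np.asarray(genotypes)
    return np.column_stack([(g == 1).astype(float), (g == 2).astype(float)])

def interset_correlations(expr, dummies):
    """Pearson correlation of every expression column with every dummy column.

    expr is n x p and dummies is n x q; the result is p x q. A constant column
    in either set has zero variance and would yield NaN for that entry.
    """
    Xc = expr - expr.mean(axis=0)
    Dc = dummies - dummies.mean(axis=0)
    norms = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Dc, axis=0))
    return (Xc.T @ Dc) / norms

# Toy data for six subjects: one SNP and one gene expression variable.
genotypes = [0, 1, 2, 1, 0, 2]
expression = np.array([[1.0], [2.0], [3.5], [2.2], [0.9], [3.1]])
dummies = snp_dummies(genotypes)
corr = interset_correlations(expression, dummies)
```

Each entry of `corr` relates one expression variable to one dummy variable, which is the kind of pairwise quantity the highlight reports.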


Summary

Introduction

Inter-individual variation in gene expression is due to differences in experimental, environmental, and biological factors. Although many details are still unknown, molecular research has shown that the expression of genes is regulated by the expression of many other genes in a sometimes highly complex network. This means that the expression of genes should not be analyzed separately, and if the association of gene expression with DNA markers is estimated, this should be done jointly, using multivariate techniques such as canonical correlation analysis. One of the aims of these techniques is to explain the variation of many genes by a much smaller set of components that are sometimes called latent genes; these may coincide with regulatory networks. These components are weighted combinations of the original gene expression variables or the DNA marker variables, and these weights are inspected to interpret the components/latent genes.
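
To make the idea of components as weighted combinations concrete, the following sketch computes one pair of sparse canonical weight vectors. It is a simplified illustration, not necessarily the authors' algorithm: it alternates univariate soft-thresholding of the sample cross-correlation matrix (the form the elastic net takes when its ridge penalty is very large) and treats the within-set covariance as the identity. The function name `sparse_cca_pair` and the penalty parameters `lam_x` and `lam_y` are assumptions introduced for this example.

```python
import numpy as np

def standardize(M):
    """Center each column and scale it to unit variance."""
    M = np.asarray(M, dtype=float)
    return (M - M.mean(axis=0)) / M.std(axis=0)

def soft_threshold(z, lam):
    """Shrink entries toward zero by lam; entries with |z| <= lam become 0."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def unit_norm(w):
    """Rescale w to length 1, failing loudly if the penalty removed everything."""
    norm = np.linalg.norm(w)
    if norm == 0.0:
        raise ValueError("penalty removed every variable; use a smaller value")
    return w / norm

def sparse_cca_pair(X, Y, lam_x, lam_y, n_iter=200, tol=1e-10):
    """Return sparse weights (u, v) and the canonical variates (X u, Y v).

    X is n x p (e.g. gene expression), Y is n x q (e.g. DNA marker dummies).
    Larger lam_x / lam_y set more weights exactly to zero.
    """
    X, Y = standardize(X), standardize(Y)
    n = X.shape[0]
    C = X.T @ Y / n  # p x q matrix of sample correlations between the two sets
    # Start from the leading right singular vector of the cross-correlations.
    _, _, vt = np.linalg.svd(C, full_matrices=False)
    v = vt[0]
    u = np.zeros(X.shape[1])
    for _ in range(n_iter):
        u_new = unit_norm(soft_threshold(C @ v, lam_x))    # update X weights
        v = unit_norm(soft_threshold(C.T @ u_new, lam_y))  # update Y weights
        converged = np.linalg.norm(u_new - u) < tol
        u = u_new
        if converged:
            break
    return u, v, X @ u, Y @ v

# Simulated example with more variables than subjects: one latent factor
# drives the first five columns of each set; all other columns are noise.
rng = np.random.default_rng(0)
n, p, q = 60, 80, 70
z = rng.normal(size=n)
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))
X[:, :5] = z[:, None] + 0.3 * rng.normal(size=(n, 5))
Y[:, :5] = z[:, None] + 0.3 * rng.normal(size=(n, 5))
u, v, xi, omega = sparse_cca_pair(X, Y, lam_x=0.8, lam_y=0.8)
```

The nonzero entries of `u` and `v` indicate which variables enter the component, so only those weights need to be inspected when interpreting it. In practice the penalties would be tuned, for example by cross-validation, rather than fixed as they are here.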
