Abstract

Background: The impact of cell-composition effects in analysis of DNA methylation data is now widely appreciated. With the availability of a reference data set consisting of DNA methylation measurements on isolated cell types, it is possible to impute cell proportions and adjust for them, but there is increasing interest in methods that adjust for cell composition effects when reference sets are incomplete or unavailable.

Results: In this article we present a theoretical basis for one such method, showing that the total effect of a phenotype on DNA methylation can be decomposed into orthogonal components, one representing the effect of phenotype on proportions of major cell types, the other representing either subtle effects in composition or global effects at focused loci, and that it is possible to separate these two types of effects in a finite data set. We demonstrate this principle empirically on nine DNA methylation data sets, showing that the first few principal components generally contain a majority of the information on cell-type present in the data, but that later principal components nevertheless contain information about a small number of loci that may represent more focused associations. We also present a new method for determining the number of linear terms to interpret as cell-mixture effects and demonstrate robustness to the choice of this parameter.

Conclusions: Taken together, our work demonstrates that reference-free algorithms for cell-mixture adjustment can produce biologically valid results, separating cell-mediated epigenetic effects (i.e. apparent effects arising from differences in cell composition) from those that are not cell mediated, and that in general the interpretation of associations evident from DNA methylation should be carefully considered.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0527-y) contains supplementary material, which is available to authorized users.

Highlights

  • The impact of cell-composition effects in analysis of DNA methylation data is widely appreciated

  • We argue that the total effect A of a phenotype on DNA methylation can be decomposed into two orthogonal terms: Δk, the effect of the phenotype on the distribution of major cell types, and Bk, the effect of the phenotype on subtle variants of one cell type or global effects focused at single loci

  • Although we argue in this paper that the most likely interpretation of linear associations is that of cell composition, a more general point can be made: the various interpretations assigned to phenotypic associations with DNA methylation require greater scrutiny than has often been offered in epigenome-wide association study (EWAS) reports
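
The orthogonal decomposition described in the highlights above can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that the total effect A is estimated by ordinary least squares and that the cell-composition term Δk is the projection of A onto the top k left singular vectors of the row-centered methylation matrix. The function name, the simulated data, and the choice k = 3 are hypothetical and are not taken from the article, and this is not a reproduction of the authors' algorithm.

```python
import numpy as np


def decompose_total_effect(Y, X, k):
    """Split a total effect A into two orthogonal parts (illustrative only).

    Y : (n_loci, n_samples) methylation matrix
    X : (n_samples, n_covariates) design matrix
    k : number of leading components treated as cell-mixture effects
    """
    # Total effect A (n_loci x n_covariates) from least squares: Y ~ A X^T.
    A = np.linalg.lstsq(X, Y.T, rcond=None)[0].T

    # Leading left singular vectors of the row-centered data (ASSUMPTION:
    # these stand in for the directions of major cell-type variation).
    Yc = Y - Y.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(Yc, full_matrices=False)
    Uk = U[:, :k]

    # Delta_k: part of A lying in the span of Uk; B_k: the remainder.
    Delta_k = Uk @ (Uk.T @ A)
    B_k = A - Delta_k
    return A, Delta_k, B_k


# Simulated data (hypothetical; not one of the article's nine data sets).
rng = np.random.default_rng(0)
n_loci, n_samples = 200, 50
phenotype = rng.integers(0, 2, size=n_samples)
X = np.column_stack([np.ones(n_samples), phenotype])
Y = rng.normal(size=(n_loci, n_samples))

A, Delta_k, B_k = decompose_total_effect(Y, X, k=3)
```

Because the columns of `Uk` are orthonormal, `Delta_k` and `B_k` sum to `A` and are orthogonal by construction, mirroring the A = Δk + Bk structure stated above. Whether the leading components actually capture cell composition in a given data set is an empirical question that this sketch does not address.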


Summary

Introduction

The impact of cell-composition effects in analysis of DNA methylation data is widely appreciated. DNA methylation profiles measured in blood have been shown to correlate with ovarian cancer [1], bladder cancer [2], cardiovascular disease [3], obesity [4], and environmental exposures [5,6,7]. These associations have led to an interest in epigenome-wide association studies (EWAS), which aim to investigate associations between phenotypes and DNA methylation. Normal tissue development, individual cellular differentiation, and cellular lineage determination are regulated by epigenetic mechanisms [9]. This necessarily means that DNA methylation shows substantial variation across tissue types [11] as well as across individual cell types, as demonstrated clearly among the distinct types of leukocytes [8].

