Abstract

DNA methylation of promoter sequences is a repressive epigenetic mark that down-regulates gene expression. However, DNA methylation is more prevalent within gene-bodies than in promoters, and gene-body methylation has been observed to be positively correlated with gene expression levels. This paradox remains unexplained, and accordingly the role of DNA methylation in gene-bodies is poorly understood. We addressed the presence and role of human gene-body DNA methylation using a meta-analysis of human genome-wide methylation, expression and chromatin data sets. Methylation is associated with transcribed regions, as genic sequences have higher levels of methylation than intergenic or promoter sequences. We also find that the relationship between gene-body DNA methylation and expression levels is non-monotonic and bell-shaped. Mid-level expressed genes have the highest levels of gene-body methylation, whereas the most lowly and the most highly expressed sets of genes both have low levels of methylation. While gene-body methylation can be seen to efficiently repress the initiation of intragenic transcription, the vast majority of methylated sites within genes are not associated with intragenic promoters. In fact, highly expressed genes initiate the most intragenic transcription, which is inconsistent with the previously held notion that gene-body methylation serves to repress spurious intragenic transcription to allow for efficient transcriptional elongation. These observations lead us to propose a model to explain the presence of human gene-body methylation. This model holds that the repression of intragenic transcription by gene-body methylation is largely epiphenomenal, and suggests that gene-body methylation levels are predominantly shaped by the accessibility of the DNA to methylating enzyme complexes.
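The bell-shaped relationship described above can be illustrated with a minimal sketch: rank genes by expression, split them into equal-sized bins, and compare mean gene-body methylation across bins. This is not the authors' pipeline. The function names, the bin count and the toy values below are all assumptions made up for illustration, and no real ENCODE data are used.

```python
from statistics import mean


def mean_methylation_by_expression_bin(genes, n_bins=5):
    """Average gene-body methylation per expression-ranked bin.

    genes: list of (expression, gene_body_methylation) pairs (hypothetical input).
    Returns one mean per bin, ordered from lowest to highest expression.
    Assumes there are at least n_bins genes, so no bin is empty.
    """
    ranked = sorted(genes, key=lambda g: g[0])
    n = len(ranked)
    bin_means = []
    for i in range(n_bins):
        chunk = ranked[i * n // n_bins:(i + 1) * n // n_bins]
        bin_means.append(mean(m for _, m in chunk))
    return bin_means


def is_bell_shaped(bin_means):
    """True if values rise to a single interior peak and then fall."""
    peak = bin_means.index(max(bin_means))
    if peak == 0 or peak == len(bin_means) - 1:
        return False  # peak at an edge means the trend is monotonic-like
    rising = all(a <= b for a, b in zip(bin_means[:peak], bin_means[1:peak + 1]))
    falling = all(a >= b for a, b in zip(bin_means[peak:], bin_means[peak + 1:]))
    return rising and falling


# Toy values invented for illustration only; they are not measurements.
toy_genes = [(1, 0.20), (2, 0.25), (3, 0.45), (4, 0.50), (5, 0.70),
             (6, 0.75), (7, 0.50), (8, 0.45), (9, 0.25), (10, 0.20)]
toy_bins = mean_methylation_by_expression_bin(toy_genes, n_bins=5)
toy_is_bell = is_bell_shaped(toy_bins)
```

A simple correlation coefficient computed over such data could come out near zero or weakly positive, which is why a binned comparison is a more direct way to see a non-monotonic trend.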

Highlights

  • DNA methylation is a crucial epigenetic mark with roles in embryogenesis and differentiation [1], X-inactivation [2], imprinting [3] and repression of viral and repeat sequences [4]

  • We made use of four datasets from the ENCODE project: 1) DNA methylation data generated by Reduced Representation Bisulfite Sequencing (RRBS) [26], 2) gene expression data generated from human exon microarrays [27, 28], 3) RNA polymerase II (Pol2) binding locations generated by ChIP-Seq [31, 33, 34, 35, 36] and 4) the genomic locations of DNaseI hypersensitive sites (DHSS) generated by the digital DNaseI technique [32]

  • All five of these datasets were available for three cell-lines (GM12878, K562 and HepG2), which together form the primary focus of the study, and different subsets of the same five datasets were available for three additional cell-lines (HeLa-S3, H1hESC and HUVEC) (Table 1)

Summary

Introduction

DNA methylation is a crucial epigenetic mark with roles in embryogenesis and differentiation [1], X-inactivation [2], imprinting [3] and repression of viral and repeat sequences [4]. One long-established role of DNA methylation in promoter regions is the repression of transcription [1, 8, 9]. DNA methylation in gene bodies is surprisingly abundant and has been reported to show a positive correlation with gene expression [10, 11, 12, 13, 14, 15], even though it can interfere with transcription elongation [16]. The apparent contradiction between the activities of DNA methylation in promoters versus gene bodies has been referred to as the DNA methylation paradox [17]. We address this paradox in an effort to better understand the presence and role of DNA methylation in human gene bodies.
