Abstract

CpG islands (CGIs) are vertebrate genomic landmarks that encompass the promoters of most genes and often lack DNA methylation. Their apparent importance is called into question, however, by reports that the number of CGIs varies widely between species and that many do not co-localise with annotated promoters. We set out to quantify the number of CGIs in mouse and human genomes using CXXC Affinity Purification plus deep sequencing (CAP-seq). We also asked whether CGIs not associated with annotated transcripts share properties with those at known promoters. We found that, contrary to previous estimates, CGI abundance in humans and mice is very similar and many are at conserved locations relative to genes. In each species CpG density correlates positively with the degree of H3K4 trimethylation, supporting the hypothesis that these two properties are mechanistically interdependent. Approximately half of mammalian CGIs (>10,000) are “orphans” that are not associated with annotated promoters. Many orphan CGIs show evidence of transcriptional initiation and dynamic expression during development. Unlike CGIs at known promoters, orphan CGIs are frequently subject to DNA methylation during development, and this is accompanied by loss of their active promoter features. In colorectal tumors, however, orphan CGIs are not preferentially methylated, suggesting that cancer does not recapitulate a developmental program. Human and mouse genomes have similar numbers of CGIs, over half of which are remote from known promoters. Orphan CGIs nevertheless have the characteristics of functional promoters, though they are much more likely than promoter CGIs to become methylated during development and hence lose these properties. The data indicate that orphan CGIs correspond to previously undetected promoters whose transcriptional activity may play a functional role during development.

Highlights

  • In the decade since the human genome sequence was published [1,2], annotation of its landmarks and functional domains has been a priority

  • To check this observation we comprehensively mapped all human and mouse CpG islands (CGIs) using CXXC Affinity Purification (CAP) to enrich for DNA fragments containing clusters of unmethylated CpGs, in conjunction with high throughput sequencing (CAP-seq) [7]

  • Initial CXXC Affinity Purification plus deep sequencing (CAP-seq) analysis appeared to confirm the lower number of CGIs in mice, but closer examination of syntenic chromosomal regions indicated that the mouse harboured CpG-rich regions that were not efficiently recovered under our CAP conditions (Figure S1A)


Introduction

In the decade since the human genome sequence was published [1,2], annotation of its landmarks and functional domains has been a priority. There are DNA sequence categories of likely functional importance, including non-coding transcription units, conserved elements and regions of variant base composition, whose biological significance is not well understood. Into the latter category fall CpG islands (CGIs), which comprise about 1% of the genome and display an elevated G+C base composition spanning approximately 1000 base pairs. Clustering of unmethylated CpGs has allowed the CGIs to be biochemically isolated as a relatively homogeneous fraction of DNA [3,7] or chromatin [8].
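To make the two sequence properties mentioned above concrete, the following minimal Python sketch computes G+C base composition and CpG density for a DNA string. It is purely illustrative and is not the CAP-seq procedure, which identifies CGIs biochemically. The function names, the per-100-bp normalisation and the toy sequence are assumptions introduced here, not taken from the paper.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (hypothetical helper)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_density(seq: str) -> float:
    """CpG dinucleotides per 100 bp (normalisation chosen for illustration)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every CpG.
    return 100.0 * seq.count("CG") / len(seq)


if __name__ == "__main__":
    toy = "CGCGATAT"  # made-up sequence, not real genomic data
    print(gc_fraction(toy), cpg_density(toy))
```

A sequence-based measure like this says nothing about methylation status, which is why an affinity-based approach that selects for unmethylated CpG clusters gives complementary information.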
