Abstract
Background
Methylation of cytosine bases in DNA is a critical epigenetic mark in many eukaryotes and has also been implicated in the development and progression of normal and diseased cells. Profiling DNA methylation across the genome is therefore vital to understanding the effects of epigenetic modifications. In recent years, the Illumina HumanMethylation450 (HM450K) and MethylationEPIC (EPIC) BeadChips have been widely used to profile DNA methylation in human samples. Methods that predict the methylation states of DNA regions from microarray methylation datasets are critical for enabling genome-wide analyses.
Result
We report a computational approach based on a two-layer, two-state hidden Markov model (HMM) to identify the methylation states of single CpG sites and DNA regions in HM450K and EPIC BeadChip data. Using this method, all CpGs detected by HM450K and EPIC in the H1-hESC and GM12878 cell lines are assigned to unmethylated, middle-methylated, or full-methylated states. A large number of DNA regions are segmented into the same three methylation states. Comparison of the identified regions with whole genome bisulfite sequencing (WGBS) datasets segmented by MethylSeekR supports the validity of our method. Genome-wide maps of chromatin states show that methylation state is inversely correlated with active histone marks. Genes regulated by unmethylated regions are expressed, whereas genes regulated by full-methylated regions are repressed. These results illustrate that our method is useful and robust.
Conclusion
Our method is valuable for genome-wide DNA methylation analyses. It focuses on identifying DNA methylation states in microarray methylation datasets. Tailored to the features of array datasets, it uses a two-layer, two-state HMM to identify methylation states at CpG sites and in regions. Because it takes into account the genome-wide distribution of methylation levels, it is more reasonable than segmentation with a fixed threshold.
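The abstract does not spell out how the two layers of the two-state HMM are arranged. A minimal sketch of one plausible reading is given here in Python: a first two-state HMM separates unmethylated sites from the rest, and a second two-state HMM splits the remaining sites into middle-methylated and full-methylated. The layering, the Gaussian emission model, the hand-picked means and standard deviations, the `stay_prob` transition value, and the function names are all assumptions made for illustration, not the authors' implementation. A real analysis would presumably estimate these parameters from the genome-wide distribution of methylation levels.

```python
import math


def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_two_state(values, means, sds, stay_prob=0.9):
    """Most likely 0/1 state path for a two-state HMM with Gaussian emissions.

    Assumes a uniform initial distribution and a symmetric transition
    matrix in which each state persists with probability stay_prob.
    """
    if not values:
        return []
    log_stay = math.log(stay_prob)
    log_switch = math.log(1 - stay_prob)
    score = [math.log(0.5) + gaussian_logpdf(values[0], means[s], sds[s])
             for s in (0, 1)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in values[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[p] + (log_stay if p == s else log_switch)
                     for p in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + gaussian_logpdf(x, means[s], sds[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def three_state_calls(betas):
    """Label each beta value 'U', 'M', or 'F' using two stacked two-state HMMs."""
    # Layer 1 (assumed parameters): unmethylated (0) vs. methylated (1).
    layer1 = viterbi_two_state(betas, means=(0.1, 0.7), sds=(0.1, 0.25))
    # Layer 2 (assumed parameters): middle (0) vs. full (1), run only on
    # the sites that layer 1 called methylated. Treating those sites as one
    # contiguous sequence is a simplification.
    idx = [i for i, s in enumerate(layer1) if s == 1]
    layer2 = viterbi_two_state([betas[i] for i in idx],
                               means=(0.5, 0.9), sds=(0.15, 0.08))
    calls = ["U"] * len(betas)
    for i, s in zip(idx, layer2):
        calls[i] = "F" if s == 1 else "M"
    return calls


if __name__ == "__main__":
    betas = [0.02, 0.05, 0.03, 0.5, 0.55, 0.45, 0.95, 0.92, 0.97]
    print(three_state_calls(betas))
```

Because neighbouring sites share information through the transition probabilities, an isolated noisy value is less likely to flip the call than it would be under a fixed per-site threshold, which is the advantage the Conclusion claims for an HMM-based segmentation.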
Highlights
Methylation of cytosine bases in DNA is a critical epigenetic mark in many eukaryotes and has been implicated in the development and progression of normal and diseased cells
Data description and preprocessing: The DNA methylation datasets generated with the Illumina HumanMethylation450 BeadChip (HM450K) array, the MethylationEPIC BeadChip (EPIC) array and whole genome bisulfite sequencing (WGBS) were downloaded from the Encyclopedia of DNA Elements (ENCODE) project and the GEO DataSets database
In the H1-hESC cell line, the identified unmethylated sites (UMS) account for 37%, more than in GM12878 (HM450K: 36.74%, EPIC: 31.67%), and the identified middle-methylated sites (MMS) account for 13.45%, less than in GM12878 (HM450K: 38.93%, EPIC: 41.19%)
Summary
Methylation of cytosine bases in DNA is a critical epigenetic mark in many eukaryotes and has been implicated in the development and progression of normal and diseased cells. Profiling DNA methylation across the genome is vital to understanding the effects of epigenetic modifications. Methylation of DNA cytosine residues at carbon 5 (5meC), a common epigenetic mark in many eukaryotes, is often found in the CpG and CpHpG (H = A, T, C) sequence contexts. Interactions between transcription factors (TFs) and methylated DNA are considered to play an important role in regulating gene expression [1,2,3,4]. DNA methylation of gene regulatory elements, such as promoters and enhancers, is generally considered to be incompatible with active gene expression [5, 6]. DNA methylation, a popular research area in gene regulation, is also considered to be involved in the pathogenesis of a number of tumors [10].