Abstract
Background: DNA methylation is an epigenetic modification that plays important roles in gene regulation. Whole-genome bisulfite sequencing and reduced representation bisulfite sequencing make DNA methylation measurable at single-CpG resolution. The main interest in studying DNA methylation data is to test for a methylation difference between two conditions of biological samples. However, the high cost and complexity of this sequencing experiment limit the number of biological replicates, which poses challenges for the development of statistical methods.
Results: Bayesian modeling is well known to be able to borrow strength across the genome, and hence is a powerful tool for high-dimensional, low-sample-size data. In order to provide accurate identification of methylation loci, especially for low-coverage data, we propose a full Bayesian partition model to detect differentially methylated loci under two conditions of scientific study. Since hypo-methylation and hyper-methylation have distinct biological implications, it is desirable to differentiate these two types of differential methylation. The advantage of our Bayesian model is that it classifies each locus as equal-, hypo- or hyper-methylated in one step, without further post-hoc analysis. An R package named MethyBayes implementing the proposed full Bayesian partition model will be submitted to the Bioconductor website upon publication of the manuscript.
Conclusions: Based on simulation studies and real data analysis, including bioinformatics analysis, the proposed full Bayesian partition model outperforms existing methods in terms of power while maintaining a low false discovery rate.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0850-3) contains supplementary material, which is available to authorized users.
Summary
DNA methylation is the methylation of cytosine residues at CpG dinucleotides in a DNA sequence and affects 70–80 % of all CpG dinucleotides in mammals [1]. It is the most widely studied epigenetic modification and is known to have profound effects on gene expression. It is involved in embryogenesis, genomic imprinting [2], X-chromosome inactivation [3], and many diseases [4]. Among the methods developed to quantify the (relative) levels of CpG methylation in the whole genome, bisulfite sequencing is a common technique with its own advantages. This method involves treating DNA with sodium bisulfite [6], which converts unmethylated cytosines to uracil and leaves methylated cytosines unchanged. It can provide methylation levels at single-nucleotide resolution.
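To make the bisulfite readout concrete, here is a minimal Python sketch of the conversion described above. It is an illustration only and not part of the MethyBayes package: the function names, the toy reference sequence, and the four simulated molecules are all hypothetical. It assumes the usual readout in which uracil is amplified as thymine, so an unmethylated cytosine appears as T in a read and a methylated cytosine stays C.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment plus amplification on one DNA strand.

    Unmethylated C is read as T (C -> U -> T after amplification);
    methylated C stays C. Positions are 0-based indices into seq.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def methylation_level(reads, position):
    """Fraction of reads showing C (methylated) at a reference cytosine."""
    calls = [r[position] for r in reads if r[position] in "CT"]
    if not calls:
        return None  # no coverage at this site
    return calls.count("C") / len(calls)


# Hypothetical toy reference with CpG cytosines at indices 1 and 5.
reference = "ACGTTCGA"
# Hypothetical sample: four molecules, each with its own methylated positions.
molecules = [[1, 5], [1], [1], []]
reads = [bisulfite_convert(reference, m) for m in molecules]

for site in (1, 5):
    print(site, methylation_level(reads, site))
```

With only four reads per site, the estimated level can take just five possible values, which illustrates why low-coverage data with few replicates is statistically challenging.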