Abstract

DNA methylation has been associated with transcriptional repression, and detecting differential methylation is important for understanding the underlying causes of differential gene expression. Sequencing of bisulfite-converted genomic DNA is the current gold standard for building genome-wide maps of DNA methylation at base-pair resolution. Here we systematically investigate the features underlying the detection of differential DNA methylation in CpG and non-CpG contexts, considering both mammalian systems and plants. In particular, we introduce DMRcaller, a highly efficient R/Bioconductor package that implements several methods to detect differentially methylated regions (DMRs) between two samples. Most importantly, we show that different algorithms are required to compute DMRs and that the most appropriate algorithm in each case depends on the sequence context and the level of methylation. Furthermore, we show that DMRcaller outperforms other available packages, and we propose a new method to select the parameters for this tool and for other available tools. DMRcaller is a comprehensive tool for differential methylation analysis that displays high sensitivity and specificity for the detection of DMRs and performs whole-genome analyses within a few hours.

Highlights

  • DNA methylation is one of the most common epigenetic modifications that is stably inherited and affects gene regulation [1,2]

  • Several tools are available to detect differentially methylated regions (DMRs) from bisulfite-converted DNA sequencing (BS-seq) datasets; whilst some are implemented in other programming languages (e.g. [3]) or provided through a web interface, the majority are R packages

  • We show that the best method to detect DMRs depends on the methylation context (CpG, CpHpG or CpHpH); DMRcaller implements several methods, which makes this package a comprehensive tool for differential methylation analysis

Summary

Introduction

DNA methylation is one of the most common epigenetic modifications that is stably inherited and affects gene regulation [1,2]. Given the reduction in the cost of DNA sequencing, genome-wide bisulfite-converted DNA sequencing (BS-seq) has become the method of choice for determining the distribution of methylation at genomic scale. Although this approach generates methylation information at single-base resolution for theoretically every cytosine in the genome, further analysis is frequently required to efficiently extract methylation information for entire loci, or to highlight methylation differences in regions across different conditions, cell types or genotypes. Owing to its statistical power and the libraries and packages available for bioinformatics analysis, R is the programming language of choice for analysing genomic datasets [4]. The most popular R packages used for detecting differentially methylated regions include: methylKit [6], bsseq [7], BiSeq [8], methylSig [9], DSS [10], RnBeads [11], methylPipe [12], BEAT [13] and MD3 [14].
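
To make the idea of moving from per-cytosine read counts to region-level differences concrete, the following is a minimal Python sketch. It is not DMRcaller's implementation and does not use any of the packages listed above. It assumes, for illustration only, that each cytosine is described by a (position, methylated reads, total reads) triple, that reads are pooled over fixed-size bins, and that the two samples are compared with a pooled two-proportion z-test. The function names, the bin size and the thresholds (`min_diff`, `alpha`) are hypothetical choices, and the data are invented toy values.

```python
import math
from typing import List, Tuple

# One cytosine: (position, methylated_reads, total_reads). Toy format, assumed here.
Site = Tuple[int, int, int]


def pooled_counts(sites: List[Site], start: int, end: int) -> Tuple[int, int]:
    """Sum methylated and total reads over cytosines with start <= position < end."""
    methylated = sum(m for pos, m, n in sites if start <= pos < end)
    total = sum(n for pos, m, n in sites if start <= pos < end)
    return methylated, total


def two_proportion_p_value(m1: int, n1: int, m2: int, n2: int) -> float:
    """Two-sided p-value of a pooled two-proportion z-test."""
    if n1 == 0 or n2 == 0:
        return 1.0
    p_pool = (m1 + m2) / (n1 + n2)
    variance = p_pool * (1.0 - p_pool) * (1.0 / n1 + 1.0 / n2)
    if variance == 0.0:
        return 1.0  # both samples fully methylated or fully unmethylated
    z = (m1 / n1 - m2 / n2) / math.sqrt(variance)
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def call_bins(sample1: List[Site], sample2: List[Site],
              region_start: int, region_end: int,
              bin_size: int = 100, min_diff: float = 0.4,
              alpha: float = 0.01) -> List[Tuple[int, int, float, float]]:
    """Return (start, end, difference, p_value) for bins passing both thresholds."""
    called = []
    for start in range(region_start, region_end, bin_size):
        end = min(start + bin_size, region_end)
        m1, n1 = pooled_counts(sample1, start, end)
        m2, n2 = pooled_counts(sample2, start, end)
        if n1 == 0 or n2 == 0:
            continue  # no coverage in at least one sample
        diff = m2 / n2 - m1 / n1
        p_value = two_proportion_p_value(m1, n1, m2, n2)
        if abs(diff) >= min_diff and p_value <= alpha:
            called.append((start, end, diff, p_value))
    return called


# Invented toy data: the first bin loses methylation in sample 2, the second does not.
sample1 = [(10, 18, 20), (40, 17, 20), (70, 19, 20), (110, 10, 20), (150, 9, 20)]
sample2 = [(10, 2, 20), (40, 3, 20), (70, 1, 20), (110, 10, 20), (150, 11, 20)]

dmrs = call_bins(sample1, sample2, 0, 200)
for start, end, diff, p_value in dmrs:
    print(f"bin [{start}, {end}): difference = {diff:+.2f}, p = {p_value:.2e}")
```

Pooling reads over a bin trades spatial resolution for statistical power, which matters most where per-cytosine coverage is low. Other smoothing or testing strategies are possible, and the choice between them is the kind of context-dependent decision the abstract describes.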

