Abstract
Background: Identification of differentially methylated regions (DMRs) is the initial step towards the study of DNA methylation-mediated gene regulation. Previous approaches to calling DMRs suffer from false predictions, use excessive computing resources, and/or require library installation and input conversion.
Results: We developed a new approach called Defiant to identify DMRs. Employing Weighted Welch Expansion (WWE), Defiant showed superior performance to other predictors in a series of benchmarking tests on artificial and real data. Defiant was subsequently used to investigate DNA methylation changes in iron-deficient rat hippocampus. Defiant identified DMRs close to genes associated with neuronal development and plasticity, which were not identified by its competitor. Importantly, Defiant runs 5 to 479 times faster than currently available software packages. Defiant also accepts 10 different input formats widely used for DNA methylation data.
Conclusions: Defiant effectively identifies DMRs for whole-genome bisulfite sequencing (WGBS), reduced-representation bisulfite sequencing (RRBS), Tet-assisted bisulfite sequencing (TAB-seq), and HpaII tiny fragment enrichment by ligation-mediated PCR-tag (HELP) assays.
Highlights
Identification of differentially methylated regions (DMRs) is the initial step towards the study of DNA methylation-mediated gene regulation
Defiant is designed to make DMR calling easy and fast to run while maintaining prediction performance
For a rigorous test that weights DNA methylation by coverage, we used Welch’s t-test
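As a rough illustration of what a coverage-weighted Welch's t-test can look like, the Python sketch below computes a weighted mean and variance per group and then the Welch statistic with Welch–Satterthwaite degrees of freedom. It is not Defiant's implementation: the function names, the use of read coverage directly as the weight, and the choice of variance estimator are all assumptions made for illustration.

```python
from math import sqrt


def weighted_mean_var(values, weights):
    """Coverage-weighted mean and variance of methylation levels in one group.

    values  : methylation level of each replicate (e.g. percent methylated)
    weights : weight of each replicate (assumed here to be read coverage)
    """
    total = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total
    # Assumption: unbiased variance for reliability weights.
    # The estimator used by Defiant may differ.
    sum_sq = sum(w * w for w in weights)
    denom = total - sum_sq / total
    var = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / denom
    return mean, var


def weighted_welch(values_a, weights_a, values_b, weights_b):
    """Return (t, df) for a Welch's t-test on weighted means and variances."""
    mean_a, var_a = weighted_mean_var(values_a, weights_a)
    mean_b, var_b = weighted_mean_var(values_b, weights_b)
    n_a, n_b = len(values_a), len(values_b)
    # Squared standard error contributed by each group.
    se_a, se_b = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / sqrt(se_a + se_b)
    # Welch–Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df
```

With equal weights this reduces to the ordinary Welch's t-test. A two-sided p-value follows from Student's t distribution with `df` degrees of freedom, for example `2 * scipy.stats.t.sf(abs(t), df)` if SciPy is available. The sketch does not handle groups with zero variance or fewer than two replicates.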
Summary
DMR identification by weighted Welch expansion: Defiant calculates a p-value using Welch’s t-test for the weighted means and variance (Method). We compared precision (TP/(TP + FP)) against FN as well as against recall (TP/(TP + FN)) for 8 sets of RRBS and WGBS data (Fig. 3). In these tests, both Defiant and Metilene showed excellent precision and recall with very low FNs compared with other DMR callers. The performance of Defiant and Metilene was comparable when we tested for statistical differences between their predictions using Welch’s t-test [24] (p = 0.81, Additional file 1: Table S2). Compared with the artificial datasets, whose coverage is distributed over a narrow range, the WGBS data from rat hippocampus had a wider range of coverage (Fig. 7c). These results indicate a superior performance of Defiant when applied to real WGBS data. In terms of memory usage, Defiant used slightly more memory than Metilene because Defiant is designed to run in a single step.
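The precision and recall definitions quoted above can be written as two small helpers. This is a minimal sketch: the function names and the counts in the usage lines are hypothetical and are not results from the paper, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP): fraction of called DMRs that are true."""
    called = tp + fp
    # Assumed convention: report 0.0 when nothing was called.
    return tp / called if called else 0.0


def recall(tp, fn):
    """Recall = TP / (TP + FN): fraction of true DMRs that were called."""
    actual = tp + fn
    # Assumed convention: report 0.0 when there are no true DMRs.
    return tp / actual if actual else 0.0


# Hypothetical counts, for illustration only.
example_precision = precision(tp=90, fp=10)
example_recall = recall(tp=90, fn=30)
```

A caller with few FNs has high recall, and one with few FPs has high precision, which is why the two measures are reported together.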