Abstract
Studies in epigenetics have shown that DNA methylation is a key factor in regulating gene expression. Aberrant DNA methylation is often associated with DNA instability, which could lead to development of diseases such as cancer. DNA methylation typically occurs in CpG context. When located in a gene promoter, DNA methylation often acts to repress transcription and gene expression. The most commonly used technology of studying DNA methylation is bisulfite sequencing (BS-seq), which can be used to measure genomewide methylation levels on the single-nucleotide scale. Notably, BS-seq can also be combined with enrichment strategies, such as reduced representation bisulfite sequencing (RRBS), to target CpG-rich regions in order to save per-sample costs.A typical DNA methylation analysis involves identifying differentially methylated regions (DMRs) between different experimental conditions. Many statistical methods have been developed for finding DMRs in BS-seq data. In this workflow, we propose a novel approach of detecting DMRs using edgeR.By providing a complete analysis of RRBS profiles of epithelial populations in the mouse mammary gland, we will demonstrate that differential methylation analyses can be fit into the existing pipelines specifically designed for RNA-seq differential expression studies. In addition, the edgeRgeneralized linear model framework offers great flexibilities for complex experimental design, while still accounting for the biological variability. The analysis approach illustrated in this article can be applied to any BS-seq data that includes some replication, but it is especially appropriate for RRBS data with small numbers of biological replicates.
Highlights
Studies in the past have shown that DNA methylation, as an important epigenetic factor, plays a vital role in genomic imprinting, X-chromosome inactivation and regulation of gene expression1
Aberrant DNA methylation is often correlated with DNA instability, which leads to development of diseases including imprinting disorders and cancer2,3
DNA methylation almost exclusively occurs at CpG sites, i.e. regions of DNA where a cytosine (C) is linked by a phosphate (p) and bond to a guanine (G) in the nucleotide sequence from 5’ to 3’
Summary
Studies in the past have shown that DNA methylation, as an important epigenetic factor, plays a vital role in genomic imprinting, X-chromosome inactivation and regulation of gene expression. EdgeR is one of the most popular Bioconductor packages for assessing differential expression in RNA-seq data27 It is based on the negative binomial (NB) distribution and it models the variation between biological replicates through the NB dispersion parameter. EdgeR linear models are used to fit the total read count (methylated plus unmethylated) at each genomic locus, in such a way that the proportion of methylated reads at each locus is modelled indirectly as an over-dispersed binomial distribution This approach has a number of advantages. Normalization A key difference between BS-seq and other sequencing data is that the pair of libraries holding the methylated and unmethylated reads for a particular sample are treated as a unit.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.