Abstract
Background: Comprehensive genome-wide DNA methylation profiling is critical to gain insights into epigenetic reprogramming during development and disease processes. Among the different genome-wide DNA methylation technologies, whole genome bisulphite sequencing (WGBS) is considered the gold standard for assaying genome-wide DNA methylation at single base resolution. However, the high sequencing cost of achieving the optimal depth of coverage limits its application in both basic and clinical research. Achieving 15× coverage of the human methylome using WGBS requires approximately three lanes of 100-bp paired-end Illumina HiSeq 2500 sequencing. Advances in sequencing technology are therefore needed to enable cost-effective high-coverage sequencing.

Results: In this study, we provide an optimised WGBS methodology, from library preparation to sequencing and data processing, to enable 16–20× genome-wide coverage per single lane of HiSeq X Ten (HCS 3.3.76). To process and analyse the data, we developed a WGBS pipeline (METH10X) that is fast and can call SNPs. We performed WGBS on both high-quality intact DNA and degraded DNA from formalin-fixed paraffin-embedded tissue. First, we compared different library preparation methods on the HiSeq 2500 platform to identify the best method for sequencing on the HiSeq X Ten. Second, we optimised the PhiX and genome spike-ins to achieve higher quality and coverage of WGBS data on the HiSeq X Ten. Third, we performed integrated whole genome sequencing (WGS) and WGBS of the same DNA sample in a single lane of HiSeq X Ten to improve data output. Finally, we compared methylation data from the HiSeq 2500 and HiSeq X Ten and found high concordance (Pearson r > 0.9×).

Conclusions: Together, we provide a systematic, efficient and complete approach to performing and analysing WGBS on the HiSeq X Ten. Our protocol allows for large-scale WGBS studies at reasonable processing time and cost on the HiSeq X Ten platform.
Highlights
Comprehensive genome-wide DNA methylation profiling is critical to gain insights into epigenetic reprogramming during development and disease processes
We compared two pre-bisulphite library preparation methods, where adaptor tagging and ligation are performed before bisulphite conversion, and three post-bisulphite methods, where adaptor tagging and ligation are performed after bisulphite conversion [33] (‘Methods’ section)
We found ~95–96% of the single nucleotide polymorphisms (SNPs) detected in the spike-in whole genome sequencing (WGS) data (Additional file 1: Table 2; Additional file 2: Fig. 3) and ~55–57% of the SNPs from whole genome bisulphite sequencing (WGBS) data (Additional file 1: Table 3; Additional file 2: Fig. 3) to be concordant with the WGS-GS data, indicating a higher rate of false-positive calls in the WGBS data, as previously reported [34]
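The concordance figures in the last highlight can be read as the share of SNPs called in one data set that also appear in a reference call set. The sketch below is a minimal illustration of that calculation and is not the METH10X implementation: the function name `snp_concordance`, the `(chromosome, position, ref, alt)` tuple layout and the toy variants are all assumptions made for the example.

```python
def snp_concordance(called, reference):
    """Return the fraction of called SNPs that also occur in the reference set.

    Each SNP is a hypothetical (chromosome, position, ref, alt) tuple.
    """
    called = set(called)
    if not called:
        raise ValueError("no SNPs were called")
    return len(called & set(reference)) / len(called)


# Toy reference call set (illustrative values only, not from the paper).
reference_calls = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
}

# Toy calls from a second data set; two of the four match the reference.
test_calls = [
    ("chr1", 100, "A", "G"),
    ("chr1", 300, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr2", 90, "C", "T"),
]

concordance = snp_concordance(test_calls, reference_calls)
print(f"Concordance: {concordance:.0%}")
```

A lower value for one data set against the same reference, as reported for WGBS versus WGS, means more of its calls are unsupported by the reference set.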
Summary
Comprehensive genome-wide DNA methylation profiling is critical to gain insights into epigenetic reprogramming during development and disease processes. Achieving 15× coverage of the human methylome using WGBS requires approximately three lanes of 100-bp paired-end Illumina HiSeq 2500 sequencing. Advances in sequencing technology are therefore needed to enable cost-effective high-coverage sequencing. Alterations in DNA methylation patterns are associated with various human diseases, including cancer and diabetes [3,4,5]. CpG islands are predominantly located at gene promoters, and these genes are typically expressed and include almost all the housekeeping genes present in the human genome [7, 8]. CpG island promoters are prone to hypermethylation and associated gene silencing. In contrast, the bulk of the genome in cancer is subject to hypomethylation and gene activation of cancer-associated oncogenes [9, 10].
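The coverage figures quoted above imply a simple per-lane comparison: 15× from about three HiSeq 2500 lanes is roughly 5× per lane, against the 16–20× per lane reported for the HiSeq X Ten. The following sketch works through that arithmetic. The helper `lanes_needed` and the 30× target are hypothetical, chosen only for illustration, and the calculation assumes coverage scales linearly with the number of lanes.

```python
import math


def lanes_needed(target_coverage, coverage_per_lane):
    """Smallest whole number of lanes that reaches the target coverage,
    assuming coverage adds up linearly across lanes."""
    return math.ceil(target_coverage / coverage_per_lane)


# Per-lane coverage implied by the text: 15x from about three HiSeq 2500 lanes.
hiseq2500_per_lane = 15 / 3

# Reported range for a single HiSeq X Ten lane.
xten_low, xten_high = 16, 20

# Hypothetical study target, used only for illustration.
target = 30

lanes_2500 = lanes_needed(target, hiseq2500_per_lane)
lanes_xten_worst = lanes_needed(target, xten_low)
lanes_xten_best = lanes_needed(target, xten_high)

print(f"HiSeq 2500: {lanes_2500} lanes for {target}x")
print(f"HiSeq X Ten: {lanes_xten_best} to {lanes_xten_worst} lanes for {target}x")
```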