Abstract

During sample preparation for Next Generation Sequencing (NGS), DNA is fragmented by various methods. Fragmentation shows a persistent bias in the cleavage rates of different dinucleotides. With the exception of CpG dinucleotides, the previously described biases are consistent with the results of DNA cleavage in solution. Here we computed the cleavage rates of all dinucleotides, including methylated and unmethylated CpG dinucleotides, using Whole Genome Sequencing datasets from the 1000 Genomes project. We found that the cleavage rate is significantly higher for methylated CpG dinucleotides than for unmethylated ones. Using this information, we developed a classifier that distinguishes cancer from healthy tissues based on the fragmentation status of their CpG islands. A simple Support Vector Machine classifier based on this algorithm shows an accuracy of 84%. The proposed method allows the detection of epigenetic markers based purely on mechanochemical DNA fragmentation, which can be detected by a simple analysis of NGS data.

Highlights

  • The DNA methylation level of CpG islands, genomic sequences with a high occurrence of methylated CpG dinucleotides, is an important regulator of gene expression

  • On the basis of the bisulfite-sequencing data obtained from NGSmethDB[8], we identified the methylation status of all CpG dinucleotides in a lymphoblastoid cell line

  • On the basis of the 5'-end coordinates of each read in each dataset, cleavage rates for all dinucleotides were computed in the following way: r(XY) = n(XY) / (N · p(XY))
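
The cleavage-rate formula in the last highlight can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes (the highlight does not define the symbols) that n(XY) is the number of reads whose 5' end falls between X and Y, N is the total number of such reads, and p(XY) is the background frequency of the dinucleotide XY in the reference. The function name `cleavage_rates` and its arguments are hypothetical, and only the forward strand is considered.

```python
from collections import Counter


def cleavage_rates(genome, read_starts):
    """Return r(XY) = n(XY) / (N * p(XY)) for each dinucleotide observed at a cut.

    genome      -- reference sequence as a string (forward strand only; assumption)
    read_starts -- 0-based 5' start positions of reads; a read starting at
                   position pos is taken as a cut between pos-1 and pos (assumption)
    """
    genome = genome.upper()

    # Background counts of every dinucleotide in the reference, used for p(XY).
    background = Counter(genome[i:i + 2] for i in range(len(genome) - 1))
    total_sites = sum(background.values())

    # n(XY): number of reads whose 5' end lies between X and Y.
    cuts = Counter()
    for pos in read_starts:
        if 0 < pos < len(genome):
            cuts[genome[pos - 1:pos + 1]] += 1
    n_total = sum(cuts.values())  # N: all counted cleavage events

    rates = {}
    for dinuc, n_xy in cuts.items():
        p_xy = background[dinuc] / total_sites
        rates[dinuc] = n_xy / (n_total * p_xy)
    return rates


if __name__ == "__main__":
    # Toy data, not real sequencing reads.
    print(cleavage_rates("ACGTACGT", [2, 2, 6, 4]))
```

Under these assumptions, a rate above 1 means the dinucleotide is cut more often than its abundance in the reference would predict, and a rate below 1 means it is cut less often.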


Summary

Introduction

The DNA methylation level of CpG islands, genomic sequences with a high occurrence of methylated CpG dinucleotides, is an important regulator of gene expression. Previously described fragmentation biases agree with the results of DNA cleavage in solution; the only discrepancy is in the cleavage rate of CpG dinucleotides. We assume that this result can be explained by methylation of cytosines[2]. In this work, the cleavage rates for methylated and unmethylated CpG dinucleotides of the human genome were estimated. We found that the cleavage rate for methylated CpG dinucleotides is about 1.5 times higher than that for unmethylated ones. On the basis of our results, a criterion can be developed by which NGS data can be used to determine the total level of CpG methylation in a given cell type without any additional experiments.
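
The comparison between methylated and unmethylated CpG cleavage can be sketched as follows. This is a hedged illustration, not the authors' method. It assumes that methylation calls are available as a set of 0-based positions of the cytosine in each methylated CpG (for example, derived from bisulfite sequencing), and that a read starting at the G marks a cut inside that CpG. The function name `cpg_cleavage_ratio` and its arguments are hypothetical, and only the forward strand is handled.

```python
def cpg_cleavage_ratio(genome, read_starts, methylated_c_positions):
    """Return (cuts per methylated CpG site) / (cuts per unmethylated CpG site).

    Returns None when either class has no sites or the unmethylated class
    has no cuts, because the ratio is undefined in those cases.
    """
    genome = genome.upper()
    methylated = set(methylated_c_positions)

    def label(c_index):
        return "methylated" if c_index in methylated else "unmethylated"

    # Count CpG sites of each class in the reference.
    sites = {"methylated": 0, "unmethylated": 0}
    for i in range(len(genome) - 1):
        if genome[i:i + 2] == "CG":
            sites[label(i)] += 1

    # Count reads whose 5' end falls between the C and the G of a CpG.
    cuts = {"methylated": 0, "unmethylated": 0}
    for pos in read_starts:
        c_index = pos - 1
        if c_index >= 0 and genome[c_index:c_index + 2] == "CG":
            cuts[label(c_index)] += 1

    if not sites["methylated"] or not sites["unmethylated"]:
        return None
    if cuts["unmethylated"] == 0:
        return None

    # The total read count N cancels in the ratio of the two rates.
    per_site_m = cuts["methylated"] / sites["methylated"]
    per_site_u = cuts["unmethylated"] / sites["unmethylated"]
    return per_site_m / per_site_u


if __name__ == "__main__":
    # Toy data: three CpG sites, the first one marked as methylated.
    print(cpg_cleavage_ratio("ACGTTCGAACGT", [2, 2, 2, 6, 10, 4], {1}))
```

Because both rates share the same total read count, the ratio depends only on the cuts per site in each class, which is why the sketch does not need N explicitly.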

