Abstract

DNA N4-methylcytosine (4mC) and DNA N6-methyladenine (6mA) are significant epigenetic modifications. 4mC is closely related to the restriction-modification system, and 6mA is involved in a variety of cellular processes. To further explore their functional mechanisms and biological significance, and to overcome the narrow coverage of traditional experimental methods, an efficient and widely applicable prediction method is needed. In this work, we develop a prediction method named 4mCi6mA-BGC to predict 4mC and 6mA sites. First, we employ binary encoding, K-mer nucleotide frequency (K-mer), pseudo K-tuple nucleotide composition (PseKNC), dinucleotide-based auto covariance (DAC) and monoDiKGap theoretical description (MonoDiKGap) to encode DNA sequences. Then, the elastic net is employed for feature selection, and the optimized feature space is fed into a deep learning framework composed of a bidirectional gated recurrent unit and a convolutional neural network. The benchmark comprises six datasets, which contain 14 328 4mC sites from different species. The results of 10-fold cross-validation indicate that the prediction accuracy of 4mCi6mA-BGC is significantly higher than that of existing prediction methods. We also use the independent Rice and Arabidopsis thaliana datasets to further confirm the predictive ability of 4mCi6mA-BGC. Compared with the existing prediction methods, 4mCi6mA-BGC shows the best prediction performance. These comprehensive results indicate that our method can identify DNA modification sites represented by 4mC and 6mA sites.
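
To make the first two encodings concrete, the following is a minimal Python sketch of binary (one-hot) and K-mer frequency encoding for a DNA sequence. It is illustrative only and not the authors' implementation: the function names `binary_encode` and `kmer_frequencies`, the A/C/G/T ordering, the all-zero vector for ambiguous bases, and the normalization by the number of sliding windows are assumptions made here.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # assumed ordering; the abstract does not specify one


def binary_encode(seq):
    """Binary (one-hot) encoding: each nucleotide maps to a 4-element 0/1 vector."""
    table = {
        nt: [1 if i == j else 0 for j in range(len(NUCLEOTIDES))]
        for i, nt in enumerate(NUCLEOTIDES)
    }
    features = []
    for nt in seq.upper():
        # Assumption: ambiguous bases (e.g. N) become an all-zero vector.
        features.extend(table.get(nt, [0, 0, 0, 0]))
    return features


def kmer_frequencies(seq, k=2):
    """K-mer frequency encoding: relative frequency of each of the 4**k K-mers."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1  # number of overlapping windows of length k
    if total <= 0:
        return [0.0] * len(kmers)
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:  # windows containing ambiguous bases are skipped
            counts[window] += 1
    # Assumption: frequencies are normalized by the total number of windows.
    return [counts[km] / total for km in kmers]


def encode(seq, k=2):
    """Concatenate both encodings into a single feature vector."""
    return binary_encode(seq) + kmer_frequencies(seq, k)
```

For a sequence of length n, `encode` yields a vector of 4n + 4^k values. In the pipeline described above, vectors of this kind (together with the PseKNC, DAC and MonoDiKGap features, which are not sketched here) would be passed to the feature selection step.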
