Abstract

N4-methylcytosine (4mC) is a DNA modification that plays a critical role in altering DNA-protein interactions, DNA conformation, and DNA stability, as well as in regulating gene expression, cell development, and genomic imprinting. Several 4mC site predictors have been proposed for various species. Herein, we propose DNC4mC-Deep, a computational model that combines six encoding techniques with a deep learning model to predict 4mC sites in the genomes of F. vesca and R. chinensis and in a cross-species dataset. Its performance was evaluated with a 10-fold cross-validation test. DNC4mC-Deep obtained an MCC of 0.829 and 0.929 on the F. vesca and R. chinensis training datasets, respectively, and 0.814 on the cross-species dataset. This corresponds to an improvement of at least 0.284 and 0.265 over the state-of-the-art predictors on the F. vesca and R. chinensis training datasets, respectively. Furthermore, DNC4mC-Deep achieved an MCC of 0.635 and 0.565 on the F. vesca and R. chinensis independent datasets, respectively, and 0.562 on the cross-species dataset, the best performance for 4mC site prediction among the compared state-of-the-art predictors.
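
All scores quoted in the abstract are values of the Matthews correlation coefficient (MCC). As a reference for readers, the following is a minimal sketch of the standard MCC definition computed from confusion-matrix counts. The function name `mcc` and the zero-denominator convention are assumptions made for this illustration, not details taken from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn: correctly predicted 4mC / non-4mC sites
    fp/fn: wrongly predicted 4mC / non-4mC sites
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when a marginal is empty and the
    # ratio is undefined.
    return numerator / denominator if denominator else 0.0
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over accuracy on imbalanced site-prediction datasets.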

Highlights

  • Dynamic DNA modifications, such as methylation and demethylation, play an essential role in the regulation of gene expression

  • Six feature encodings, namely dinucleotide composition (DNC), trinucleotide composition (TNC), nucleotide chemical property (NCP), binary encoding (BE), NCPNF, and mutual information (MMI), were evaluated to identify the best classifier for 4mC site prediction

  • We present an effective computational model, DNC4mC-Deep, to identify N4-methylcytosine sites
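
Two of the encodings named in the highlights can be illustrated with a short sketch. The code below uses the common textbook definitions: k-mer composition (k = 2 gives DNC, k = 3 gives TNC) and one-hot binary encoding. The function names, the `ACGT` column order, and the handling of ambiguous bases are assumptions made for this example; the paper's exact feature layout is not given in this excerpt.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed column order, not specified in the source


def kmer_composition(seq: str, k: int) -> list[float]:
    """Normalized frequency of every k-mer (k=2 -> DNC, k=3 -> TNC)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(seq) - k + 1
    for i in range(windows):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
    total = max(windows, 1)  # avoid division by zero for very short input
    return [counts[m] / total for m in kmers]


def binary_encoding(seq: str) -> list[int]:
    """One-hot encode each base as 4 bits, flattened into one vector.

    Bases outside ACGT are encoded as four zeros (assumed convention).
    """
    encoded = []
    for base in seq.upper():
        encoded.extend(1 if base == b else 0 for b in ALPHABET)
    return encoded
```

For a sequence of length n, `kmer_composition(seq, 2)` always returns 16 values and `kmer_composition(seq, 3)` returns 64, while `binary_encoding(seq)` returns 4n values, so the composition features are length-independent and the binary features preserve position.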


Summary

Introduction

Dynamic DNA modifications, such as methylation and demethylation, play an essential role in the regulation of gene expression. Several studies have shown that they can alter DNA-protein interactions, DNA conformation, DNA stability, and chromatin structure. They also regulate functions including cell development, genomic imprinting, and gene expression [3,4]. N4-methylcytosine (4mC), 5-methylcytosine (5mC), and N6-methyladenine (6mA) are three common methylations, catalyzed by specific methyltransferase enzymes, that occur in both prokaryotes and eukaryotes [5,6,7]. The host DNA from exogenous pathogenic DNA can be identified by 6mA and
