CsDMA: an improved bioinformatics tool for identifying DNA 6 mA modifications via Chou's 5-step rule

Ze Liu, Wei Jiang, Zili He, Wei Dong

doi:10.1038/s41598-019-49430-4

Open Access

Abstract

DNA N6-methyldeoxyadenosine (6 mA) modifications were first found more than 60 years ago but were thought to be widespread only in prokaryotes and unicellular eukaryotes. With the development of high-throughput sequencing technology, 6 mA modifications were found in different multicellular eukaryotes by using experimental methods. However, the experimental methods were time-consuming and costly, which makes it necessary to develop computational methods instead. In this study, a machine learning-based prediction tool, named csDMA, was developed for predicting 6 mA modifications. First, three feature encoding schemes, Motif, Kmer, and Binary, were used to generate the feature matrix. Second, different algorithms were evaluated for the prediction model, and the ExtraTrees model achieved the best AUC of 0.878 under 5-fold cross-validation on the training dataset. The ExtraTrees model also achieved the best AUC of 0.893 on the independent testing dataset. Finally, we compared our method with state-of-the-art predictors, and the results showed that our model achieved better performance than existing tools.
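
To make the feature encoding step more concrete, here is a minimal sketch of how Kmer and Binary encodings are commonly computed for a DNA sequence. It is an illustration under assumptions, not the csDMA implementation: the function names, the choice of k, the normalization, and the handling of ambiguous bases are all hypothetical, and the Motif scheme is left out because this summary does not define it.

```python
from itertools import product

NUCLEOTIDES = "ACGT"


def kmer_features(seq, k=2):
    """Normalized frequency of each of the 4**k k-mers, in lexicographic order.

    Assumption: overlapping windows, frequencies divided by the window count.
    """
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1
    if total <= 0:
        return [0.0] * len(kmers)
    for i in range(total):
        window = seq[i:i + k]
        if window in counts:  # windows containing e.g. N are skipped
            counts[window] += 1
    return [counts[m] / total for m in kmers]


def binary_features(seq):
    """One-hot encoding: four bits per base in A, C, G, T order.

    Assumption: any other symbol is encoded as four zeros.
    """
    one_hot = {b: [1 if b == n else 0 for n in NUCLEOTIDES] for b in NUCLEOTIDES}
    features = []
    for base in seq:
        features.extend(one_hot.get(base, [0, 0, 0, 0]))
    return features


def encode(seq, k=2):
    """Concatenate the Kmer and Binary vectors into one feature row."""
    seq = seq.upper()
    return kmer_features(seq, k) + binary_features(seq)
```

Calling `encode` on every sequence in a dataset gives one row per sequence. Stacking those rows produces a feature matrix of the kind the abstract describes, with 4**k columns from the Kmer part and four columns per base from the Binary part.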

Highlights

DNA N6-methyldeoxyadenosine (6 mA) modifications were first discovered in Bacteria in 1955[1]
The benchmark datasets created in the iDNA6mA-Pseudo K-tuple Nucleotide Composition (PseKNC) and i6mA-Pred predictors were used and different algorithms were implemented to generate the final optimized model. 5-fold cross-validation was performed and the prediction results demonstrated that our model achieved a better performance than existing 6 mA prediction tools
We developed an improved tool, called csDMA, for predicting 6 mA modifications in different species
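
The 5-fold cross-validation and AUC evaluation mentioned in the highlights can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' evaluation code: the stratified fold assignment, the averaging of per-fold AUCs, and the `fit` / `predict` callables are assumptions made for the example.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (positive) or 0 (negative).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k=5, seed=0):
    """Split sample indices into k folds, spreading each class evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def cross_validated_auc(X, y, fit, predict, k=5, seed=0):
    """Mean AUC over k held-out folds.

    `fit(X_train, y_train)` returns a model; `predict(model, x)` returns a score.
    Both are hypothetical callables supplied by the caller.
    """
    fold_aucs = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in held_out]
        fold_aucs.append(auc([y[i] for i in held_out], preds))
    return sum(fold_aucs) / len(fold_aucs)
```

Any classifier that outputs a score can be plugged in through `fit` and `predict`, for example an extremely randomized trees model such as scikit-learn's `ExtraTreesClassifier` with its `predict_proba` output. Each class needs at least k samples so that every fold contains both positives and negatives.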

Summary

Introduction

DNA N6-methyldeoxyadenosine (6 mA) modifications were first discovered in Bacteria in 1955[1]. In 2016, Koziol et al. used dot blots, HPLC, and methyl DNA immunoprecipitation followed by sequencing (MeDIP-seq) to detect 6 mA modifications in vertebrates including Xenopus laevis, mouse, and human[6]. Because the experimental methods are time-consuming and costly, researchers are trying to predict DNA 6 mA modifications by using computational methods. iDNA6mA-PseKNC is the first tool for predicting 6 mA modifications in the Mus musculus genome, and i6mA-Pred is the first identification method for the rice genome. The feature extraction and classification methods proposed in these studies provide a valuable basis for the prediction of DNA 6 mA modifications. This study aims to develop a prediction tool that can be used to predict DNA 6 mA modifications across species.

Journal: Scientific Reports
Publication Date: Sep 11, 2019
Citations: 30
License type: open-access


