On the prediction of DNA-binding proteins only from primary sequences: A deep learning approach.

Yu-Hui Qu, Hua Yu, Xiu-Jun Gong, Jia-Hui Xu, Hong-Shun Lee

doi:10.1371/journal.pone.0188129

Abstract

DNA-binding proteins play pivotal roles in alternative splicing, RNA editing, methylation and many other biological functions in both eukaryotic and prokaryotic proteomes. Predicting the functions of these proteins from primary amino acid sequences is becoming one of the major challenges in the functional annotation of genomes. Traditional prediction methods often focus on extracting physicochemical features from sequences while ignoring motif information and the positional relationships between motifs. Meanwhile, small data volumes and substantial noise in the training data lower the accuracy and reliability of predictions. In this paper, we propose a deep learning based method to identify DNA-binding proteins from primary sequences alone. It uses two stages of convolutional neural network to detect the functional domains of protein sequences, a long short-term memory neural network to identify their long-term dependencies, and binary cross-entropy to evaluate the quality of the neural networks. When the proposed method is tested on a realistic DNA-binding protein dataset, it achieves a prediction accuracy of 94.2% at a Matthews correlation coefficient of 0.961. Compared with LibSVM on the Arabidopsis and yeast datasets in independent tests, the accuracy rises by 9% and 4% respectively. Comparative experiments using different feature extraction methods show that our model achieves accuracy similar to the best of the others, while its sensitivity, specificity and AUC increase by 27.83%, 1.31% and 16.21% respectively. These results suggest that our method is a promising tool for identifying DNA-binding proteins.
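The quantities named in the abstract (binary cross-entropy, accuracy, sensitivity, specificity and the Matthews correlation coefficient) have standard textbook definitions. The following is a minimal sketch of those definitions in plain Python, not the authors' evaluation code; the function names, the `eps` clipping value and the toy labels are assumptions made for illustration.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities.

    eps is an assumed clipping constant that keeps log() away from zero.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and MCC from 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # MCC is conventionally reported as 0 when any marginal count is empty.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    labels = [1, 1, 1, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 1]
    print(confusion_metrics(labels, preds))
    print(binary_cross_entropy([1, 0], [0.5, 0.5]))
```

Sensitivity and specificity are reported separately from accuracy because, on an imbalanced dataset, a model can reach high accuracy while missing most of the positive (DNA-binding) class.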

Highlights

One vital function of proteins is DNA binding, which plays pivotal roles in alternative splicing, RNA editing, methylation and many other biological functions in both eukaryotic and prokaryotic proteomes [1]
The convolutional neural network (CNN) stage consists of two convolutional layers, each followed by a max pooling operation
The results show that the prediction accuracy of our model exceeds that of LibSVM by nearly 8% and 4% for the Arabidopsis and yeast species respectively
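The "two convolutional layers, each followed by a max pooling operation" pattern can be illustrated on a one-dimensional toy signal. This is only a sketch of the general conv-then-pool idea in plain Python, not the model from the paper: the kernels, the ReLU activation, the pool size of 2 and the input values are all assumptions chosen for illustration, and a real model would learn many kernels over embedded amino acid sequences.

```python
def conv1d(signal, kernel):
    """Valid (no padding) 1D cross-correlation followed by ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        s = sum(signal[i + j] * kernel[j] for j in range(k))
        out.append(max(0.0, float(s)))
    return out


def max_pool1d(signal, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]


def cnn_stage(signal, kernels, pool_size=2):
    """Apply one convolution + max pooling pair per kernel, in order."""
    for kernel in kernels:
        signal = max_pool1d(conv1d(signal, kernel), pool_size)
    return signal


if __name__ == "__main__":
    # Hypothetical numeric encoding of a short sequence (illustration only).
    x = [1, 0, 2, 3, 0, 1, 2, 1, 0, 3]
    # Two hypothetical kernels, one per convolutional layer.
    print(cnn_stage(x, kernels=[[1, 0, -1], [1, 1]]))
```

Each conv-and-pool pair shortens the sequence while keeping the strongest local responses, which is why such a stage is commonly placed ahead of a recurrent layer that then models dependencies between the surviving positions.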

Summary

Introduction

One vital function of proteins is DNA binding, which plays pivotal roles in alternative splicing, RNA editing, methylation and many other biological functions in both eukaryotic and prokaryotic proteomes [1]. Both computational and experimental techniques have been developed to identify DNA-binding proteins. This work predicts DNA-binding proteins from sequences using a deep learning approach. The specific roles of the authors are articulated in the ‘author contributions’ section.

Journal: PLOS ONE	Publication Date: Dec 29, 2017
Citations: 50	License type: CC BY 4.0
