Abstract

Accurately identifying protein-ATP (adenosine-5’-triphosphate) binding sites is important for protein function annotation and drug discovery. Previous studies have typically used classical machine learning classifiers to predict protein-ATP binding sites from the protein primary sequence. Deep learning, a more recently developed technique, has shown outstanding performance in many fields. In this work, we apply deep convolutional neural networks to sequence-based prediction of protein-ATP binding sites. We develop two classification networks, a residual-inception-based predictor and a multi-inception-based predictor, and combine them by ensemble learning, assigning an optimized weight to each network architecture to improve performance. We evaluate the proposed method on two independent test sets: a classic ATP-binding benchmark dataset and a newly proposed ATP-binding dataset. The proposed method outperforms other state-of-the-art sequence-based predictors, with AUC values of 0.922 and 0.896, respectively, which illustrates the efficacy of deep learning for protein-ATP binding site prediction. The source code and benchmark datasets can be downloaded at https://github.com/tlsjz/ATPbinding .

Highlights

  • Interactions between proteins and ligands are crucial for a variety of biological processes, such as membrane transport [1], muscle contraction [2], and the replication and transcription of DNA [3], [4].

  • Using only the PSSM feature, the proposed ensemble predictor achieves AUC values of 0.883 and 0.909 on the ATP-227 and ATP-388 datasets, respectively, which demonstrates that the PSSM feature is significant for protein-ATP binding site prediction.

  • Because the predicted solvent accessibility (ASA) feature is not used in the multi-inception-based predictor, the performance of the ensemble predictor is obtained by combining the predictions of the residual-inception-based predictor, using the feature combination PSSM+secondary structure (SS)+ASA, with those of the multi-inception-based predictor, using the feature combination PSSM+SS.
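
The weighted combination of the two predictors can be illustrated with a minimal sketch. The source states only that optimized weights are assigned to each network; the grid search over a single weight `w`, the use of validation AUC as the objective, and the names `auc_score` and `best_ensemble_weight` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def auc_score(labels, scores):
    """AUC from the rank-sum (Mann-Whitney U) statistic, averaging ranks over ties."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores), dtype=float)
    i = 0
    while i < len(scores):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        # Every member of the run gets the average of its 1-based ranks.
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def best_ensemble_weight(labels, p_a, p_b, steps=101):
    """Grid-search w in [0, 1] that maximizes the AUC of w * p_a + (1 - w) * p_b.

    p_a and p_b are per-residue binding probabilities from two predictors
    (hypothetically, the residual-inception and multi-inception networks).
    """
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    best_w, best_auc = 0.0, -1.0
    for w in np.linspace(0.0, 1.0, steps):
        auc = auc_score(labels, w * p_a + (1.0 - w) * p_b)
        if auc > best_auc:
            best_w, best_auc = float(w), auc
    return best_w, best_auc


if __name__ == "__main__":
    # Synthetic validation data: roughly 5% of residues are labeled as binding.
    rng = np.random.default_rng(0)
    y = rng.random(2000) < 0.05
    p_a = np.clip(0.3 * y + rng.normal(0.3, 0.15, size=y.size), 0.0, 1.0)
    p_b = np.clip(0.3 * y + rng.normal(0.3, 0.15, size=y.size), 0.0, 1.0)
    w, auc = best_ensemble_weight(y, p_a, p_b)
    print(f"weight for predictor A: {w:.2f}, ensemble validation AUC: {auc:.3f}")
```

Because the grid includes `w = 0` and `w = 1`, the selected ensemble never scores below either individual predictor on the data used to choose the weight. A held-out set is still needed to check that the gain generalizes.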


Summary

Introduction

Interactions between proteins and ligands are crucial for a variety of biological processes, such as membrane transport [1], muscle contraction [2], and the replication and transcription of DNA [3], [4]. Among these ligands, adenosine-5’-triphosphate (ATP) plays an important role because it is the direct source of energy for most biological activities, from bacteria to humans [10]. ATP must interact with a large number of proteins to release the chemical energy that enables those proteins to carry out their specific functions [11]. The number of known ATP-binding proteins is still far too small relative to the large volume of protein sequences available in the post-genomic era.

The associate editor coordinating the review of this manuscript and approving it for publication was Yi Zhang.

