Abstract

Accurate prediction of the binding affinity between a protein and a ligand is a very important step in drug discovery. Although many methods based on different assumptions and rules exist, the prediction performance for protein–ligand binding affinity is still not satisfactory. This paper proposes a new cascade graph-based convolutional neural network architecture that operates on non-Euclidean, irregular data. We represent the molecule as a graph and use a simple linear transformation to deal with the sparsity of the one-hot encoding of the original data. The first stage adopts an ARMA graph convolutional neural network to learn the characteristics of the atomic space in the protein–ligand complex. The second stage introduces a variant of the MPNN graph convolutional neural network that incorporates chemical bond information and interactive atomic features. Finally, the output passes through a global add pooling layer and a fully connected layer, which produces a single scalar value as the predicted binding affinity. Experiments on the PDBbind v2016 data set showed that our method outperforms most current methods. It is also comparable to the state-of-the-art method on this data set, while being simpler and more intuitive.
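The data flow described in the abstract can be illustrated with a toy sketch in plain Python. It is not the authors' implementation: the neighbour-sum step is a deliberately simplified stand-in for the ARMA and MPNN layers, and the atom vocabulary, weights, and function names are all hypothetical. It shows only how a linear map turns sparse one-hot atom encodings into dense features, and how global add pooling followed by a linear readout yields one scalar per graph.

```python
# Toy sketch: one-hot atoms -> linear embedding -> one neighbour-sum step
# -> global add pool -> scalar readout.
# The neighbour-sum step is a simplified stand-in, NOT the ARMA/MPNN layers.

ATOM_TYPES = ["C", "N", "O"]  # hypothetical atom vocabulary


def one_hot(symbol):
    """Sparse one-hot encoding of an atom type."""
    return [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]


def matvec_rows(rows, weight):
    """Multiply each row vector by weight (shape: len(row) x out_dim)."""
    return [
        [sum(x * w for x, w in zip(row, col)) for col in zip(*weight)]
        for row in rows
    ]


def neighbour_sum(features, edges):
    """Each node keeps its own features and adds those of its neighbours."""
    out = [list(f) for f in features]
    for i, j in edges:  # edges are treated as undirected
        for k in range(len(features[i])):
            out[i][k] += features[j][k]
            out[j][k] += features[i][k]
    return out


def global_add_pool(features):
    """Sum node features into a single graph-level vector."""
    return [sum(col) for col in zip(*features)]


def predict(symbols, edges, embed_w, readout_w, bias=0.0):
    x = [one_hot(s) for s in symbols]   # sparse input
    h = matvec_rows(x, embed_w)         # dense embedding via a linear map
    h = neighbour_sum(h, edges)         # stand-in graph convolution
    g = global_add_pool(h)              # one vector per graph
    return sum(a * b for a, b in zip(g, readout_w)) + bias  # scalar output


# Hypothetical hand-picked weights: 3 atom types -> 2 feature dimensions.
EMBED_W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
READOUT_W = [0.5, -0.25]

score = predict(["C", "O", "N"], [(0, 1), (1, 2)], EMBED_W, READOUT_W)
print(score)
```

Because each input row is one-hot, the linear map simply selects one row of `EMBED_W`, which is why a plain linear transformation is enough to replace the sparse encoding with a dense one. Add pooling makes the output independent of the order in which atoms are listed.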

Highlights

  • The mutual recognition and binding of proteins and ligands occur in almost all basic biological activities and play a very important role in them

  • Prediction accuracy has improved steadily, from the early classical scoring functions to models built with machine learning and deep learning algorithms, but results are still unsatisfactory [8,9]

  • We evaluated our proposed method on PDBbind v2016 and CASF-2013

Introduction

The mutual recognition and binding of proteins and ligands occur in almost all basic biological activities and play a very important role in them. Among methods for computing protein–ligand binding affinities, biological and chemical experiments and direct calculations based on first principles of physics or quantum mechanics are the most accurate [2,3,4]. However, these methods are so time-consuming and costly that they are unsuitable for large-scale molecular screening in the early stages of drug discovery. In large-scale virtual screening, the docking poses of the protein and ligand are instead selected by a molecular docking program and scored by a scoring function. Prediction accuracy has improved steadily, from the early classical scoring functions to models built with machine learning and deep learning algorithms, but results are still unsatisfactory [8,9].
