Abstract

BackgroundBinding sites are the pockets of proteins that can bind drugs; the discovery of these pockets is a critical step in drug design. With the help of computers, protein pockets prediction can save manpower and financial resources.ResultsIn this paper, a novel protein descriptor for the prediction of binding sites is proposed. Information on non-bonded interactions in the three-dimensional structure of a protein is captured by a combination of geometry-based and energy-based methods. Moreover, due to the rapid development of deep learning, all binding features are extracted to generate three-dimensional grids that are fed into a convolution neural network. Two datasets were introduced into the experiment. The sc-PDB dataset was used for descriptor extraction and binding site prediction, and the PDBbind dataset was used only for testing and verification of the generalization of the method. The comparison with previous methods shows that the proposed descriptor is effective in predicting the binding sites.ConclusionsA new protein descriptor is proposed for the prediction of the drug binding sites of proteins. This method combines the three-dimensional structure of a protein and non-bonded interactions with small molecules to involve important factors influencing the formation of binding site. Analysis of the experiments indicates that the descriptor is robust for site prediction.

Highlights

  • Binding sites are the pockets of proteins that can bind drugs; the discovery of these pockets is a critical step in drug design

  • If a protein has more than three predicted sites, we identify three predicted binding sites with the highest scores, and the predicted site closest to the actual site is selected from the three sites as the site prediction and is used for evaluation; this approach is called as Top3 prediction

  • Construction of appropriate protein descriptor is the first and foremost problem to be solved while using deep learning, especially the convolutional neural network, to predict the drug binding sites in a protein

Read more

Summary

Methods

Construction of the descriptor It is obvious that the geometry-based approaches take protein shape into consideration and look for gaps or cavities on the surface of a protein. The energy-based methods take into account the potential energy factors influencing the formation of the binding sites, such as hydrogen bonds, van der Waals forces and electric potential energy. The input of the network needs to be changed from a 2D image to a 3D grid. Based on these considerations, various factors influencing the formation of a protein pocket are introduced in the proposed descriptor. Geometry and energy-based methods are combined to construct a multi-channel protein descriptor utilizing the grid voxel. The constructed protein descriptor is a 4-channel grid, which consists of a shape channel, a van der Waals potential energy channel, a hydrogen bond potential energy channel, and an electric potential energy channel

Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call