DeepBindPoc: a deep learning method to rank ligand binding pockets using molecular vector representation.

Haiping Zhang, Yanjie Wei, Jinzhi Lin, Jiaxiu Zhou, Justin Tze-Yang Ng, Konda Mani Saravanan, Linbu Liao

doi:10.7717/peerj.8864

Abstract

Accurate identification of ligand-binding pockets in a protein is important for structure-based drug design. In recent years, several deep learning models have been developed that learn physicochemical and spatial information to predict ligand-binding pockets in a protein. However, ranking the native ligand-binding pocket from a pool of predicted pockets remains a hard task for computational molecular biologists using a single web-based tool. We therefore hypothesize that an enhanced model for identifying accurate pockets can be obtained by training on a dataset closer to real applications and by providing ligand information. In this article, we propose a new deep learning method called DeepBindPoc for identifying and ranking ligand-binding pockets in proteins. The model is built using information about the binding pocket and its associated ligand. We use the mol2vec tool to represent both the given ligand and the pocket as vectors, which serve as input to a densely fully connected neural network. During training, important features for pocket-ligand binding are automatically extracted and high-level information is preserved. DeepBindPoc demonstrated a strong complementary advantage for the detection of native-like pockets when combined with popular traditional methods such as fpocket and P2Rank. The proposed method is extensively tested and validated with standard procedures on multiple datasets, including a dataset of G-protein coupled receptors. The systematic testing and validation of our method suggest that DeepBindPoc is a valuable tool for ranking near-native pockets in theoretically modeled proteins that have a known ligand but no experimentally determined active site. The DeepBindPoc model described in this article is available on GitHub (https://github.com/haiping1010/DeepBindPoc), and the web server is available at http://cbblab.siat.ac.cn/DeepBindPoc/index.php.
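The pipeline described in the abstract (vectorize ligand and pocket, feed both to a fully connected network, rank pockets by score) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the embedding table `TABLE`, the token lists, the layer sizes, and the helper names (`embed`, `dense`, `score`, `rank_pockets`, `init_params`) are all hypothetical, the weights are random rather than trained, and summing per-token vectors is assumed here as the pooling step.

```python
import math
import random

# Hypothetical 3-dimensional embedding table (assumption); real mol2vec
# vectors are learned from a large compound corpus and are much larger.
TABLE = {"C": [1.0, 0.0, 0.5], "O": [0.0, 1.0, 0.5], "N": [0.5, 0.5, 0.0]}

def embed(tokens, table):
    """Sum per-token vectors into one fixed-length vector; unknown tokens add zero."""
    dim = len(next(iter(table.values())))
    vec = [0.0] * dim
    for tok in tokens:
        for i, x in enumerate(table.get(tok, [0.0] * dim)):
            vec[i] += x
    return vec

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(x, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def score(ligand_vec, pocket_vec, params):
    """Concatenate ligand and pocket vectors, then apply two dense layers."""
    x = ligand_vec + pocket_vec  # list concatenation
    hidden = dense(x, params["w1"], params["b1"], relu)
    return dense(hidden, params["w2"], params["b2"], sigmoid)[0]

def rank_pockets(ligand_vec, pockets, params):
    """Return (pocket name, score) pairs sorted from highest to lowest score."""
    scored = [(name, score(ligand_vec, vec, params)) for name, vec in pockets.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def init_params(n_in, n_hidden, seed=0):
    """Random, untrained weights; a real model would learn these from data."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [[rng.uniform(-1, 1) for _ in range(n_hidden)]],
        "b2": [0.0],
    }

# Toy inputs (made up for illustration only).
ligand_vec = embed(["C", "C", "O"], TABLE)
pockets = {
    "pocket_A": embed(["N", "C", "O", "O"], TABLE),
    "pocket_B": embed(["C", "N"], TABLE),
}
params = init_params(n_in=6, n_hidden=4)
ranking = rank_pockets(ligand_vec, pockets, params)
for name, s in ranking:
    print(f"{name}: {s:.3f}")
```

Because the weights are untrained, the printed scores carry no meaning; the sketch only shows how one ligand vector can be paired with each candidate pocket vector so that the pockets can be ordered by a single model output.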

Highlights

A protein can interact with binding partners such as small molecules, nucleic acids, or other proteins in the cell to perform its various biological functions.
The basic idea of mol2vec is to treat a SMILES string as a molecular sentence composed of words; as in the natural language processing method word2vec, an unsupervised machine learning method learns a vector for each word from a large dataset of available chemical compounds (Krallinger et al., 2015).
DeepBindPoc performance on the training, validation and testing datasets: To determine the number of epochs (a hyperparameter), we check convergence by monitoring how accuracy and loss change during both training and validation as the epoch number increases.
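The convergence check in the last highlight can be illustrated with a small helper that picks an epoch from a recorded validation-loss curve. This is a sketch under stated assumptions, not the procedure used in the paper: the function name `converged_epoch`, the tolerance `tol`, the `patience` window, and the loss values are all hypothetical.

```python
def converged_epoch(val_loss, tol=1e-3, patience=3):
    """Return the 1-based epoch with the best validation loss, stopping the
    scan once `patience` epochs pass without an improvement larger than `tol`.
    Returns 0 for an empty history."""
    best = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_loss, start=1):
        if best - loss > tol:
            # Meaningful improvement: remember this epoch.
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No meaningful improvement for `patience` epochs: treat as converged.
            break
    return best_epoch

# Made-up validation losses for illustration; not results from the paper.
history = [0.9, 0.6, 0.45, 0.40, 0.401, 0.405, 0.41, 0.42]
print(converged_epoch(history))
```

The same scan could be applied to validation accuracy by negating the values; in practice the training curve would be inspected alongside the validation curve, as the highlight describes.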

Summary

Introduction

A protein can interact with binding partners such as small molecules, nucleic acids, or other proteins in the cell to perform its various biological functions. Understanding how and where these molecules bind to their protein targets provides valuable information for therapeutic design, because it is essential for mimicking or enhancing a function in the cell (Lionta et al., 2014). Predicting ligand-binding pockets in proteins is one of the key issues in the early stages of structure-based drug discovery and is still an unresolved problem in computer-aided drug design (Liang, Edelsbrunner & Woodward, 1998; Miller & Dill, 2008). Concavity, and CASTp are hybrid methods which use similarity searches from existing databases and other geometric indices to identify pockets (Capra et al., 2009; Le Guilloux, Schmidtke & Tuffery, 2009; Tian et al., 2018).

Journal: PeerJ
Publication Date: Apr 6, 2020
Citations: 13
License type: CC BY 4.0
