Abstract

Deep learning is an effective way to predict drug-target binding affinity, but low accuracy remains an obstacle. We therefore propose a novel predictor for drug-target binding affinity based on dipeptide frequency with word frequency encoding and a hybrid graph convolutional network. Word frequency characteristics from natural language are used to improve the peptide frequency characteristics that represent target proteins. For each drug molecule, five different atom features and the atomic bond relationships are expressed as a graph. The resulting protein features and graph structures are used as the inputs of a convolutional neural network and a graph convolutional neural network, respectively. A prediction model is then established to predict drug-target affinity by learning the hidden relationship between the two. On the KIBA data set, the concordance index (CI) of the model is 0.901, which is 0.01 higher than the existing model, and the mean squared error (MSE) is 0.126, which is 5% lower than the existing model. On the Davis data set, the CI is 0.895, which is 0.006 higher than the existing model, and the MSE is 0.220, which is 4% lower than the existing model. These results show that the proposed method not only predicts affinity better than existing models but also outperforms unitary deep learning approaches.
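The protein encoding described in the abstract starts from dipeptide frequencies. The sketch below shows only that plain starting point: it counts overlapping residue pairs and normalizes them into a 400-dimensional vector. The word-frequency weighting the authors apply on top is not specified in this summary and is not reproduced here. The function name, the 20-letter alphabet and the choice to skip non-standard residues are assumptions made for illustration.

```python
from itertools import product

# Assumption: the 20 standard amino acids, giving 20 * 20 = 400 dipeptides.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
INDEX = {dipeptide: i for i, dipeptide in enumerate(DIPEPTIDES)}


def dipeptide_frequencies(sequence):
    """Return a 400-element list of relative frequencies of overlapping dipeptides."""
    counts = [0] * len(DIPEPTIDES)
    total = 0
    # Slide a window of width 2 over the sequence.
    for first, second in zip(sequence, sequence[1:]):
        idx = INDEX.get(first + second)
        if idx is not None:  # skip pairs containing non-standard residues
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [count / total for count in counts]
```

For the toy sequence "ACAC", the overlapping pairs are AC, CA and AC, so the entry for AC is 2/3 and the entry for CA is 1/3.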

Highlights

  • The discovery process for new drugs is time-consuming and expensive (Roses, 2008)

  • The evaluation metrics are consistent with WideDTA and GraphDTA. The performance of models that output continuous values is evaluated by the Concordance Index (CI)

  • Our method outperformed the GraphDTA model using the same graph convolutional neural network: on the KIBA data set the MSE decreased by 5% (0.007) and the CI increased by 0.01, and on the Davis data set the MSE decreased by 4% (0.009) and the CI increased by 0.006
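The two metrics named in the highlights can be sketched as follows. The CI formula itself is not reproduced in this summary, so the code assumes the usual pairwise definition: over all pairs whose true affinities differ, a pair scores 1 if the predictions are ordered the same way, 0.5 if the predictions tie, and 0 otherwise. The function names are hypothetical, and the quadratic loop is written for clarity rather than speed.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order."""
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(n):
            if y_true[i] > y_true[j]:  # each pair with distinct labels counted once
                comparable += 1
                diff = y_pred[i] - y_pred[j]
                if diff > 0:
                    concordant += 1.0
                elif diff == 0:
                    concordant += 0.5  # tied predictions get half credit
    if comparable == 0:
        raise ValueError("CI is undefined when all true values are equal")
    return concordant / comparable


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

A perfectly ordered prediction gives a CI of 1.0, a fully reversed one gives 0.0, and a constant prediction gives 0.5, which is why CI is read alongside MSE rather than instead of it.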

Summary

INTRODUCTION

The discovery process for new drugs is time-consuming and expensive (Roses, 2008). For the Davis data set, we calculate the pKd value (as shown in formula 1), following the processing method used in the literature. To discover latent relationships in the graph structure of the drug features, we use the four graph neural network variants adapted by GraphDTA: GCN (Kipf and Welling, 2017), GAT (Velickovic et al., 2018), GIN (Xu et al., 2019), and GAT-GCN (Nguyen and Venkatesh, 2019). A fully connected layer takes the output of the graph convolutional network and maps it to a 128-dimensional feature vector, which matches the size of the protein feature vector. The result of the convolution is fed to a fully connected layer that maps it to 256 neurons, keeping the sizes of the drug and protein representations consistent. The batch size is set to 512 and the learning rate to 0.00005.
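Formula 1 is not reproduced in this summary. As a minimal sketch, the conversion below assumes the transformation commonly applied to the Davis data set in earlier work such as DeepDTA and GraphDTA, where Kd is given in nanomolar and pKd = -log10(Kd / 10^9). The function name and the unit are assumptions.

```python
import math


def kd_to_pkd(kd_nm):
    """Convert a dissociation constant in nM to pKd (assumed form of formula 1)."""
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    # Dividing by 1e9 converts nM to M before taking the negative log.
    return -math.log10(kd_nm / 1e9)
```

Under this assumption, a Kd of 10000 nM corresponds to a pKd of about 5, and stronger binders (smaller Kd) map to larger pKd values.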

RESULTS AND DISCUSSION
CONCLUSION
DATA AVAILABILITY STATEMENT
