Abstract

The computational prediction of interactions between drugs and targets is a long-standing challenge in drug discovery. State-of-the-art methods for drug-target interaction prediction are primarily based on supervised machine learning with known label information. However, in biomedicine, obtaining labeled training data is an expensive and laborious process. This paper proposes a semi-supervised method based on generative adversarial networks (GANs) to predict binding affinity. Our method comprises two parts: two GANs for feature extraction and a regression network for prediction. The semi-supervised mechanism allows our model to learn protein and drug features from both labeled and unlabeled data. We evaluate the performance of our method on multiple public datasets. Experimental results demonstrate that our method achieves competitive performance while utilizing freely available unlabeled data. Our results suggest that utilizing such unlabeled data can considerably help improve performance in various biomedical relation extraction processes, for example, drug-target interaction and protein-protein interaction, particularly when only limited labeled data are available in such tasks. To the best of our knowledge, this is the first semi-supervised GAN-based method to predict binding affinity.

Highlights

  • A basic task in the field of new drug design and development is to model the interaction between known drugs and target proteins and to identify drugs with a high affinity for specific disease proteins (Cheng et al., 2018a; Cheng et al., 2019b)

  • The contributions of this paper mainly include: we proposed a semi-supervised framework for drug-target binding affinity (DTA) prediction; we adopted generative adversarial networks (GANs) to extract features of protein sequences and compound SMILES in an unsupervised way

  • To provide a better assessment of our model, we determined the performance of GANsDTA, DeepDTA with two CNN modules, and two baseline methods using two different metrics: the rm2 index and the area under the precision-recall curve (AUPR) score

Summary

INTRODUCTION

A basic task in the field of new drug design and development is to model the interaction between known drugs and target proteins and to identify drugs with a high affinity for specific disease proteins (Cheng et al., 2018a; Cheng et al., 2019b). On the one hand, this improves the prediction because more important structural information is exploited; on the other hand, when the 3D structure is the input, these methods depend considerably on the availability of the known 3D structure of the protein-ligand complex. Another deep-learning-based method, called DeepDTA, was implemented to predict binding affinities with CNNs using only 1D representations, that is, the sequences of the proteins and the simplified molecular input line entry system (SMILES) strings of the compounds. The proposed model can accommodate unlabeled data for training, because the GAN-based feature extractor does not require labeled data. This semi-supervised mechanism lets our model learn protein and drug features from additional datasets, even those without labels, leading to better feature representation and prediction performance. Our results suggest that utilizing such unlabeled data can considerably help improve performance in various biomedical relation extraction processes when only limited labeled data (e.g., 2000 samples or fewer) are available in such tasks.
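
The 1D inputs mentioned above (protein sequences and compound SMILES strings) are usually converted to fixed-length integer vectors before a network can consume them. The following is a minimal sketch of one such character-level label encoding, not the authors' implementation: the function names `build_vocab` and `encode`, the toy inputs, and the maximum lengths are all assumptions chosen for illustration. Splitting SMILES character by character is also a simplification, since two-letter atoms such as Cl would be split into two tokens.

```python
from typing import Dict, List


def build_vocab(samples: List[str]) -> Dict[str, int]:
    """Map each distinct character to a positive integer; 0 is reserved for padding."""
    chars = sorted(set("".join(samples)))
    return {ch: idx + 1 for idx, ch in enumerate(chars)}


def encode(text: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Encode a string as integer labels, truncated or zero-padded to max_len."""
    ids = [vocab.get(ch, 0) for ch in text[:max_len]]  # unknown characters map to 0
    return ids + [0] * (max_len - len(ids))


# Hypothetical toy data: the SMILES string of aspirin and a short peptide fragment.
smiles = ["CC(=O)OC1=CC=CC=C1C(=O)O"]
proteins = ["MKTAYIAKQR"]

smiles_vocab = build_vocab(smiles)
protein_vocab = build_vocab(proteins)

# Assumed maximum lengths; real datasets would need much larger values.
drug_vec = encode(smiles[0], smiles_vocab, max_len=32)
protein_vec = encode(proteins[0], protein_vocab, max_len=16)
```

Because this encoding needs no affinity label, the same function can be applied to labeled and unlabeled molecules alike, which is what allows an unsupervised feature extractor to train on both.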

MATERIALS AND METHODS
Proposed Method
EXPERIMENTS AND RESULTS
Method
CONCLUSION