Abstract

Identifying drug-protein interactions plays an important role in drug discovery. Accurate computational methods address the shortcomings of earlier approaches, which were expensive and time-consuming. This article proposes a new model for drug-protein interactions and a new mapping approach for representing drug and protein sequences. The proposed model consists of four parts: a drug and protein descriptor section, Drug CNN and Protein CNN sections, an encoder section, and a classification section. First, the data is prepared; at this stage, the data is balanced so that the totals are equal. Next, the k-mers method and the Chaos Game are used to convert each drug and protein sequence into an image. These images are then used to train the CNN models: they serve as the inputs to two independent networks, one for the drug and one for the protein, which extract features from the drug and protein sequences. In the last layer of these networks, the features extracted from the drug and protein sequences are concatenated, which increases the number of features. To reduce the number of features and to extract more informative ones, a Variational Autoencoder is used. In the last step, the resulting combined feature vector is used to train machine learning models. The proposed method was tested and evaluated on six standard data sets. The experimental results show that the proposed method achieves acceptable performance compared to other methods on these data sets.
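
The sequence-to-image step can be illustrated with a minimal sketch of the k-mers method and a Chaos Game mapping. This is not the authors' implementation: the function names `kmers` and `cgr_image`, the placement of symbols on the vertices of a regular polygon, the contraction ratio of 0.5, and the 16 x 16 grid size are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def kmers(sequence, k):
    """Return the overlapping substrings of length k (the k-mers)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def cgr_image(sequence, alphabet, size=16, ratio=0.5):
    """Chaos Game sketch: return a size x size grid of visit counts.

    Assumption: each alphabet symbol sits on a vertex of a regular
    polygon inscribed in the unit circle (hypothetical layout).
    """
    n = len(alphabet)
    vertices = {
        symbol: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, symbol in enumerate(alphabet)
    }
    grid = [[0] * size for _ in range(size)]
    x, y = 0.0, 0.0  # start at the centre of the polygon
    for symbol in sequence:
        if symbol not in vertices:
            continue  # skip symbols outside the assumed alphabet
        vx, vy = vertices[symbol]
        # Move a fixed fraction of the way toward the symbol's vertex.
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        # Map coordinates in [-1, 1] to pixel indices in [0, size - 1].
        col = max(0, min(size - 1, int((x + 1) / 2 * size)))
        row = max(0, min(size - 1, int((y + 1) / 2 * size)))
        grid[row][col] += 1
    return grid


# Hypothetical protein fragment over the 20 standard amino acid letters.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
fragment = "MKTAYIAKQR"
three_mers = kmers(fragment, 3)
image = cgr_image(fragment, AMINO_ACIDS)
```

Because each point depends on the symbols that preceded it, with recent symbols weighted most heavily, nearby pixels tend to correspond to sequences ending in similar k-mers. The resulting grid of counts could then be passed to a CNN as a single-channel image.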
