Abstract

Purpose: The emergence of new viruses and, consequently, new diseases make the rapid and precise design of new drugs increasingly necessary. With the availability of large databases of proteins and affinity measures, it is possible to build scoring functions for predicting molecular affinity. These functions are fundamental to intelligent drug design. Objective: In this work, we propose a scoring function to predict affinity between two proteins. The method is based on extracting features by transfer learning on sequences represented on pseudo-convolutions. Method: The pseudo-convolutions organize the sequences into base neighborhood distributions. Each distribution is then represented by an image. Two proteins are then transformed into two images that are concatenated together, forming the third image. Through deep transfer learning, this resulting image is then represented by a vector of attributes, which have dimensionality reduced by Random Forest. Finally, the vector of attributes reduced is applied to a regression learning machine that returns the degree of affinity of the two proteins. Results: We used the Affinity Benchmark Version 2 database. 145 complexes were used for model training and 35 for testing. The results showed a performance equal to or better than the state-of-the-art methods of evaluating protein affinity, considering the correlation coefficients of Pearson, Spearman and Kendall. The best results were 0.66, 0.70, and 0.52. Conclusion: The proposed method can characterize protein sequences so that the binding affinity between two proteins can be estimated without simulating the three-dimensional structure of the complex.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call