Abstract

Predicting drug–target interactions (DTIs) is an important problem in new drug discovery and drug repositioning. However, because datasets in this field are highly class-imbalanced and high-dimensional, designing effective predictive methods is challenging. In this study, we model DTI prediction as a binary classification problem. First, the original positive and negative sample sets are constructed from the LINCS and DrugBank databases. The CGAN model is then applied to over-sample the original positive samples so that the positive and negative classes are balanced. The resulting class-balanced samples are used to train the classifier, and finally the trained classifier is applied to samples of unknown class to predict DTIs. The experiments first demonstrate the necessity of over-sampling. Comparisons of different over-samplers then show that the CGAN over-sampler has clear advantages over traditional samplers. Therefore, for high-dimensional and class-imbalanced datasets, CGAN over-sampling is more applicable to DTI prediction than traditional methods.
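
The balancing step described above can be illustrated with a minimal sketch. It does not implement the CGAN over-sampler used in this study. As a stand-in, it uses plain random over-sampling with replacement, one of the traditional samplers, only to show where over-sampling sits before classifier training. The function name `oversample_positives` and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import random


def oversample_positives(samples, labels, seed=0):
    """Balance a binary dataset by duplicating positive samples at random.

    Assumption: label 1 marks the minority (positive) class and label 0
    the majority (negative) class. This is random over-sampling, a simple
    stand-in for the CGAN generator, not the authors' method.
    """
    rng = random.Random(seed)
    positives = [x for x, y in zip(samples, labels) if y == 1]
    negatives = [x for x, y in zip(samples, labels) if y == 0]
    if not positives:
        raise ValueError("need at least one positive sample to over-sample")

    # Number of extra positives needed to match the negative class size.
    deficit = max(0, len(negatives) - len(positives))
    extra = [list(rng.choice(positives)) for _ in range(deficit)]

    balanced_x = negatives + positives + extra
    balanced_y = [0] * len(negatives) + [1] * (len(positives) + len(extra))
    return balanced_x, balanced_y


# Toy data: four negatives and one positive (hypothetical feature vectors).
toy_x = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.5, 0.6], [0.9, 0.8]]
toy_y = [0, 0, 0, 0, 1]

bal_x, bal_y = oversample_positives(toy_x, toy_y)
print(bal_y.count(0), bal_y.count(1))
```

In this sketch the balanced set would then be passed to whatever classifier is chosen. A generative over-sampler such as CGAN would replace the duplication step with newly generated positive samples.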
