Transfer inhibitory potency prediction to binary classification: A model only needs a small training set

Haowen Dou, Jie Tan, Huiling Wei, Fei Wang, Jinzhu Yang, X.-G Ma, Jiaqi Wang, Teng Zhou

doi:10.1016/j.cmpb.2022.106633

Abstract

One of the most laborious steps in drug discovery is selecting compounds from a library for experimental evaluation. Hence, we propose a machine learning model that only needs to be trained on a small dataset to predict the inhibition constant (Ki) and half maximal inhibitory concentration (IC50) of a compound. We recast the prediction task as a simpler binary classification task, based on a naive but effective idea: we only need the relative rank of a compound to decide whether to take it forward for further examination. To achieve this, we design a data augmentation strategy that effectively leverages the relationships between the compounds in the training set. We then formulate a new reward function for deep reinforcement learning that balances feature selection against accuracy. We employ a particle swarm optimized support vector machine for the binary classification task. Finally, a soft voting mechanism is introduced to resolve contradictions among the binary classification results. Extensive experiments show that our model achieves high and reliable accuracy and is capable of ranking compounds based on a selected set of molecular descriptors. The current results show that our model provides a potential ligand-based in silico approach for prioritizing chemicals for experimental studies.
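The pairwise idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (make_pairs, soft_vote, toy_classifier), the descriptor-difference encoding of a pair, and the averaging rule are all assumptions made for illustration, and a fixed logistic function stands in for the particle swarm optimized support vector machine. The sketch shows how n labelled compounds can yield n(n-1) ordered training pairs, and how averaging pairwise probabilities might give one ranking score per candidate even when individual comparisons disagree.

```python
from itertools import permutations
from math import exp


def make_pairs(features, potencies):
    """Expand n labelled compounds into n*(n-1) ordered training pairs.

    Assumption: a pair (i, j) is encoded as the descriptor difference
    x_i - x_j, with label 1 if compound i has the lower (more potent)
    Ki/IC50 value and 0 otherwise.
    """
    pairs, labels = [], []
    for i, j in permutations(range(len(features)), 2):
        diff = [a - b for a, b in zip(features[i], features[j])]
        pairs.append(diff)
        labels.append(1 if potencies[i] < potencies[j] else 0)
    return pairs, labels


def soft_vote(query, references, prob_more_potent):
    """Average the probability that `query` beats each reference compound.

    Averaging probabilities (rather than counting hard wins) lets
    contradictory pairwise decisions cancel out into a single score.
    """
    probs = []
    for ref in references:
        diff = [a - b for a, b in zip(query, ref)]
        probs.append(prob_more_potent(diff))
    return sum(probs) / len(probs)


def toy_classifier(diff):
    # Hypothetical stand-in for a trained classifier: a fixed logistic
    # function that favours a lower value of the first descriptor.
    return 1.0 / (1.0 + exp(diff[0]))
```

Under these assumptions, candidates would be ranked by calling soft_vote for each one against the training compounds and sorting by the resulting score; any classifier that returns a probability for a descriptor difference could replace toy_classifier.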

Full Text