Ensuring the safety and efficacy of chemical compounds is crucial in small-molecule drug development. Toxic compounds that are identified only in the later stages of development pose a significant challenge, wasting valuable resources and time. Early and accurate prediction of compound toxicity using deep learning models offers a promising way to mitigate these risks during drug discovery. In this study, we present several deep learning models for evaluating different types of compound toxicity: acute toxicity, carcinogenicity, hERG_cardiotoxicity (cardiotoxicity associated with the human ether-a-go-go-related gene), hepatotoxicity, and mutagenicity. To address the inherent variations in data size, label type, and distribution across these toxicity types, we employed diverse training strategies. First, we used a graph convolutional network (GCN) regression model to predict acute toxicity, which achieved Pearson correlation coefficients (R) of 0.76, 0.74, and 0.65 for the intraperitoneal, intravenous, and oral administration routes, respectively. We then trained multiple GCN binary classification models, each tailored to a specific type of toxicity. These models achieved area under the curve (AUC) scores of 0.69, 0.77, 0.88, and 0.79 for predicting carcinogenicity, hERG_cardiotoxicity, mutagenicity, and hepatotoxicity, respectively. Additionally, we used an approved drug dataset to determine an appropriate threshold for the prediction score when the models are applied. We integrated these models into a virtual screening pipeline to assess their effectiveness in identifying potential low-toxicity drug candidates. Our findings indicate that this deep learning approach has the potential to significantly reduce the cost and risk of drug development by expediting the selection of compounds with low toxicity profiles. The models developed in this study therefore hold promise as tools for early drug candidate screening and selection.
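The abstract does not state how the threshold was derived from the approved drug dataset. As one possible illustration only, the sketch below assumes a percentile rule: choose the cutoff so that at most a chosen fraction of approved-drug prediction scores would be flagged as toxic. The function names, the `max_flagged_fraction` parameter, and the example scores are all hypothetical and are not taken from the study.

```python
import math


def threshold_from_reference(scores, max_flagged_fraction=0.05):
    """Pick a cutoff from reference (e.g. approved-drug) prediction scores.

    Assumption for illustration: a compound is flagged when its score is
    strictly greater than the cutoff, and at most `max_flagged_fraction`
    of the reference compounds may be flagged.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 <= max_flagged_fraction < 1.0:
        raise ValueError("max_flagged_fraction must be in [0, 1)")

    ordered = sorted(scores)
    n = len(ordered)
    # Number of reference compounds allowed to score above the cutoff.
    allowed = math.floor(max_flagged_fraction * n)
    # The cutoff is the highest score that is not allowed to be flagged.
    return ordered[n - allowed - 1]


def flag_toxic(scores, threshold):
    """Return True for every score strictly above the threshold."""
    return [score > threshold for score in scores]


# Hypothetical prediction scores for eight approved drugs.
approved_scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.70, 0.90]
cutoff = threshold_from_reference(approved_scores, max_flagged_fraction=0.25)
flags = flag_toxic(approved_scores, cutoff)
```

A stricter or looser `max_flagged_fraction` trades off how many approved drugs would be rejected against how many toxic candidates would pass the screen; the appropriate value would depend on the toxicity endpoint.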