Abstract

Molecular toxicity is a critical consideration in drug development, so computational models that evaluate the toxicity of small molecules are important. The accuracy of toxicity prediction depends largely on the quality of the molecular representation; however, current methods do not address this issue well. Here, we introduce a new metric, gap-Δenergy, which is designed to quantify the difference in intermolecular bond energy with atom distance. We find significant variations in the gap-Δenergy distribution among different types of molecules, and we show that this metric is able to distinguish toxic small molecules. We collected data sets of toxic and exogenous small molecules and propose a novel index, global toxicity, to evaluate the overall toxicity of molecules. Based on molecular descriptors and the proposed gap-Δenergy metric, we further constructed machine learning models trained on 7816 small molecules. The XGBoost-based model achieved the best performance, with an AUC score of 0.965 and an F1 score of 0.849 on the test set (1954 small molecules). It outperformed the model that did not use gap-Δenergy features, with a sensitivity score increase of 3.2%.
