Abstract

Chemical respiratory toxicity can cause serious harm to the human body, so it is necessary to identify drugs or compounds with potential respiratory toxicity at an early stage of drug discovery. In this study, we collected 2,529 compounds from public databases and the literature, and used six machine learning methods together with nine types of molecular fingerprints to construct a series of binary classification models for predicting chemical respiratory toxicity. The best-performing model achieved an accuracy of 0.869 on the test set and 0.933 on the external validation set. We also defined the applicability domain of the models based on molecular similarity. In addition, we identified structural alerts for chemical respiratory toxicity through information gain and substructure frequency analysis; these alerts could help elucidate toxicity mechanisms and guide structural optimization toward less toxic compounds. Our study should be helpful for predicting chemical respiratory toxicity in the early stages of drug discovery and in environmental risk assessment.
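
The information gain criterion mentioned above can be illustrated with a minimal sketch. It assumes each compound is described by a binary flag (substructure present or absent) and a binary toxicity label; the function names and the toy data are hypothetical and are not taken from the study, whose exact procedure is not given in this abstract.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        h -= p * math.log2(p)
    return h


def information_gain(has_substructure, labels):
    """Reduction in label entropy from splitting on substructure presence.

    has_substructure: 0/1 flags, one per compound (hypothetical input format).
    labels: 0/1 toxicity labels, one per compound.
    """
    n = len(labels)
    present = [y for x, y in zip(has_substructure, labels) if x]
    absent = [y for x, y in zip(has_substructure, labels) if not x]
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


# Toy data (illustrative only): four compounds, two of them toxic.
toxic = [1, 1, 0, 0]
perfect_alert = [1, 1, 0, 0]   # present exactly in the toxic compounds
useless_alert = [1, 0, 1, 0]   # present equally in both classes

print(information_gain(perfect_alert, toxic))
print(information_gain(useless_alert, toxic))
```

A substructure that separates toxic from non-toxic compounds well has a high information gain, while one that is equally frequent in both classes has a gain near zero. Combining this score with the substructure's frequency in the toxic class is one plausible way to rank candidate alerts.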
