Abstract

The accurate identification of chemicals with ocular toxicity is of paramount importance in health hazard assessment. In contemporary chemical toxicology, there is a growing emphasis on refining, reducing, and replacing animal testing in safety evaluations. Therefore, the development of robust computational tools is crucial for regulatory applications. The performance of predictive models is heavily reliant on the quality and quantity of data. In this investigation, we amalgamated the most extensive dataset (4901 compounds) sourced from governmental GHS-compliant databases and literature to develop binary classification models of chemical ocular toxicity. We employed 12 molecular representations in conjunction with six machine learning algorithms and two deep learning algorithms to create a series of binary classification models. The findings indicated that the deep learning method GCN outperformed the machine learning models in cross-validation, achieving an impressive AUC of 0.915. However, the top-performing machine learning model (RF-Descriptor) demonstrated excellent performance with an AUC of 0.869 on the test set and was therefore selected as the best model. To enhance model interpretability, we conducted the SHAP method and attention weights analysis. The two approaches offered visual depictions of the relevance of key descriptors and substructures in predicting ocular toxicity of chemicals. Thus, we successfully struck a delicate balance between data quality and model interpretability, rendering our model valuable for predicting and comprehending potential ocular-toxic compounds in the early stages of drug discovery.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call