Abstract

Fish are among the model animals used to evaluate the adverse effects of chemicals released into the ecosystem. However, the low throughput and relatively high expense of fish toxicity testing make it impossible to test all newly manufactured chemicals. Hence, in silico models are widely used to prioritize compounds for testing in environmental risk assessment and drug discovery. In this study, we constructed local predictive models for four fish species (bluegill sunfish, rainbow trout, fathead minnow, and sheepshead minnow) and global models trained on the combined data for all four species. A total of 1874 unique compounds, each labeled as toxic (LC50 < 10 ppm) or nontoxic, were collected from ECOTOX and the literature. Both conventional machine learning methods and a deep learning architecture, the graph convolutional network (GCN), were used to build the predictive models. The classification accuracy of the best local model for each fish species was higher than 0.83. For the global models, two strategies, consistency prediction and probability threshold, were adopted to improve predictive capability at the cost of a narrower applicability domain. For the 63% of compounds that fell within the domain, the accuracy was around 0.97. Comparing the deep learning and conventional machine learning methods, we found that the single-task GCN showed specific advantages in performance, whereas the multitask GCN showed no advantage over the conventional machine learning methods. The data and models are available on GitHub (https://github.com/ChemPredict/ChemicalAquaticToxicity).
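
The labeling rule and the two applicability-domain strategies mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the strategies in detail, so the sketch assumes that "consistency prediction" means all models must agree on the class and that "probability threshold" means the averaged class probability must reach a confidence cutoff. The function names and the cutoff value of 0.8 are hypothetical; only the 10 ppm LC50 cutoff comes from the abstract.

```python
from typing import Optional, Sequence


def label_from_lc50(lc50_ppm: float, cutoff_ppm: float = 10.0) -> int:
    """Return 1 (toxic) if LC50 is below the cutoff, else 0 (nontoxic).

    The 10 ppm cutoff is the one stated in the abstract.
    """
    return 1 if lc50_ppm < cutoff_ppm else 0


def consensus_predict(
    toxic_probs: Sequence[float],
    confidence_cutoff: float = 0.8,  # hypothetical value, not from the paper
) -> Optional[int]:
    """Combine per-model toxic-class probabilities for one compound.

    Returns 1 or 0 when the compound is treated as in domain, and None
    when it is rejected as out of domain. Both rules are assumptions
    about what the two strategies might look like.
    """
    if not toxic_probs:
        raise ValueError("at least one model probability is required")

    # Assumed consistency rule: every model must vote for the same class.
    votes = {1 if p >= 0.5 else 0 for p in toxic_probs}
    if len(votes) != 1:
        return None

    # Assumed probability-threshold rule: the averaged probability of the
    # predicted class must reach the confidence cutoff.
    mean_p = sum(toxic_probs) / len(toxic_probs)
    confidence = max(mean_p, 1.0 - mean_p)
    if confidence < confidence_cutoff:
        return None

    return votes.pop()
```

Under these assumptions, a compound on which the models disagree, or agree only weakly, receives no prediction. This mirrors the trade-off the abstract describes: higher accuracy on a smaller share of compounds.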
