Generating accurate in silico predictions of acute aquatic toxicity for a range of organic chemicals: Towards similarity-based machine learning methods

Agnieszka Gajewicz-Skretna, Ayako Furuhama, Hiroshi Yamamoto, Noriyuki Suzuki

doi:10.1016/j.chemosphere.2021.130681

Open Access

Abstract

The use of non-animal approaches, such as in silico and/or in vitro methods, for assessing the risks of hazardous chemicals has increased. A number of machine learning algorithms link molecular descriptors, which encode chemical structural properties, with biological activity. These computer-aided methods encounter several challenges, the most significant being the heterogeneity of datasets; more efficient and inclusive computational methods that are able to process large and heterogeneous chemical datasets are needed. In this context, this study verifies the utility of similarity-based machine learning methods in predicting the acute aquatic toxicity of diverse organic chemicals to Daphnia magna and Oryzias latipes. Two similarity-based methods were tested; each fits a model on a limited training subset made up of the chemicals most similar to a given fitting point, instead of using the entire dataset that encompasses a wide range of chemicals. The kernel-weighted local polynomial approach had a number of advantages over the distance-weighted k-nearest neighbor (k-NN) algorithm. The results highlight the importance of lipophilicity, electrophilic reactivity, molecular polarizability, and size in determining acute toxicity. Rigorous model validation supports the use of this approach as a tool for estimating the toxicity of new or untested chemicals.
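The two similarity-based ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean distance, the inverse-distance weights, the tricube kernel, the bandwidth rule (distance to the k-th neighbor), and the choice of a degree-1 local polynomial are all assumptions made for illustration, and the descriptors and "toxicity" values are synthetic. Both functions predict from only the k training chemicals closest to the query point.

```python
import numpy as np


def nearest_neighbors(X_train, x_query, k):
    """Indices and Euclidean distances of the k training points closest to x_query."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(dists)[:k]
    return idx, dists[idx]


def knn_distance_weighted(X_train, y_train, x_query, k=5, eps=1e-12):
    """Distance-weighted k-NN: inverse-distance weighted mean of the neighbors' responses.

    Inverse-distance weighting is an assumed weighting scheme, not taken from the paper.
    """
    idx, d = nearest_neighbors(X_train, x_query, k)
    w = 1.0 / (d + eps)  # closer neighbors get larger weights
    return float(np.sum(w * y_train[idx]) / np.sum(w))


def local_linear_kernel(X_train, y_train, x_query, k=10):
    """Kernel-weighted local polynomial fit of degree 1 around x_query.

    The tricube kernel and the bandwidth rule are assumptions for this sketch.
    """
    idx, d = nearest_neighbors(X_train, x_query, k)
    h = d.max() + 1e-12               # bandwidth: distance to the k-th neighbor
    w = (1.0 - (d / h) ** 3) ** 3     # tricube kernel weights in [0, 1]
    # Center the descriptors on the query so the intercept is the prediction there.
    A = np.hstack([np.ones((len(idx), 1)), X_train[idx] - x_query])
    sw = np.sqrt(w)
    # Weighted least squares via ordinary least squares on sqrt-weighted rows.
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y_train[idx] * sw, rcond=None)
    return float(beta[0])


# Synthetic example: two hypothetical descriptors and a linear "toxicity" response.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = 2.0 * X[:, 0] - X[:, 1] + 0.5
x_q = np.array([0.4, 0.6])

print("k-NN estimate:        ", knn_distance_weighted(X, y, x_q, k=5))
print("local linear estimate:", local_linear_kernel(X, y, x_q, k=10))
```

The k-NN estimate is a weighted average, so it can never leave the range of the neighbors' responses, whereas the local fit can follow a trend across the neighborhood. That difference is one plausible reason a local polynomial might outperform k-NN, though the abstract does not specify which advantages were observed.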
