Abstract

One step towards reduced animal testing is the use of in silico screening methods to predict the toxicity of chemicals, which requires high-quality data to develop models that are reliable and clearly interpretable. We compiled a large data set of fish early life stage no observed effect concentration (FELS NOEC) endpoints from published data sources and internal studies, containing data for 338 molecules. Furthermore, we developed a new quantitative structure-activity-activity relationship (QSAAR) model to inform estimation of this endpoint using a combination of dimensionality reduction, regularization, and domain knowledge. In particular, we used a sparse partial least squares (sPLS) algorithm to select relevant variables from a large number of molecular descriptors ranging from topological to quantum chemical properties. The final QSAAR model is of low complexity, consisting of 2 latent variables based on 8 molecular descriptors and experimental Daphnia magna acute data (EC50, 48 h). We provide a mechanistic interpretation of each model parameter. The model performs well, with a coefficient of determination r² of 0.723 on the training set (cross-validated q² = 0.686) and comparable predictivity on a test data set of chemically related molecules with experimental Daphnia magna data (test-set r² = 0.687, RMSE = 0.793 log units).
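The performance statistics quoted above (r², cross-validated q², and RMSE in log units) can be illustrated with a minimal sketch. It assumes the common definitions r² = 1 - SS_res/SS_tot and q² = 1 - PRESS/SS_tot with leave-one-out predictions; the abstract does not state which cross-validation scheme was used. A single-predictor least-squares line stands in for the sPLS model, and all data values and names (`log_daphnia_ec50`, `log_fels_noec`, `fit_line`, `loo_predictions`) are hypothetical and not taken from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the units of the response."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_predictions(x, y):
    """Predict each point from a model fitted without that point."""
    preds = []
    for i in range(len(x)):
        intercept, slope = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(intercept + slope * x[i])
    return preds


# Hypothetical illustrative values in log10 units, NOT from the paper's data set.
log_daphnia_ec50 = [0.2, 0.9, 1.5, 2.1, 2.8, 3.4]
log_fels_noec = [-0.9, -0.1, 0.2, 1.1, 1.4, 2.3]

intercept, slope = fit_line(log_daphnia_ec50, log_fels_noec)
fitted = [intercept + slope * x for x in log_daphnia_ec50]

r2 = r_squared(log_fels_noec, fitted)  # goodness of fit on the training data
q2 = r_squared(log_fels_noec, loo_predictions(log_daphnia_ec50, log_fels_noec))
error = rmse(log_fels_noec, fitted)    # in log units

print(f"r2 = {r2:.3f}, q2 = {q2:.3f}, RMSE = {error:.3f} log units")
```

For a least-squares fit, q² cannot exceed r², so a small gap between the two, as reported in the abstract, suggests that the model is not strongly overfitted to the training set.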
