Abstract

Substances that can modify the androgen receptor (AR) pathway in humans and animals are entering the environment and food chain. They have a proven ability to disrupt hormonal systems, leading to toxicity and adverse effects involving reproduction, brain development, and prostate cancer, among others. State-of-the-art databases of experimental data on the effects of chemicals in humans, chimpanzees, and rats were used to build machine-learning classifiers and regressors and to evaluate them on independent sets. Different featurizations, algorithms, and protein structures lead to different results. Deep neural networks (DNNs) trained on user-defined, physicochemically relevant features developed for this work outperformed graph convolutional models, random forests, and large featurizations. These user-provided structure-, ligand-, and statistically based features, combined with specific DNNs, gave the best results as judged by AUC (0.87), MCC (0.47), and other metrics, and by the interpretability and chemical meaning of the descriptors/features. In addition, the same features performed better in the DNN method than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work, compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve the description and prediction of AR activity and toxicity, and thereby the assessment and design of compounds. Source code and data are available on GitHub.
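
For readers unfamiliar with the MCC values quoted above, the standard definition of the Matthews correlation coefficient for a binary classifier is given below. The symbols TP, TN, FP, and FN (true/false positives and negatives) are the usual confusion-matrix counts and are not notation taken from this abstract.

```latex
\mathrm{MCC} =
  \frac{TP \cdot TN - FP \cdot FN}
       {\sqrt{(TP + FP)\,(TP + FN)\,(TN + FP)\,(TN + FN)}}
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which is why it is often preferred over accuracy on unbalanced sets.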

Highlights

  • Concerns are rising over endocrine disruptors entering the environment and food chain [1,2,3,4]

  • With respect to well-known compounds that are frequently misclassified [20], the results provided here (Table 3) show that four out of 11 compounds were correctly predicted, compared to three out of 11 reported elsewhere [20], the difference being the correct prediction of finasteride [20]

  • Unbalanced datasets can be transformed into balanced sets by unbiased dropping, and the balanced sets perform better in training and evaluation metrics
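
The balancing step in the last highlight can be sketched as follows. This is a minimal illustration that assumes "unbiased dropping" means uniform random undersampling of the larger class; the text above does not spell out the procedure, and the function name `undersample`, the `label_of` parameter, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def undersample(records, label_of, seed=0):
    """Return a class-balanced subset by uniform random dropping.

    Assumption: every class is cut down to the size of the smallest
    class, and each record within a class is equally likely to be kept.
    """
    rng = random.Random(seed)

    # Group the records by their class label.
    by_class = defaultdict(list)
    for record in records:
        by_class[label_of(record)].append(record)

    # The smallest class sets the number of records kept per class.
    n_keep = min(len(items) for items in by_class.values())

    balanced = []
    for items in by_class.values():
        # rng.sample draws without replacement, uniformly at random.
        balanced.extend(rng.sample(items, n_keep))

    rng.shuffle(balanced)
    return balanced


# Toy data (hypothetical): 10 active compounds (label 1), 90 inactive (label 0).
data = [("c%d" % i, 1) for i in range(10)] + [("c%d" % i, 0) for i in range(10, 100)]
balanced = undersample(data, label_of=lambda record: record[1])
```

Fixing the seed keeps the dropped subset reproducible, so training and evaluation metrics can be compared across runs on the same balanced set.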


Summary

Introduction

Concerns are rising over endocrine disruptors entering the environment and food chain [1,2,3,4]. Androgen receptor (AR) pathway modulators are compounds that can affect tumors and reproductive systems [1,2,3,4]. The CoMPARA challenge was a collaborative modeling effort to predict possible AR modulators based on a wide collection of state-of-the-art experimental data [6]. Toxicity modeling of compounds is important in several ways: compounds used in pharmaceutical and industrial applications need to be assessed for possible adverse effects on humans and other organisms, and toxicity is an important barrier in the development of new drugs and other useful compounds [1,2,3,4]. Difficulties include the lack of experimental tests, including tests of chronic and varied exposure, as well as of the metabolites of compounds [1,2,3,4].
