Abstract

Metal-oxide nanoparticles are widely used in everyday life, and cost-effective evaluation of their cytotoxicity and ecotoxicity is essential for sustainable progress. Machine learning models use existing experimental data to learn quantitative feature–toxicity relationships and yield predictive models. In this work, we adopted a principled approach to this problem by formulating a novel feature space based on intrinsic and extrinsic physicochemical properties, including periodic table properties but excluding in vitro characteristics such as cell line, cell type, and assay method. An optimal hypothesis space was developed by applying variance inflation analysis to the correlation structure of the features. Following a stratified train-test split, the training dataset was balanced for the toxic outcomes, and a mapping was then learned from the normalized feature space to the toxicity class using various hyperparameter-tuned machine learning models, namely logistic regression, random forest, support vector machines, and neural networks. Evaluation on an unseen test set yielded >96% balanced accuracy for the random forest and one-hidden-layer neural network models. The resulting cytotoxicity models are parsimonious, with intelligible inputs and an embedded applicability check. Interpretability investigations of the models identified the key predictor variables of metal-oxide nanoparticle cytotoxicity. Our models can be applied to new, untested oxides using a majority-voting ensemble classifier, NanoTox, that incorporates the best of the above models. NanoTox is the first open-source nanotoxicology pipeline, freely available under the GNU General Public License (https://github.com/NanoTox).
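
To make two of the terms above concrete, the following is a minimal sketch of majority voting over several classifiers and of balanced accuracy (the mean of per-class recall). It is not the NanoTox implementation: the function names `majority_vote` and `balanced_accuracy`, the tie-breaking rule, and the toy labels (0 = non-toxic, 1 = toxic) are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    predictions: list of lists, one inner list of labels per model.
    Ties are broken toward the smallest label (an arbitrary assumption).
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")
    combined = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        combined.append(min(label for label, c in counts.items() if c == top))
    return combined


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, which is robust to class imbalance."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical labels and model outputs, invented for this example only.
y_true = [0, 0, 0, 0, 1, 1]
model_a = [0, 0, 0, 1, 1, 1]
model_b = [0, 0, 1, 0, 1, 0]
model_c = [0, 1, 0, 0, 1, 1]

ensemble = majority_vote([model_a, model_b, model_c])
score = balanced_accuracy(y_true, ensemble)
```

In this toy case each model errs on a different sample, so the vote corrects every individual error. Real classifiers tend to make correlated errors, so an ensemble will not generally improve on its members to this degree.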
