Predicting and investigating cytotoxicity of nanoparticles by translucent machine learning

Hengjie Yu, Zhilin Zhao, Fang Cheng

doi:10.1016/j.chemosphere.2021.130164

Abstract

Safety concerns about engineered nanoparticles (ENPs) hamper their application and commercialization in many promising fields. Machine learning has proven to be a powerful tool for understanding the complex ENP-organism-environment relationship. However, well-performing machine learning models usually operate as black boxes, which makes them difficult to trust, and the way they express knowledge rarely maps directly onto forms familiar to scientists. Here, we present an approach for uncovering causal structure in nanotoxicity datasets using mutually validated, model-agnostic interpretation methods. Model predictions can be explained in terms of feature importance, feature effects, and feature interactions. The utility of this approach is demonstrated through two case studies: the cytotoxicity of cadmium-containing quantum dots and of metal oxide nanoparticles. Further, these case studies indicate the efficacy and impact of the approach at two levels: (i) model interpretation, where the features most relevant to cytotoxicity are identified and their influence on model predictions and their interactions with other features are then explained, and (ii) model validation, where differences among the interpretation results of different methods (or between interpretation results and well-known toxicity mechanisms) may reflect inherent problems in the dataset used (or in the developed models). Our approach of integrating machine learning models with interpretation methods provides a roadmap for predicting the toxicity of ENPs in a translucent way.
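
As an illustration of what a model-agnostic feature importance measure can look like, the following minimal sketch implements permutation importance in plain Python. It is not taken from the paper: the abstract does not name the specific interpretation methods used, so permutation importance is chosen here only as a common example of the idea. The function names, the toy model, and the feature columns (dose, diameter, noise) are all hypothetical.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled.

    `predict` is any callable mapping one row to a prediction, so the
    procedure never looks inside the model (model-agnostic).
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j to break its link with the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            error = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical stand-in for a trained model: a "viability" score that
# depends strongly on dose, weakly on diameter, and not at all on noise.
def toy_model(row):
    dose, diameter, _noise = row
    return 100.0 - 8.0 * dose - 0.5 * diameter


# Synthetic data for illustration only; columns are [dose, diameter, noise].
data_rng = random.Random(42)
X = [
    [data_rng.uniform(0, 10), data_rng.uniform(1, 20), data_rng.uniform(0, 1)]
    for _ in range(200)
]
y = [toy_model(row) for row in X]

scores = permutation_importance(toy_model, X, y)
for name, score in zip(["dose", "diameter", "noise"], scores):
    print(f"{name}: {score:.3f}")
```

In this toy setup, shuffling the feature the model ignores leaves the error unchanged, while shuffling the dominant feature increases it the most. Comparing such a ranking against a second, independent interpretation method is one way to read the mutual validation described in the abstract.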

Full Text