Machine learning (ML) is showing great potential for predicting the toxicity of microplastics (MPs). In this study, six ML algorithms were used to build quantitative structure-activity relationship (QSAR) models that predict the toxicity of MPs to BEAS-2B cells. Among the six models, the extreme gradient boosting model showed the best fit and predictive performance (R² = 0.9876 on the training set, R² = 0.9286 on the test set). Williams plot analysis showed that all six models predicted stably within their applicability domains, with few outliers. Finally, three feature importance methods, Embedded Feature Importance (EFI), Recursive Feature Elimination (RFE), and SHapley Additive exPlanations (SHAP), consistently identified particle size as the most critical feature affecting toxicity prediction. The proposed QSAR model can be used for preliminary environmental exposure assessments of MPs and to better understand the associated health risks.
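The Williams plot check mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common QSAR convention in which a compound's leverage is h_i = x_i^T (X^T X)^{-1} x_i, the warning leverage is h* = 3(p + 1)/n for p descriptors and n training compounds, and a prediction counts as an outlier when its standardized residual exceeds ±3. The function names `williams_stats` and `in_domain`, and the choice to standardize residuals by their sample standard deviation, are assumptions made for illustration.

```python
import numpy as np


def williams_stats(X_train, X_query, y_true, y_pred):
    """Leverages, standardized residuals, and warning leverage h*.

    X_train: (n, p) descriptor matrix used to fit the model.
    X_query: (m, p) descriptors of the compounds being checked.
    y_true, y_pred: length-m observed and predicted toxicity values.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    n, p = X_train.shape

    # pinv is used instead of inv so a rank-deficient X^T X does not raise.
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)

    # h_i = x_i^T (X^T X)^-1 x_i, computed row by row for X_query.
    leverage = np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)

    # Assumed convention: residuals scaled by their sample standard deviation.
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    std_res = residuals / residuals.std(ddof=1)

    # Warning leverage, conventionally 3(p + 1)/n.
    h_star = 3.0 * (p + 1) / n
    return leverage, std_res, h_star


def in_domain(leverage, std_res, h_star, res_limit=3.0):
    """True where a compound lies inside the applicability domain."""
    return (leverage <= h_star) & (np.abs(std_res) <= res_limit)
```

Plotting `std_res` against `leverage`, with a vertical line at `h_star` and horizontal lines at ±3, gives the Williams plot; points outside those bounds are the outliers the abstract refers to.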