Advanced suspect and non-target screening (SNTS) approaches can identify a large number of potentially hazardous micropollutants in groundwater, underscoring the need to pinpoint priority pollutants among the detected chemicals. The present study therefore demonstrates a novel multi-criteria decision-making (MCDM) framework that couples machine learning (ML) algorithms with the toxicological prioritization index tool (i.e., ml_ToxPi) to rank 251 chemicals of interest in groundwater for subsequent targeted analysis. The MCDM framework integrated chemical analysis data (i.e., peak area and detection frequency), toxicity profiles (i.e., bioactivity ratio, human exposure metadata, and carcinogenicity metadata), and environmental fate and transport information (i.e., octanol-water partition coefficient (log Kow), water solubility, biodegradation half-life, and soil adsorption coefficient (Koc)) to rank the identified pollutants. The random forest ML model proved useful for systematically determining the weighting factor of each variable according to its variable importance score (R² = 0.808 and 0.778 for the training and testing datasets, respectively; RMSE = 0.042 in both cases). A total of 47 unique high-priority compounds (i.e., ml_ToxPi score ≥ 0.55) were identified across the investigated sampling regions. These comprised diverse groups of compounds classified according to their chemical uses, such as alkylated polycyclic aromatic hydrocarbons (alkyl-PAHs), organophosphate flame retardants (OPFRs), parent PAHs, personal care products (PCPs), pesticides, pharmaceuticals, phenols, plasticizers, transformation products (TPs), and other industrial-use chemicals. Because it incorporates relevant variables into an ML-optimized ToxPi MCDM framework, the prioritization approach described here may be adopted in future SNTS assessments of environmental and biological media.
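The weighting and scoring step described above can be illustrated with a minimal sketch. It assumes that each variable is min-max scaled to [0, 1] across chemicals, that higher scaled values always indicate higher priority, and that the variable importance scores are simply normalized to sum to 1 before a weighted sum is taken. These are illustrative assumptions rather than the study's actual procedure, and the function names, chemical names, and numerical values below are hypothetical placeholders, not results from the study. Only the 0.55 cut-off is taken from the abstract.

```python
from typing import Dict, List


def min_max_scale(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_weights(importances: Dict[str, float]) -> Dict[str, float]:
    """Turn raw variable importance scores into weights that sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importance scores must sum to a positive value")
    return {name: value / total for name, value in importances.items()}


def weighted_scores(
    data: Dict[str, Dict[str, float]],
    importances: Dict[str, float],
) -> Dict[str, float]:
    """Weighted sum of min-max scaled variables for each chemical.

    Assumption: every variable is oriented so that a larger value means
    a higher priority. Real variables may need to be inverted first.
    """
    weights = normalize_weights(importances)
    chemicals = list(data)
    scores = {name: 0.0 for name in chemicals}
    for variable, weight in weights.items():
        scaled = min_max_scale([data[name][variable] for name in chemicals])
        for name, value in zip(chemicals, scaled):
            scores[name] += weight * value
    return scores


def high_priority(scores: Dict[str, float], threshold: float = 0.55) -> List[str]:
    """Chemicals at or above the threshold, highest score first."""
    selected = [name for name, score in scores.items() if score >= threshold]
    return sorted(selected, key=lambda name: scores[name], reverse=True)


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (not data from the study).
    example_data = {
        "chemical_a": {"detection_frequency": 0.9, "bioactivity_ratio": 4.0},
        "chemical_b": {"detection_frequency": 0.4, "bioactivity_ratio": 1.5},
        "chemical_c": {"detection_frequency": 0.1, "bioactivity_ratio": 0.2},
    }
    example_importances = {"detection_frequency": 0.6, "bioactivity_ratio": 0.4}
    example_scores = weighted_scores(example_data, example_importances)
    print(example_scores)
    print(high_priority(example_scores))
```

In practice, the importance scores would come from a fitted random forest model rather than being entered by hand, and the remaining variables (e.g., log Kow, water solubility, biodegradation half-life, and Koc) would be added as further entries for each chemical.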