Abstract
Early identification of the contaminant source is crucial for minimizing the damage from contaminant accidents in rivers. In this study, a framework combining Machine Learning (ML) and the Transient Storage zone Model (TSM) was developed to predict the spill location and mass of a contaminant source. The TSM was employed to simulate non-Fickian Breakthrough Curves (BTCs), which contain relevant information about the contaminant source. The ML models then used BTC features, characterized by 21 variables, to predict the spill location and mass. The proposed framework was applied to Gam Creek, South Korea, where two tracer tests were conducted. Six ML methods were applied to predict the spill location and mass, and the most relevant BTC features were selected by Recursive Feature Elimination with Cross-Validation (RFECV). Application to the field data showed that the ensemble decision tree models, Random Forest (RF) and XGBoost (XGB), were the most efficient and feasible in predicting the contaminant source.
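As a rough illustration of what BTC features can look like, the sketch below computes a few generic descriptors of a sampled curve: peak concentration, time to peak, area under the curve, and mean arrival time from the first temporal moment. These descriptors are assumptions chosen for illustration; the 21 variables used in the study are not listed here, and the function name `btc_features` and the sample values are hypothetical.

```python
from typing import Dict, Sequence


def btc_features(times: Sequence[float], conc: Sequence[float]) -> Dict[str, float]:
    """Compute a few illustrative descriptors of a breakthrough curve."""
    if len(times) != len(conc) or len(times) < 2:
        raise ValueError("times and conc must have equal length >= 2")

    # Trapezoidal integration of C(t) and t*C(t) over the sampled record.
    m0 = 0.0  # zeroth temporal moment (area under the curve)
    m1 = 0.0  # first temporal moment
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        m0 += 0.5 * (conc[i] + conc[i - 1]) * dt
        m1 += 0.5 * (times[i] * conc[i] + times[i - 1] * conc[i - 1]) * dt

    # Index of the first sample with the maximum concentration.
    peak_idx = max(range(len(conc)), key=lambda i: conc[i])

    return {
        "peak_conc": conc[peak_idx],
        "time_to_peak": times[peak_idx],
        "area": m0,
        "mean_arrival_time": m1 / m0 if m0 > 0 else float("nan"),
    }


# Hypothetical BTC: time in minutes, concentration in ppb.
t = [0.0, 10.0, 20.0, 30.0, 40.0]
c = [0.0, 4.0, 2.0, 1.0, 0.0]
features = btc_features(t, c)
```

For this made-up curve the mean arrival time falls later than the time to peak, which is the kind of tailing that a feature vector would need to capture for a skewed, non-Fickian BTC.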
Highlights
When accidental spills of contaminant occur in natural rivers, a rapid response is necessary to minimize the damage to both aquatic life and humans who depend on the river as a water resource
We focused on the optimal Breakthrough Curve (BTC) features and Machine Learning (ML) models for predicting the spill location and spill mass
The RMSE is the square root of the Mean Squared Error (MSE) and has the same units as the target variable
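A minimal sketch of the error metric, assuming the standard definition of RMSE as the square root of the mean of the squared errors. The function names and the sample spill-mass values are hypothetical and are not taken from the study; MAE is included only for comparison.

```python
import math
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Squared Error: average of squared residuals (units squared)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root Mean Squared Error: sqrt(MSE), in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Error: average of absolute residuals."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed and predicted spill masses (kg).
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 6.0]

mse_value = mse(observed, predicted)    # residuals 0, 0, -3
rmse_value = rmse(observed, predicted)
mae_value = mae(observed, predicted)
```

Because the residuals are squared before averaging, a single large miss raises the RMSE more than it raises the MAE, while both remain in the units of the target variable.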
Summary
When accidental spills of contaminant occur in natural rivers, a rapid response is necessary to minimize the damage to both aquatic life and humans who depend on the river as a water resource. Zhang and Xin [17] used the basic Genetic Algorithm (GA) to identify the spill location and spill mass of contaminant sources in a small straight river. These optimization approaches are limited by high uncertainties in their deterministic processes and in the data used in the optimization [18]. They evaluated the proposed method regarding noise, and validated the model with data from a real dye tracer test performed in a natural river, which is a significant step in testing field applicability. The diffusion process involves many problems of spatial and temporal scale. For this reason, data-driven approaches that use contaminant spill scenarios to identify the location of the contaminant source were recently presented. The proposed models were applied to field tracer data obtained in the river in order to ascertain their field applicability.