Abstract

To minimize the damage from contaminant accidents in rivers, early identification of the contaminant source is crucial. In this study, a framework combining Machine Learning (ML) and the Transient Storage zone Model (TSM) was developed to predict the spill location and mass of a contaminant source. The TSM was employed to simulate non-Fickian Breakthrough Curves (BTCs), which contain relevant information about the contaminant source. The ML models then used the BTC features, characterized by 21 variables, to predict the spill location and mass. The proposed framework was applied to Gam Creek, South Korea, where two tracer tests were conducted. Six ML methods were applied to predict the spill location and mass, and the most relevant BTC features were selected by Recursive Feature Elimination with Cross-Validation (RFECV). Applications to field data showed that the ensemble decision tree models, Random Forest (RF) and XGBoost (XGB), were the most efficient and feasible for predicting the contaminant source.
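As a rough illustration of what "BTC features" could look like, the sketch below summarizes a breakthrough curve with a handful of quantities. The 21 variables used in the study are not listed in this abstract, so the features chosen here (peak concentration, time to peak, and temporal moments), the function name `btc_features`, and the synthetic curve are all assumptions made for illustration, not the authors' feature set.

```python
def btc_features(times, conc):
    """Summarize a breakthrough curve (BTC) with a few illustrative features.

    times: increasing sample times; conc: concentrations at those times.
    The feature set is a hypothetical example, not the study's 21 variables.
    """
    if len(times) != len(conc) or len(times) < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")

    def trapz(y):
        # Trapezoidal integration over the (possibly uneven) time grid.
        return sum(0.5 * (y[i] + y[i + 1]) * (times[i + 1] - times[i])
                   for i in range(len(times) - 1))

    m0 = trapz(conc)  # zeroth temporal moment: area under the curve
    if m0 <= 0:
        raise ValueError("BTC has no positive area")

    # Normalized first moment and central second/third moments in time.
    mean_t = trapz([c * t for c, t in zip(conc, times)]) / m0
    var_t = trapz([c * (t - mean_t) ** 2 for c, t in zip(conc, times)]) / m0
    third = trapz([c * (t - mean_t) ** 3 for c, t in zip(conc, times)]) / m0
    skew_t = third / var_t ** 1.5 if var_t > 0 else 0.0

    i_peak = max(range(len(conc)), key=conc.__getitem__)
    return {
        "peak_conc": conc[i_peak],
        "time_to_peak": times[i_peak],
        "area": m0,
        "mean_arrival_time": mean_t,
        "variance": var_t,
        "skewness": skew_t,
    }


# Synthetic, made-up BTC with a long falling limb (not field data).
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
conc = [0.0, 0.5, 2.0, 1.5, 1.0, 0.6, 0.3, 0.1, 0.0]
features = btc_features(times, conc)
```

For a tailed curve like this one, the mean arrival time falls after the peak and the skewness is positive, which is the kind of signature a non-Fickian BTC would be expected to show. A feature table built this way from many simulated spill scenarios could then be passed to a regressor and a feature-selection routine.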

Highlights

  • When accidental spills of contaminant occur in natural rivers, a rapid response is necessary to minimize the damage to both aquatic life and humans who depend on the river as a water resource

  • We focused on the optimal Breakthrough Curve (BTC) features and Machine Learning (ML) models to predict the spill location and spill mass

  • The RMSE is the square root of the Mean Squared Error (MSE), so it has the same units as the target variable
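A minimal sketch of the error metrics, using their standard definitions: the Root Mean Squared Error (RMSE) is the square root of the Mean Squared Error (MSE), while the Mean Absolute Error (MAE) averages the absolute residuals. The example values are invented spill locations in kilometres, chosen only to show how a single large miss affects each metric.

```python
import math


def mse(y_true, y_pred):
    # Mean of squared residuals; units are the target units squared.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    # Square root of the MSE; back in the units of the target variable.
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    # Mean of absolute residuals; also in the units of the target variable.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical spill locations (km): three exact predictions, one 4 km miss.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 8.0]

rmse_value = rmse(y_true, y_pred)  # sqrt(16 / 4) = 2.0
mae_value = mae(y_true, y_pred)    # 4 / 4 = 1.0
```

Both metrics are in kilometres here, but the RMSE weights the single large error more heavily than the MAE does.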

Summary

Introduction

When accidental spills of contaminant occur in natural rivers, a rapid response is necessary to minimize the damage to both aquatic life and humans who depend on the river as a water resource. Zhang and Xin [17] used the basic Genetic Algorithm (GA) to identify the spill location and spill mass of contaminant sources in a small straight river. These optimization approaches are limited by high uncertainties in their deterministic processes and in the data used in the optimization [18]. They evaluated the proposed method regarding noise, and validated the model with data from a real dye tracer test performed in a natural river, which is a significant step in testing field applicability. Diffusion processes involve many problems of spatial and temporal scale. For this reason, data-driven approaches that use contaminant spill scenarios to identify the location of the contaminant source have recently been presented. The proposed models were applied to field tracer data obtained in the river in order to ascertain their field applicability.

Methodology
CAS Simulation
DT-Based
SVM and Ridge Regression
Feature Importance and Feature Selection
Modeling Performance Criteria
Study Site and Field Tracer Test
June 2020
Chemical Accident Scenarios in Gam Creek
Model Development
BTC Feature Importance for Inverse Tracking the Contaminant Source
Method
Field Application of ITM
Field Test of Spill Location Predictors
Field of Spill
Field Test of Spill Mass Predictors
Conclusions