A modular BiLSTM model architecture for multi-stream water quality data analytics: preserving domain-specific patterns in ammonium prediction
Accurate prediction of water quality parameters in a water treatment system remains challenging due to the intertwined physical, chemical, and microbiological nature of diverse data streams. We propose a modular bidirectional LSTM architecture that integrates data from water quality sensors, microbial indicators, and weather measurements to handle this prediction complexity while preserving domain-specific temporal patterns. Evaluated on a specific filtration media in such a water treatment system in Martin County, Florida, our approach predicts ammonium (NH₄⁺) in the effluent more accurately than traditional methods, with R² = 0.9912 and RMSE = 0.0058. This represents an 85% reduction in mean absolute error (from 0.0281 to 0.0041) compared to a standard bidirectional LSTM with no modularized input streams. The architecture’s effectiveness stems from its ability to maintain distinct temporal patterns while enabling integrated prediction, and it remains strong during periods of parameter volatility in the transition from dry to wet season. Results suggest that modular processing of heterogeneous environmental data streams provides a robust foundation for water quality management.
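The reported reduction in mean absolute error can be checked directly from the two MAE values quoted in the abstract. The following is a minimal sketch of that arithmetic only; the function name `relative_reduction` is an illustrative assumption and is not taken from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of an error metric relative to a baseline value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# MAE values quoted in the abstract: standard BiLSTM vs. modular BiLSTM.
mae_standard = 0.0281
mae_modular = 0.0041

reduction_pct = 100 * relative_reduction(mae_standard, mae_modular)
print(f"MAE reduction: {reduction_pct:.1f}%")  # prints "MAE reduction: 85.4%"
```

The result, about 85.4%, is consistent with the rounded 85% stated in the abstract.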
- Research Article
18
- 10.1016/j.aej.2024.04.030
- Apr 25, 2024
- Alexandria Engineering Journal
Water quality assessment using Bi-LSTM and computational fluid dynamics (CFD) techniques
- Research Article
6
- 10.1080/23249676.2021.1978881
- Sep 28, 2021
- Journal of Applied Water Engineering and Research
Traditional analysis of water distribution networks (WDNs) is time-consuming and less effective at predicting problems related to water supply systems, such as water quality, coagulant dose, and residual chlorine, in developing countries. In the present paper, neural network models for water quality, coagulant dose, and residual chlorine were implemented. The Cascade Feed Forward Neural Network (CFFNN) and the Feedforward Neural Network (FFNN) performed excellently in predicting water quality parameters and residual chlorine, respectively, during the training and testing periods. The CFFNN water quality model (27-30-27) with R = 0.989 produced an excellent prediction of outlet water quality parameters. In coagulant dose modelling, CFFNN (2-40-1) yielded a good prediction with R = 0.947 for a broad range of turbidities compared to other models. Similarly, in residual chlorine modelling, FFNN (2-25-1) delivered the best prediction with R = 0.988 compared to other models.
- Research Article
46
- 10.1016/j.asej.2023.102510
- Oct 10, 2023
- Ain Shams Engineering Journal
Water pollution threatens human health, agriculture, and ecosystems. Accurate prediction of water quality parameters is crucial for effective protection. We propose a novel hybrid deep learning model that enhances the efficiency of Support Vector Machines (SVMs) in predicting Electrical Conductivity (EC) and Total Dissolved Solids (TDS). Our model combines Bidirectional Long Short-Term Memory (BILSTM) and SVMs to extract essential features and predict output variables. We evaluated the models using input parameters (pH, Ca++, Mg++, Na+, K+, HCO3, SO4, and Cl) for one-, two-, and three-day predictions. Employing the Ali Baba and Forty Thieves (AFT) optimization algorithm, we identified optimal input combinations. The BILSTM-SVM model accurately estimated TDS values, with a MAPE of 2%, outperforming other models. Similarly, it successfully predicted EC values, with an R² value of 0.94. Our proposed model processes complex relationships and captures crucial features from the data, contributing to improved water quality prediction.
- Research Article
22
- 10.2166/ws.2011.029
- Apr 1, 2011
- Water Supply
Near real-time continuous monitoring systems have been proposed as a promising approach for helping drinking water utilities detect and respond efficiently to threats to water distribution systems. Water quality sensors are aimed at revealing contamination intrusions, while hydraulic pressure and flow sensors are utilized for estimating the hydraulic system state. To date, optimization models for placing sensors in water distribution systems have targeted water quality and hydraulic sensor network goals separately. Deploying two independent sensor networks within one distribution system is expensive to install and maintain. It might thus be beneficial to consider mutual sensor locations having dual hydraulic and water quality monitoring capabilities (i.e. sensor nodes which collect both hydraulic and water quality data at the same locations). In this study, a multi-objective sensor network placement model for conjunctive monitoring of hydraulic and water quality data is developed and demonstrated using the multi-objective non-dominated sorting genetic algorithm (NSGA-II) methodology. Two water distribution systems of increasing complexity are explored, showing tradeoffs between hydraulic and water quality sensor location objectives. The proposed method provides a new tool for sensor placements.
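NSGA-II ranks candidate solutions by Pareto dominance, which is what produces the tradeoff curve between the two sensor objectives. The sketch below illustrates only that one ingredient (extracting the non-dominated set); it is not the full algorithm and not the authors' implementation. Both objectives are assumed to be minimized, and the candidate scores are made-up illustrative numbers.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical sensor layouts scored as
# (hydraulic objective, water quality objective), both to be minimized.
candidates = [(0.2, 0.9), (0.4, 0.5), (0.5, 0.6), (0.8, 0.1), (0.9, 0.9)]
front = pareto_front(candidates)
print(front)  # prints [(0.2, 0.9), (0.4, 0.5), (0.8, 0.1)]
```

The three surviving layouts each trade one objective against the other; the two removed layouts are worse on both counts than some alternative.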
- Conference Article
7
- 10.1061/41114(371)323
- May 14, 2010
Land use and cover (LULC) play crucial roles in driving water quantity and quality processes in watersheds. Changes in LULC often have a direct effect on the water quality of downstream waters. Therefore, developing relationships between LULC and water quality parameters is essential for the evaluation of surface water resources should the LULC change. In this paper we present a methodology based on Artificial Neural Networks (ANN) to predict water quality parameters in ungauged basins: chlorine (Cl), sulfate (SO4), sodium (Na), potassium (K), and dissolved organic carbon (DOC). The model relies on LULC percentages, temperature, and flow discharge as inputs. The approach is tested on 18 watersheds in west Georgia varying in size from 296 to 2659 ha. The total number of data points for each parameter is 801, with 15 to 54 from each of the 18 watersheds. Out of the 18 watersheds, 12 were selected for training, 3 for validation, and 3 for testing the ANN model. Each set of validation and testing data consists of 1 forested, 1 pastoral, and 1 urban watershed, while the training data consist of 7 forested, 3 pastoral, and 2 urban watersheds. The model performance was measured with the coefficient of determination (R²), the Nash-Sutcliffe efficiency coefficient (E), and the bias ratio (RB). The model developed using the training data set successfully predicted the water quality parameters in the independent testing watersheds. The coefficient of determination (R²) in the test watersheds ranged from 0.64 to 0.99, while E ranged from 0.54 to 0.98. Results from this study indicate that if water quality and LULC data are available from multiple watersheds in an area with relatively similar physiographic properties, then one can successfully predict the impact of LULC changes on water quality in any watershed within the same area.
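Two of the performance measures named above have standard definitions that are easy to state in code. The sketch below computes the Nash-Sutcliffe efficiency (E) and R², taking R² to be the squared Pearson correlation, which is an assumption because the abstract does not say which form was used. The bias ratio is omitted since its definition is not given, and the observed and simulated values are made-up numbers.

```python
from typing import Sequence


def nash_sutcliffe(obs: Sequence[float], sim: Sequence[float]) -> float:
    """E = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def r_squared(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Squared Pearson correlation between observed and simulated values."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov * cov / (var_obs * var_sim)


# Hypothetical observed and simulated concentrations.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]

e_value = nash_sutcliffe(observed, simulated)
r2_value = r_squared(observed, simulated)
print(f"E = {e_value:.3f}, R2 = {r2_value:.3f}")
```

E equals 1 for a perfect match and 0 when the model is no better than predicting the observed mean, so it penalizes bias that a correlation-based R² would not detect.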
- Research Article
107
- 10.1016/j.envres.2023.115617
- Mar 4, 2023
- Environmental Research
Modeling, challenges, and strategies for understanding impacts of climate extremes (droughts and floods) on water quality in Asia: A review
- Conference Article
2
- 10.1061/9780784413548.057
- May 29, 2014
- World Environmental and Water Resources Congress 2014
Several sophisticated methods have been developed for water quality (WQ) sensor placement in water distribution system analysis, but most of them are geared toward mitigating water security concerns, including but not limited to contaminant detection, chemical intrusion, or terrorist attacks. Less attention has been paid to placing WQ sensors or loggers for water quality monitoring or field data collection in support of WQ model calibration. Sensor locations are conventionally determined in an ad hoc manner, based on geographic coverage, pipe diameter, pipe material, distance to the source, and accessibility. This paper presents a new methodology for helping engineers identify near-optimal locations of WQ sensors for WQ model calibration. The approach maximizes the sensor network efficiency, that is, the coverage of the pipes affected by adjustments to the wall reaction coefficients, which are the primary model parameters for WQ model calibration. This new method allows good and sensible data to be collected for calibrating a WQ model.
- Conference Article
1
- 10.36334/modsim.2013.l9.hossain
- Dec 1, 2013
Water quality modelling is the primary tool used for catchment and stream water quality investigations. The general architecture of a typical water quality model is the integration of the pollutant processes with the hydrologic and hydraulic approaches. However, due to the lack of specific local information and poor understanding of the limitations of various estimation techniques and underlying physical parameters, modelling approaches are often prone to gross errors. Most of the available water quality models are too simple and/or stochastic in nature. Many of those models perform water quality estimations in isolation, i.e. separate water quality models for catchment and stream analyses. Isolated models may lead to inconsistencies and biased results in the prediction of water quality parameters. On the other hand, there are some integrated water quality models, which are very complex, requiring large amounts of physical and chemical data as well as the determination of many model parameters. This paper presents the development of a simple, integrated, and deterministic catchment-stream water quality model that can continuously simulate different water quality parameters. The integrated model comprises two individual models: the catchment water quality model and the stream water quality model. The catchment water quality model consists of two sub-models: a rainfall-runoff model and a pollutant processes model. The rainfall-runoff model was developed by considering the time-area method of runoff routing. The model estimates the amount of surface runoff generated from a specified catchment for which rainfall data is provided. Water quality parameters were incorporated with the developed rainfall-runoff model, which represents the catchment water quality model.
This model estimates the amount of pollutant accumulated on catchment surfaces during the antecedent dry days, and its transportation with surface runoff into waterways and receiving water bodies throughout storm events. Similar to the catchment water quality model, the stream water quality model comprises two sub-models: the stream flow model and the stream pollutant processes model. The stream flow model was developed by considering the Muskingum-Cunge method of stream routing. The stream flow model estimates the rate of water flow into the downstream sections of a particular stream reach. The processes of the same water quality parameters as used in the catchment water quality model were incorporated with the stream flow model, which represents the stream water quality model. The final output of the stream water quality model is the concentration of transported pollutants in different downstream sections of a particular stream reach. Finally, the catchment water quality model and the stream water quality model were integrated for the continuous simulation of the previously mentioned water quality parameters. For calibration and validation of the model, different published data and reliable source data collected by the Gold Coast City Council (GCCC) were used. Calibration of the catchment water quality model and the stream water quality model was performed separately. The calibration results demonstrated the suitability of the developed model as a tool to help with water quality management issues. The major advantage of the developed model is the easy and continuous simulation of water quality parameters associated with surface runoff during any rainfall event. The preparation process of the input data for the model is simple.
The capability of the model to simulate surface runoff and pollutant loads from a wide range of rainfall intensities makes the integrated model useful in assessing the impact of stormwater pollution flowing into waterways and receiving water bodies and in designing effective stormwater treatment measures.
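The stream flow component described above is based on Muskingum-Cunge routing. As a rough illustration of the underlying recurrence, the sketch below implements classic Muskingum routing with a fixed storage constant `k` and weighting factor `x`. This is a simplification: the Muskingum-Cunge variant estimates these parameters from channel and flow properties, which is not shown here. All parameter values and the inflow hydrograph are hypothetical.

```python
from typing import List, Sequence, Tuple


def muskingum_coefficients(k: float, x: float, dt: float) -> Tuple[float, float, float]:
    """Routing coefficients (c0, c1, c2) for storage constant k, weight x, time step dt."""
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom
    c1 = (dt + 2.0 * k * x) / denom
    c2 = (2.0 * k * (1.0 - x) - dt) / denom
    return c0, c1, c2


def route(inflow: Sequence[float], k: float, x: float, dt: float) -> List[float]:
    """Route an inflow hydrograph through one reach: O2 = c0*I2 + c1*I1 + c2*O1."""
    c0, c1, c2 = muskingum_coefficients(k, x, dt)
    outflow = [inflow[0]]  # assumes the reach starts at steady state
    for i in range(1, len(inflow)):
        outflow.append(c0 * inflow[i] + c1 * inflow[i - 1] + c2 * outflow[-1])
    return outflow


# Hypothetical reach: k = 2 h, x = 0.2, dt = 1 h; inflow in m^3/s.
inflow = [10.0, 10.0, 30.0, 60.0, 40.0, 20.0, 10.0, 10.0]
outflow = route(inflow, k=2.0, x=0.2, dt=1.0)
print([round(q, 2) for q in outflow])
```

The three coefficients always sum to one, so a steady inflow passes through unchanged, while a flood wave leaves the reach with a lower, delayed peak.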
- Research Article
46
- 10.3390/w13131782
- Jun 28, 2021
- Water
Providing an accurate prediction of water quality parameters for improved water quality management is a topical issue in the aquaculture industry. Conventional prediction methods have shown challenges such as poor generalization, poor prediction accuracy, and high time complexity. To address these challenges, a novel hybrid prediction model combining ensemble empirical mode decomposition (EEMD) with a deep learning (DL) long short-term memory (LSTM) neural network is proposed in this paper. In this hybrid EEMD-DL-LSTM model, firstly, the integrity of the datasets is enhanced by applying moving average filtering and linear interpolation to pre-treat the water quality parameter datasets. Secondly, the measured sensor water quality parameter dataset is decomposed with the aid of the EEMD algorithm into separate intrinsic mode functions (IMFs) and a corresponding residual item. Thirdly, a multi-feature selection process is applied to select a group of IMFs that are strongly correlated with the measured water quality parameter datasets and integrate them as inputs to the DL-LSTM neural network. The presented model is built on water quality sensor data collected from an abalone farm in South Africa. The performance of the hybrid prediction model is validated by comparing the results against the real datasets. To measure the overall accuracy of the hybrid prediction model, different statistical indices are used, namely the Mean Absolute Error (MAE), Mean Square Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE).
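The four statistical indices listed above follow standard definitions. A minimal sketch of how they can be computed is given below; the function name `error_metrics` and the sample values are illustrative assumptions, not material from the paper.

```python
import math
from typing import Dict, Sequence


def error_metrics(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute MAE, MSE, RMSE, and MAPE (in percent) for paired series."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when any actual value is zero.
    mape = 100.0 / n * sum(abs(e / a) for e, a in zip(errors, actual))
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical measured vs. predicted values.
metrics = error_metrics([2.0, 4.0], [1.0, 5.0])
print(metrics)  # prints {'MAE': 1.0, 'MSE': 1.0, 'RMSE': 1.0, 'MAPE': 37.5}
```

MAE, MSE, and RMSE are expressed in the units of the measured parameter (squared for MSE), whereas MAPE is scale-free and therefore easier to compare across parameters with different ranges.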
- Research Article
1
- 10.36108/laujoces/2202.80.0101
- Mar 31, 2022
- LAUTECH Journal of Civil and Environmental Studies
Watershed delineation is a required step when conducting any spatially distributed hydrological modelling. The prediction of water quality parameters in a basin entails delineating the watershed into a number of sub-basins. Thus, this research evaluated the effects of watershed delineation on the prediction of water quality parameters in the Gaa Akanbi area of Ilorin, Kwara State, Nigeria. The objectives are to model and predict water quality parameters in the watershed, delineate the selected watershed into various numbers of sub-basins, and study the effects of watershed delineation on the prediction of water quality parameters of the basin. For this study, Geographical Information System (GIS) software, the physically based Soil and Water Assessment Tool (SWAT) watershed model, and other data processing software were used. Since the model is physically based, the surface properties (i.e. Digital Elevation Model (DEM), stream network, digital soil map, digital land use and land cover map, climatic and hydrological data) served as input to the model. The model was run at a daily time step for a period of 30 years (January 1991 to December 2020). The results showed that the watershed was successfully delineated into 3, 7, 11, 21, 32, and 53 sub-basins. It was also noted that the predicted values of water quality parameters (nitrate, organic phosphorus, and sediment concentration) are directly proportional to the number of sub-basins delineated in the watershed.
- Research Article
108
- 10.1016/j.proeng.2012.01.1162
- Jan 1, 2012
- Procedia Engineering
Prediction of water quality time series data based on least squares support vector machine
- Research Article
73
- 10.3390/w14182836
- Sep 12, 2022
- Water
Good water quality is important for normal production processes in industrial aquaculture. However, in situ or real-time monitoring is generally not available for many aquacultural systems due to relatively high monitoring costs. Therefore, it is necessary to predict water quality parameters in industrial aquaculture systems to obtain useful information for managing production activities. This study used back propagation neural network (BPNN), radial basis function neural network (RBFNN), support vector machine (SVM), and least squares support vector machine (LSSVM) to simulate and predict water quality parameters including dissolved oxygen (DO), pH, ammonium-nitrogen (NH3-N), nitrate nitrogen (NO3-N), and nitrite-nitrogen (NO2-N). Published data were used to compare the prediction accuracy of different methods. The correlation coefficients of BPNN, RBFNN, SVM, and LSSVM for predicting DO were 0.60, 0.99, 0.99, and 0.99, respectively. The correlation coefficients of BPNN, RBFNN, SVM, and LSSVM for predicting pH were 0.56, 0.84, 0.99, and 0.57, respectively. The correlation coefficients of BPNN, RBFNN, SVM, and LSSVM for predicting NH3-N were 0.28, 0.88, 0.99, and 0.25, respectively. The correlation coefficients of BPNN, RBFNN, SVM, and LSSVM for predicting NO3-N were 0.96, 0.87, 0.99, and 0.87, respectively. The correlation coefficients of BPNN, RBFNN, SVM, and LSSVM for predicting NO2-N were 0.87, 0.08, 0.99, and 0.75, respectively. SVM obtained the most accurate and stable prediction results, and SVM was used for predicting the water quality parameters of industrial aquaculture systems with groundwater as the source water. The results showed that the SVM achieved the best prediction effect, with an accuracy of 99% for both published data and measured data from a typical industrial aquaculture system. The SVM model is recommended for simulating and predicting the water quality in industrial aquaculture systems.
- Research Article
46
- 10.1007/s11269-016-1280-3
- Mar 7, 2016
- Water Resources Management
The quality of surface water is a serious factor affecting human health and ecological systems. Accurate prediction of water quality parameters plays an important role in the management of rivers. Thus, different methods such as support vector regression (SVR) have been employed to predict water quality parameters. This paper applies SVR to predict eight water quality parameters, namely sodium (Na⁺), potassium (K⁺), magnesium (Mg²⁺), sulfates (SO₄²⁻), chloride (Cl⁻), power of hydrogen (pH), electrical conductivity (EC), and total dissolved solids (TDS), at the Astane station on the Sefidrood River, Iran. To achieve an efficient SVR model, the SVR parameters should be selected carefully. Commonly, various techniques such as trial and error, grid search, and metaheuristic algorithms have been applied to estimate these parameters. This study presents a novel tool for the estimation of quality parameters by coupling SVR with the shuffled frog leaping algorithm (SFLA). The results of SFLA-SVR were compared with those of genetic programming (GP), a capable method in water quality prediction. Using SFLA-SVR, the average RMSE for training and testing of six combinations of data sets for all of the water quality parameters improved by 57.4% relative to GP. These results indicate that the proposed SFLA-SVR tool is more efficient and powerful than GP for determining water quality parameters.
- Research Article
6
- 10.3390/su152416816
- Dec 13, 2023
- Sustainability
Prediction of water quality parameters is a significant aspect of contemporary green development and ecological restoration. However, conventional water quality prediction models have limited accuracy and poor generalization capability. This study aims to develop a dependable prediction model for ammonia nitrogen concentration, one of the water quality parameters. Based on the long-term dependence of water quality parameters, the memory ability of the Long Short-Term Memory (LSTM) neural network was utilized to predict water quality parameters. To improve the accuracy of the LSTM prediction model, the ammonia nitrogen data were decomposed using Empirical Mode Decomposition (EMD), the parameters of the LSTM model were then optimized using the Improved Whale Optimization Algorithm (IWOA), and a combined prediction model based on EMD-IWOA-LSTM was proposed. The study outcomes demonstrate that EMD-IWOA-LSTM displays improved prediction accuracy, with reduced Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) in comparison to the LSTM and IWOA-LSTM approaches. These research findings better enable the monitoring and prediction of water quality parameters, offering a novel approach to preventing water pollution rather than merely treating it afterwards.
- Research Article
80
- 10.3390/w13091273
- Apr 30, 2021
- Water
The current global water environment has been seriously damaged. The prediction of water quality parameters can provide effective reference materials for future water conditions and water quality improvement. In order to further improve the accuracy of water quality prediction and the stability and generalization ability of the model, we propose a new comprehensive deep learning water quality prediction algorithm. Firstly, the water quality data are cleaned and pretreated by isolation forest, the Lagrange interpolation method, sliding window average, and principal component analysis (PCA). Then, one-dimensional residual convolutional neural networks (1-DRCNN) and bi-directional gated recurrent units (BiGRU) are used to extract the potential local features among water quality parameters and integrate information from before and after each point in the time series. Finally, a fully connected layer is used to obtain the final prediction results of total nitrogen (TN), total phosphorus (TP), and potassium permanganate index (COD-Mn). Our prediction experiment was carried out on actual water quality data from Daheiting Reservoir, Luanxian Bridge, and Jianggezhuang, the three control sections of the Luan River in Tangshan City, Hebei Province, from 5 July 2018 to 26 March 2019. The minimum mean absolute percentage error (MAPE) of this method was 2.4866, and the coefficient of determination (R²) reached 0.9431. The experimental results showed that the model proposed in this paper has higher prediction accuracy and generalization than the existing LSTM, GRU, and BiGRU models.
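Of the pretreatment steps listed above, Lagrange interpolation is the one used to fill gaps in a series. The sketch below shows the textbook Lagrange polynomial evaluated at a missing time stamp from neighbouring readings. How many neighbours the authors used is not stated, so the choice of four points here is an assumption, and the readings are made-up values.

```python
from typing import Sequence


def lagrange_interpolate(xs: Sequence[float], ys: Sequence[float], x: float) -> float:
    """Evaluate at x the Lagrange polynomial that passes through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


# Hypothetical readings at hours 0, 1, 3, 4 with the hour-2 value missing.
hours = [0.0, 1.0, 3.0, 4.0]
readings = [7.1, 7.3, 7.2, 7.0]
filled = lagrange_interpolate(hours, readings, 2.0)
print(f"Estimated reading at hour 2: {filled:.3f}")
```

Using only a few nearby points keeps the polynomial degree low, which matters because high-degree Lagrange polynomials can oscillate strongly between sample points.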