Abstract

Streamflow forecasting is essential for effective water resource planning and early warning systems. Streamflow and related parameters are often characterized by uncertainty and complex behavior. Recent studies have turned to machine learning (ML) to predict streamflow. However, many of these methods overlook the interpretability and causality of their predictions, which undermines end-users' confidence in the reliability of ML. In addition, non-gauged basins have been receiving more attention because of the inherent risks involved in predicting their streamflow. This study aims to overcome these limitations by using ML to model streamflow in a non-gauged basin from anthropogenic, static physiographic, and dynamic climate variables, while providing interpretability through Shapley Additive Explanations (SHAP). Four ML algorithms were employed to forecast streamflow: Histogram Gradient Boosting (HGB), Extreme Gradient Boosting (XGB), a Deep Neural Network (DNN), and a Convolutional Neural Network (CNN). XGB outperformed the other models, with correlation coefficients (R) of 0.91 for training and 0.884 for testing, and mean absolute errors (MAE) of 0.02 for training and 0.023 for testing. Significantly, SHAP provided insight into the inner workings of the XGB predictions, revealing feature importance, interactions among features, and feature dependencies. This explainability method (SHAP) is an invaluable addition to ML-based streamflow prediction and early warning systems, offering human-comprehensible interpretations. The findings of this study are especially imperative for managing flood risk in urban areas.
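To make the XGB-plus-SHAP workflow described above concrete, the following is a minimal sketch (not the authors' code) of fitting an XGBoost regressor and extracting per-feature Shapley values with a tree explainer. The feature names (precip, temp, imperviousness) and the synthetic data are illustrative assumptions standing in for the study's climate, physiographic, and anthropogenic predictors.

```python
import numpy as np
import xgboost as xgb
import shap

rng = np.random.default_rng(0)
n = 500

# Hypothetical predictors standing in for the study's dynamic climate,
# static physiographic, and anthropogenic variables (names are assumptions).
feature_names = ["precip", "temp", "imperviousness"]
X = np.column_stack([
    rng.gamma(2.0, 2.0, n),     # precipitation-like variable
    rng.normal(15.0, 5.0, n),   # temperature-like variable
    rng.uniform(0.0, 1.0, n),   # urbanization-like variable
])
# Synthetic streamflow target with noise (for demonstration only)
y = 0.6 * X[:, 0] - 0.1 * X[:, 1] + 2.0 * X[:, 2] + rng.normal(0, 0.5, n)

model = xgb.XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.05)
model.fit(X, y)

# TreeExplainer computes exact Shapley values for tree ensembles,
# giving one additive attribution per feature per prediction.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global feature importance: mean absolute SHAP value per feature
importance = np.abs(shap_values).mean(axis=0)
for name, imp in sorted(zip(feature_names, importance), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

The same shap_values array also drives the dependence and interaction plots mentioned in the abstract (e.g., shap.dependence_plot), which show how a prediction's attribution varies with a feature's value.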
