How do geological map details influence the identification of geology-streamflow relationships in large-sample hydrology studies?
Abstract. Large-sample hydrology datasets have advanced hydrological research, yet the impact of landscape map details on identifying dominant streamflow generation processes remains underexplored. This study investigates the role of geology using maps of increasing detail – global, continental, and regional – each reclassified into four permeability classes. These geological attributes were used along with topography, soil, land use, and climate attributes to identify dominant controls on streamflow signatures across 4469 European catchments. To distinguish landscape influences from the otherwise dominant influence of climate, we conducted separate analyses on nested basins. Three scales were considered to assess scale-dependent patterns: large (63 nested basins), intermediate (the Moselle nested basin), and small (five nested catchments within the Moselle). The large-scale study used geology information from global and continental maps, while the others also incorporated regional maps. At the large scale, dominant controls varied widely between nested basins, but landscape generally outweighed climate, highlighting the value of our nested basin design. At this scale, continental and global geology maps produced different correlation patterns, with neither consistently superior. At the intermediate scale, increased geological detail led geology to shift from the least to the most correlated variable for certain streamflow signatures. The small-scale experiment reinforced these findings, as the regional map highlighted controls more consistent with process understanding. This study underscores the benefit of integrating detailed, region-specific geological data into large sample hydrology studies, and demonstrates the utility of a nested basins design. These findings have important implications for hydrological regionalization and streamflow prediction in ungauged basins.
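The abstract above ranks catchment attributes by how strongly they correlate with a streamflow signature. A minimal sketch of that kind of ranking follows. It assumes Spearman rank correlation (the abstract does not name the correlation measure), and the attribute names and values are made-up illustrative numbers, not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_controls(attributes, signature):
    """Sort attributes by absolute correlation with the signature."""
    scores = {name: spearman(vals, signature) for name, vals in attributes.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical values for four catchments (illustration only).
attributes = {
    "high_permeability_fraction": [0.1, 0.4, 0.6, 0.9],
    "aridity_index": [0.5, 0.3, 0.6, 0.4],
}
baseflow_index = [0.2, 0.5, 0.55, 0.8]
ranking = rank_controls(attributes, baseflow_index)
```

Repeating the ranking with geology attributes derived from a different map would show whether geology moves up or down relative to the other attributes.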
- Research Article
11
- 10.1080/15715124.2023.2245809
- Aug 17, 2023
- International Journal of River Basin Management
Streamflow prediction in ungauged basins, particularly within data-scarce areas, is a challenging and sensitive task. Traditionally, conceptual and physical models have been used for this task. Although many studies have applied machine learning models, in particular deep learning techniques, recent advances in machine learning make it imperative that the hydrologic community take further advantage of data-driven techniques to address the challenge of streamflow prediction in ungauged basins. The difficulty of incorporating expert physical/hydrological knowledge in the modelling process and the lack of sufficient explainability of machine learning models are perhaps among the obstacles to their wider use for streamflow prediction. This paper uses XGBoost for streamflow prediction in ungauged basins located within data-scarce regions, incorporating physical and hydrological knowledge in the modelling process through feature engineering. The explainability of the models is studied using SHAP. Accordingly, three XGBoost models are evaluated based on different levels of feature engineering, and a fourth model is evaluated by adding a physical constraint to the third model. The four models are applied to six target catchments located in four different countries/continents with diverse hydro-climatic conditions. The performance of the models is compared against previous studies, including the SPED framework, which is based on a conceptual hydrological model and available data/knowledge about the reference and target catchments. The second XGBoost model proves to be the most plausible model: it outperforms the previous studies or performs comparably in five of the target catchments (Nash-Sutcliffe Efficiency range of 0.61–0.81, where 1 indicates a perfect match between observations and predictions).
However, in North Fork Cache Creek in the United States where the target catchment is quite different from the reference catchment in terms of magnitude of low flows, this model fails to provide satisfactory streamflow predictions.
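Several abstracts on this page report the Nash-Sutcliffe Efficiency (NSE). As a reference, here is a minimal sketch of the standard definition, NSE = 1 − Σ(obs − sim)² / Σ(obs − mean(obs))². The function name and the sample values are illustrative assumptions, not taken from any of the listed papers.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Return 1 minus the ratio of residual variance to observed variance."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("observed and simulated must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_ss = sum((o - mean_obs) ** 2 for o in observed)
    if variance_ss == 0:
        raise ValueError("NSE is undefined when the observed series is constant")
    return 1.0 - residual_ss / variance_ss


# Illustrative flows: a perfect simulation, and one that always predicts the mean.
obs = [1.0, 2.0, 3.0, 4.0]
nse_perfect = nash_sutcliffe_efficiency(obs, obs)
nse_mean = nash_sutcliffe_efficiency(obs, [2.5, 2.5, 2.5, 2.5])
```

A value of 1 means a perfect match, 0 means the simulation is no better than predicting the observed mean, and negative values mean it is worse than the mean.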
- Research Article
1
- 10.1016/j.heliyon.2025.e42512
- Feb 1, 2025
- Heliyon
Data-driven model as a post-process for daily streamflow prediction in ungauged basins.
- Research Article
10
- 10.1029/2022wr031929
- Jul 1, 2023
- Water Resources Research
Calibration of precipitation‐streamflow models to streamflow signatures is a promising approach for streamflow prediction in ungauged basins (PUB). The estimation of parameter and prediction uncertainty in this case is not trivial because: (a) calibration takes place in the signature domain, while predictions are required in the time domain, and (b) streamflow signatures are estimated (e.g., from donor catchments) rather than “observed” (computed from observed streamflow in the target catchment), and therefore particularly uncertain. This study investigates model calibration using estimated signatures in an Approximate Bayesian Computation framework. First, we construct a stochastic signature transfer model, based on seasonal flow duration curves. Second, we calibrate a precipitation‐streamflow model to the estimated signatures, accounting for their uncertainty. The proposed method is tested in six catchments of the Thur basin, Switzerland. Three data availability scenarios are considered: (a) concomitant scenario, where signatures are “observed,” (b) non‐concomitant scenario, where signatures are transferred from a different time period, and (c) regionalization scenario, where signatures are transferred from (neighboring) donor catchments. In this study, the switch from observed to regionalized signatures increases predictive streamflow uncertainty by 38% and worsens the (deterministic) fit to observations by 17% (in terms of Nash‐Sutcliffe efficiency). Despite this deterioration, posterior predictive uncertainty remains lower than prior predictive uncertainty generated using uniform priors over representative parameter ranges (“uncalibrated” model), which demonstrates the effectiveness of the proposed signature‐based calibration. More importantly, uncertainty is reliably estimated at the ungauged catchments, which represents a key advance in stochastic streamflow PUB.
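The idea of calibrating to an uncertain signature with Approximate Bayesian Computation can be illustrated with a toy rejection sampler. This is only a sketch under strong assumptions: the one-parameter linear reservoir, the peak-flow signature, the uniform prior, and the Gaussian signature uncertainty are all hypothetical stand-ins, not the model, signatures, or ABC algorithm used in the study.

```python
import random


def simulate_runoff(precip, k):
    """Toy linear reservoir: a fraction k of storage drains each time step."""
    storage, flow = 0.0, []
    for p in precip:
        storage += p
        q = k * storage
        storage -= q
        flow.append(q)
    return flow


def peak_flow(flow):
    """Hypothetical signature: the maximum simulated flow."""
    return max(flow)


def abc_rejection(precip, sig_mean, sig_sd, tol, n_draws, seed=0):
    """Keep prior draws of k whose simulated signature lies within tol of a
    signature value sampled from its (assumed Gaussian) uncertainty."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        k = rng.uniform(0.05, 0.95)              # draw from the uniform prior
        sig_target = rng.gauss(sig_mean, sig_sd)  # sample the uncertain signature
        sig_sim = peak_flow(simulate_runoff(precip, k))
        if abs(sig_sim - sig_target) < tol:
            accepted.append(k)
    return accepted


# Illustrative forcing; the estimated signature 6.6 corresponds roughly to k = 0.3.
PRECIP = [0.0, 10.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 20.0, 0.0, 0.0, 0.0]
posterior_k = abc_rejection(PRECIP, sig_mean=6.6, sig_sd=0.5, tol=0.5, n_draws=5000)
```

A larger `sig_sd` (a less certain regionalized signature) widens the accepted parameter set, which is how signature uncertainty propagates into predictive uncertainty in this sketch.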
- Research Article
11
- 10.1051/e3sconf/202016301001
- Jan 1, 2020
- E3S Web of Conferences
Streamflow prediction is a vital public service that helps to establish flash-flood early warning systems or assess the impact of projected climate change on water management. However, the availability of streamflow observations limits the use of state-of-the-art streamflow prediction techniques to basins where hydrometric gauging stations exist. Since most river basins in the world are ungauged, the development of specialized techniques for reliable streamflow prediction in ungauged basins (PUB) is of crucial importance. In recent years, the emerging field of deep learning has provided a myriad of new models that can breathe new life into stagnating PUB methods. In the presented study, we benchmark the streamflow prediction efficiency of Long Short-Term Memory (LSTM) networks against the standard technique of GR4J hydrological model parameter regionalization (HMREG) at 200 basins in Northwest Russia. Results show that the LSTM-based regional hydrological model significantly outperforms the HMREG scheme in terms of median Nash-Sutcliffe efficiency (NSE), which is 0.73 and 0.61 for LSTM and HMREG, respectively. Moreover, LSTM achieves a median NSE comparable to that of basin-scale calibration of GR4J (0.75). Therefore, this study underlines the high potential of deep learning for PUB by demonstrating new state-of-the-art performance in this field.
- Preprint Article
- 10.5194/egusphere-egu24-6432
- Nov 27, 2024
In addressing the challenge of streamflow prediction in ungauged basins, this study leveraged deep learning (DL) models, especially long short-term memory (LSTM) networks, to predict streamflow for pseudo ungauged basins in Japan. The motivation stems from the recognized limitations of traditional hydrological models in transferring their performance beyond the calibrated basins. Recent research suggests that DL models, especially those trained on multiple catchments, demonstrate improved predictive capabilities utilizing the concept of streamflow regionalization. However, the majority of these studies were confined to geographic regions within the United States. For this study, a total number of 211 catchments were delineated and investigated, distributed across all four primary islands of Japan (Kyushu - 32, Shikoku - 13, Honshu - 127, and Hokkaido - 39) encompassing a comprehensive sample of hydrological systems within the region. The catchments were obtained corresponding to the streamflow observation points and their combined area represented more than 43% of Japan's total land area, after accounting for overlaps. After cleaning and refining the streamflow dataset, the remaining catchments (198) were divided into training (~70%), validation (~20%), and holdout test (~10%) sets. A combination of dynamic (time-varying) and static (constant) variables were obtained on a daily basis corresponding to the daily streamflow data and provided to the models as input features. However, the final model accorded higher significance to dynamic features in comparison to the static ones. Although the models were trained on daily time steps, the results were aggregated to monthly timescale. The main evaluation metrics included the Nash-Sutcliffe Efficiency (NSE) and Pearson's correlation coefficient (r).
The final model achieved a median NSE of 0.96, 0.83, and 0.78, and a median correlation of 0.98, 0.92, and 0.91 corresponding to the training, validation, and test catchments, respectively. For the validation catchments, 90% exhibited NSE values greater than 0.50, and 97% demonstrated a correlation surpassing 0.70. Correspondingly, these proportions were observed at 77% and 91% for the test catchments. The results presented in this study demonstrate the feasibility and efficacy of developing a data-driven model for streamflow prediction in ungauged basins utilizing streamflow regionalization. The final model exhibits commendable performance, as evidenced by high NSE and correlation coefficients across the majority of the catchments. Importantly, the model's ability to generalize to unseen data is highlighted by its remarkable performance on the holdout test set, with only a few instances of lower NSE values (< 0.50) and correlation coefficients (< 0.70).
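For pseudo-ungauged evaluation, the split has to be made by catchment rather than by time step, so that test basins contribute no data at all to training. A minimal sketch of such a split follows. The catchment identifiers, the random seed, and the rounding rule are assumptions for illustration; the abstract gives only the approximate 70/20/10 proportions.

```python
import random


def split_catchments(catchment_ids, train_frac=0.7, val_frac=0.2, seed=42):
    """Shuffle catchment IDs and split them into train, validation, and test
    groups, so every time step of a catchment falls in exactly one group."""
    ids = list(catchment_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(train_frac * len(ids))
    n_val = int(val_frac * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


# Hypothetical identifiers for the 198 catchments retained after cleaning.
all_ids = [f"basin_{i:03d}" for i in range(198)]
train_ids, val_ids, test_ids = split_catchments(all_ids)
```

With 198 catchments and truncating rounding, this yields 138, 39, and 21 catchments, which matches the stated proportions only approximately.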
- Research Article
16
- 10.2166/nh.2016.094
- Oct 6, 2016
- Hydrology Research
There are different views on the selection of hydrological model structural complexity for streamflow prediction in ungauged basins. Some studies suggest that complex models are better than simple models due to the former's prediction capability; whereas some studies favor parsimonious model structures to overcome a risk of over-parameterization. The Xinanjiang (XAJ) model, the most widely used hydrological model in China, has two different versions, as follows: (1) the simple version with seven parameters (XAJ7) and (2) the complex version with 14 parameters (XAJ14). In this study, the two versions of the XAJ model were comprehensively evaluated for streamflow prediction in ungauged basins based on their efficiency, parameter identifiability, and independence. The results showed that the complex XAJ14 model was more flexible than the simple XAJ7 in calibration mode; while the two models have similar performance in validation and regionalization modes. Lack of parameter identifiability and the presence of parameter interdependence most likely explain why the complex XAJ14 cannot consistently outperform the XAJ7 in different modes. Therefore, the simple XAJ7 is a better choice than XAJ14 for streamflow prediction in ungauged basins.
- Research Article
15
- 10.2166/nh.2015.155
- Dec 29, 2015
- Hydrology Research
Streamflow information is of great significance for flood control, water resources utilization and management, ecological services, etc. Continuous streamflow prediction in ungauged basins remains a challenge, mainly due to data paucity and environmental changes. This study focuses on the modification of a nonlinear hydrological system approach known as the time variant gain model and the development of a regressive method based on the modified approach. This method directly correlates rainfall to runoff through physically based mathematical transformations without requiring additional information of evaporation or soil moisture. Also, it contains parsimonious parameters that can be derived from watershed properties. Both characteristics make this method suitable for practical uses in ungauged basins. The Huai River Basin of China was selected as the study area to test the regressive method. The results show that the proposed methodology provides an effective way to predict streamflow of ungauged basins with reasonable accuracy by incorporating regional watershed information (soil, land use, topography, etc.). This study provides a useful predictive tool for future water resources utilization and management for data-sparse areas or watersheds with environmental changes.
- Research Article
6
- 10.1016/j.jhydrol.2024.131357
- May 19, 2024
- Journal of Hydrology
Streamflow prediction in ungauged basins: How dissimilar are drainage basins?
- Research Article
33
- 10.1080/02626667.2015.1117088
- Jul 6, 2016
- Hydrological Sciences Journal
This paper assesses the possibility of using multi-model averaging techniques for continuous streamflow prediction in ungauged basins. Three hydrological models were calibrated on the Nash-Sutcliffe Efficiency metric and were used as members of four multi-model averaging schemes. Model weights were estimated through optimization on the donor catchments. The averaging methods were tested on 267 catchments in the province of Québec, Canada, in a leave-one-out cross-validation approach. It was found that the best hydrological model was practically always better than the others used individually or in a multi-model framework, thus no averaging scheme performed statistically better than the best single member. It was also found that the robustness and adaptability of the models were highly influential on the models’ performance in cross-verification. The results show that multi-model averaging techniques are not necessarily suited for regionalization applications, and that models selected in such studies must be chosen carefully to be as robust as possible on the study site. Editor M.C. Acreman; Associate editor S. Grimaldi
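The leave-one-out cross-validation mentioned above treats each gauged catchment in turn as if it were ungauged. The sketch below shows the loop structure with a deliberately simple transfer rule: copy the parameter of the nearest donor in attribute space. That rule, the attribute tuples, and the parameter values are hypothetical; the study's own models, weights, and donor selection are not described in the abstract.

```python
def euclidean(a, b):
    """Euclidean distance between two attribute tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def leave_one_out_donor_transfer(catchments):
    """For each catchment, hide it, pick the most similar remaining catchment
    as donor, and return (donor_id, transferred_parameter)."""
    results = {}
    for target_id, target in catchments.items():
        donors = {cid: c for cid, c in catchments.items() if cid != target_id}
        nearest = min(
            donors, key=lambda cid: euclidean(donors[cid]["attrs"], target["attrs"])
        )
        results[target_id] = (nearest, donors[nearest]["param"])
    return results


# Made-up attributes and calibrated parameters for three catchments.
catchments = {
    "A": {"attrs": (0.0, 0.0), "param": 1.2},
    "B": {"attrs": (1.0, 0.0), "param": 1.5},
    "C": {"attrs": (10.0, 0.0), "param": 3.0},
}
transfers = leave_one_out_donor_transfer(catchments)
```

In a full evaluation, the transferred parameter would drive a simulation at the hidden catchment, and the result would be scored against its withheld observations.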
- Research Article
23
- 10.1016/j.ecoleng.2022.106699
- Jun 11, 2022
- Ecological Engineering
Utilization of the Long Short-Term Memory network for predicting streamflow in ungauged basins in Korea
- Research Article
29
- 10.1029/2022wr034352
- Jul 1, 2023
- Water Resources Research
Streamflow prediction in ungauged basins (PUB) is challenging, and Long Short‐Term Memory (LSTM) is widely used for such predictions, owing to its excellent migration performance. Traditional LSTM forced by meteorological data and catchment attribute data barely highlights the optimum data integration strategy for LSTM and its migration from data‐rich basins to ungauged ones. In this study, we experimented with 1,897 global catchments and found that LSTM‐corrected Global Hydrological Models (GHMs) outperformed uncorrected GHMs, improving the median Nash‐Sutcliffe efficiency (NSE) from 0.03 to 0.66. Notably, there was a large gap between traditional LSTM modeling in ungauged basins and autoregressive modeling in data‐rich basins, and GHM‐forced LSTM was an effective way to close this gap in ungauged basins. The spatial heterogeneity of the performance of GHM‐forced LSTM was mainly influenced by three metrics (dryness, the leaf area index and latitude), which described the hydrological similarity among catchments. Weaker hydrological similarity among continental catchments results in larger variability in GHM‐forced LSTM, with the best performance in Siberia (NSE, 0.54) and the worst in North America (NSE, 0.10). However, the migration performance of GHM‐forced LSTM was significantly improved (NSE, 0.63) in ungauged basins when hydrological similarity was considered. This study stresses the advantages of GHM‐forced LSTM and shows that due significance should be attached to hydrological similarities among catchments to improve hydrological prediction in ungauged catchments.
- Research Article
121
- 10.5194/hess-27-139-2023
- Jan 9, 2023
- Hydrology and Earth System Sciences
Abstract. This study investigates the ability of long short-term memory (LSTM) neural networks to perform streamflow prediction at ungauged basins. A set of state-of-the-art, hydrological model-dependent regionalization methods are applied to 148 catchments in northeast North America and compared to an LSTM model that uses the exact same available data as the hydrological models. While conceptual model-based methods attempt to derive parameterizations at ungauged sites from other similar or nearby catchments, the LSTM model uses all available data in the region to maximize the information content and increase its robustness. Furthermore, by design, the LSTM does not require explicit definition of hydrological processes and derives its own structure from the provided data. The LSTM networks were able to clearly outperform the hydrological models in a leave-one-out cross-validation regionalization setting on most catchments in the study area, with the LSTM model outperforming the hydrological models in 93 % to 97 % of catchments depending on the hydrological model. Furthermore, for up to 78 % of the catchments, the LSTM model was able to predict streamflow more accurately on pseudo-ungauged catchments than hydrological models calibrated on the target data, showing that the LSTM model's structure was better suited to convert the meteorological data and geophysical descriptors into streamflow than the hydrological models even calibrated to those sites in these cases. Furthermore, the LSTM model robustness was tested by varying its hyperparameters, and still outperformed hydrological models in regionalization in almost all cases. Overall, LSTM networks have the potential to change the regionalization research landscape by providing clear improvement pathways over traditional methods in the field of streamflow prediction in ungauged catchments.
- Preprint Article
- 10.5194/egusphere-egu25-527
- Mar 18, 2025
Large-sample hydrology (LSH) datasets have advanced hydrological research by enabling studies across a wide range of catchments. Yet, the impact of landscape attributes included in such datasets on their ability to inform perceptual understanding of catchment behaviour remains underexplored. Here we investigate how the level of detail in maps used to derive catchment-scale geological attributes influences their correlation with streamflow signatures. For this, we used a set of streamflow signatures, and climate and landscape attributes available from the recently released EStreams dataset, alongside geological attributes derived from three geology maps of varying levels of detail: global, continental, and regional. These maps are perceived to have increasing levels of accuracy and were reclassified into four permeability classes. In order to explore scale-dependent effects, we moved from breadth to depth, that is, from a broad continental scale with less detailed analyses to a finer sub-catchment setting with more detailed investigations. We found that the correlation between streamflow signatures and geology attributes generally increased when using more detailed geological maps, drastically changing the perception of the importance of geology in influencing catchment behaviour relative to other landscape properties. In the Moselle catchment, a global geology map with other catchment attributes (e.g., climate and soils) failed to capture regional variations in many streamflow signatures. Moving to the sub-catchment level, we observed that smaller, nested sub-catchments exhibited unique correlation patterns, particularly for the baseflow index, emphasizing the nuanced controls at finer scales. Overall, regional and continental maps generally captured geological details better than global maps. This was particularly evident in areas with heterogeneous rock types, where global maps often oversimplified rock classifications. 
These findings underscore the importance of region-specific characteristics, which become even more pronounced at local scales, and play a crucial role in detecting meaningful correlations. This has implications for hydrological regionalization and predictions in ungauged catchments, suggesting that integrating high-quality, region-specific geological data into LSH studies is essential for accurate predictions and deeper insights into dominant streamflow generation processes.
- Preprint Article
- 10.5194/egusphere-egu2020-5085
- Mar 23, 2020
One of the open challenges in catchment hydrology is prediction in ungauged basins (PUB), i.e. being able to predict catchment responses (typically streamflow) when measurements are not available. One possible approach to this problem consists in calibrating a model using catchment response statistics (called signatures) that can be estimated at the ungauged site. An important challenge of any approach to PUB is to produce reliable and precise predictions of catchment response, with an accurate estimation of the uncertainty. In the context of PUB through calibration on regionalized streamflow signatures, there are multiple sources of uncertainty that affect streamflow predictions, which relate to: (a) the use of streamflow signatures, which, by synthesizing the underlying time series, reduce the information available for model calibration; (b) the regionalization of streamflow signatures, which are not observed, but estimated through some signature regionalization model; and (c) the use of a rainfall-runoff model, which carries uncertainties related to input data, parameter values, and model structure. This study proposes an approach that accounts for the uncertainty related to the regionalization of the signatures separately from the other types; the implementation uses Approximate Bayesian Computation (ABC) to infer the parameters of the rainfall-runoff model using stochastic streamflow signatures. The methodology is tested in six sub-catchments of the Thur catchment in Switzerland; results show that the regionalized model produces streamflow time series that are similar to the ones obtained by the classical time-domain calibration, with slightly higher uncertainty but similar fit to the observed data. These results support the proposed approach as a viable method for PUB, with a focus on the correct estimation of the uncertainty.
- Research Article
20
- 10.2166/nh.2013.099
- Dec 11, 2013
- Hydrology Research
Geomorphology-based rainfall–runoff models are particularly helpful for predicting hydrology in ungauged basins. The robustness, generality and flexibility of the modelling approach make it able to deal with a wide variety of processes, events and scales. It allows a rainfall–runoff transfer function to be estimated for any basin without needing to measure discharge. The aim of this study is to transpose hydrological observations from gauged to ungauged basins to predict streamflow hydrographs. It considers pairs of nested and neighbouring basins, the first one providing information for the second ungauged one. A time-series of the donor basin's discharge is deconvoluted by inverting its geomorphology-based transfer function to assess the time-series of net rainfall. The latter is then transposed to the receiver basin, where it is convoluted with the receiver basin's transfer function to predict the hydrograph therein. The methodology was implemented with virtual and real rainfall–runoff events on a set of basins in temperate Brittany, France. Different time scales and spatial configurations were tested. Goodness-of-fit of model predictions varied by basin pair. High prediction accuracy was observed when transposing hydrographs between nested basins differing greatly in size. Several ways to improve the approach are identified by relaxing simplifying assumptions.
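The deconvolution and convolution steps described above can be sketched with discrete transfer functions. The example assumes noise-free discharge, transfer functions whose first ordinate is non-zero, and net rainfall expressed as a depth so that no area scaling is needed. The transfer-function ordinates are invented for illustration, and the forward-substitution inverse used here is a simple stand-in rather than the inversion method of the study.

```python
def convolve(net_rain, transfer):
    """Discrete convolution of net rainfall with a transfer function."""
    out = [0.0] * (len(net_rain) + len(transfer) - 1)
    for i, r in enumerate(net_rain):
        for j, h in enumerate(transfer):
            out[i + j] += r * h
    return out


def deconvolve(discharge, transfer, n_rain):
    """Recover net rainfall by forward substitution (needs transfer[0] != 0)."""
    rain = []
    for t in range(n_rain):
        acc = discharge[t]
        # Subtract the contribution of rainfall from earlier time steps.
        for j in range(1, min(t, len(transfer) - 1) + 1):
            acc -= rain[t - j] * transfer[j]
        rain.append(acc / transfer[0])
    return rain


# Hypothetical unit-sum transfer functions for the donor and receiver basins.
h_donor = [0.5, 0.3, 0.2]
h_receiver = [0.2, 0.5, 0.3]

true_rain = [0.0, 5.0, 2.0, 0.0]
donor_discharge = convolve(true_rain, h_donor)          # "observed" at the donor
estimated_rain = deconvolve(donor_discharge, h_donor, len(true_rain))
receiver_hydrograph = convolve(estimated_rain, h_receiver)
```

With measured discharge, direct inversion of this kind amplifies noise, so a regularized or constrained inverse would normally be preferred.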