Abstract. This study investigates the ability of long short-term memory (LSTM) neural networks to perform streamflow prediction at ungauged basins. A set of state-of-the-art, hydrological model-dependent regionalization methods are applied to 148 catchments in northeast North America and compared to an LSTM model that uses the exact same available data as the hydrological models. While conceptual model-based methods attempt to derive parameterizations at ungauged sites from other similar or nearby catchments, the LSTM model uses all available data in the region to maximize the information content and increase its robustness. Furthermore, by design, the LSTM does not require explicit definition of hydrological processes and derives its own structure from the provided data. The LSTM networks were able to clearly outperform the hydrological models in a leave-one-out cross-validation regionalization setting on most catchments in the study area, with the LSTM model outperforming the hydrological models in 93 % to 97 % of catchments depending on the hydrological model. Furthermore, for up to 78 % of the catchments, the LSTM model was able to predict streamflow more accurately on pseudo-ungauged catchments than hydrological models calibrated on the target data, showing that the LSTM model's structure was better suited to convert the meteorological data and geophysical descriptors into streamflow than the hydrological models even calibrated to those sites in these cases. Furthermore, the LSTM model robustness was tested by varying its hyperparameters, and still outperformed hydrological models in regionalization in almost all cases. Overall, LSTM networks have the potential to change the regionalization research landscape by providing clear improvement pathways over traditional methods in the field of streamflow prediction in ungauged catchments.
Read full abstract