Abstract

Long short-term memory (LSTM) networks offer unprecedented accuracy for prediction in ungauged basins. Using k-fold validation, we trained and tested several LSTMs on 531 basins from the CAMELS data set, which allowed us to make predictions in basins with no training data. The training and test data set contained 30 years of daily rainfall-runoff data from US catchments ranging in size from 4 to 2,000 km², with aridity indexes from 0.22 to 5.20, covering 12 of the 13 IGBP vegetated land cover classes. Over a 15-year validation period, this effectively "ungauged" model was benchmarked against the Sacramento Soil Moisture Accounting (SAC-SMA) model and the NOAA National Water Model reanalysis. SAC-SMA was calibrated separately for each basin using 15 years of daily data. Across the 531 basins, the out-of-sample LSTM achieved a higher median Nash-Sutcliffe Efficiency (0.69) than either the calibrated SAC-SMA (0.64) or the National Water Model (0.58). This indicates that available catchment attributes usually contain enough information about similarities and differences in catchment-level rainfall-runoff behavior to generate out-of-sample simulations that are, on average, more accurate than current models under ideal (i.e., calibrated) conditions. We found preliminary evidence that adding physical constraints to the LSTM models improves simulations, which we believe should be a focus of future physics-guided machine learning research.
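
The evaluation setup described above can be illustrated with a short sketch. The code below is not from the paper; `train_lstm`, `simulate`, the basin ID list, and the fold count are hypothetical placeholders. It shows a leave-basins-out k-fold split, so test basins are never seen during training, together with the Nash-Sutcliffe Efficiency (NSE) used to compare models.

```python
# Hypothetical sketch (not the authors' code) of a leave-basins-out
# k-fold evaluation and the Nash-Sutcliffe Efficiency (NSE) metric.
import numpy as np
from sklearn.model_selection import KFold

def nse(obs: np.ndarray, sim: np.ndarray) -> float:
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

basin_ids = np.arange(531)  # placeholder for the 531 CAMELS basin IDs
kf = KFold(n_splits=10, shuffle=True, random_state=0)  # fold count is illustrative

fold_nse = []
for train_idx, test_idx in kf.split(basin_ids):
    train_basins, test_basins = basin_ids[train_idx], basin_ids[test_idx]
    # model = train_lstm(train_basins)        # hypothetical training call
    # for b in test_basins:                   # basins unseen during training
    #     obs, sim = simulate(model, b)       # hypothetical simulation call
    #     fold_nse.append(nse(obs, sim))

# The median NSE across all held-out ("ungauged") basins is then compared
# against calibrated SAC-SMA and National Water Model benchmarks.
```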
