Abstract

Recently published literature has repeatedly shown that machine learning (ML) algorithms (including LSTMs, GRUs, and Transformers) and conceptual lumped hydrological models (such as SAC-SMA and HBV) perform more reliably in hindcast and forecast flood-prediction intercomparison experiments than more sophisticated high-resolution hydrological models. These provocative results have challenged decades of development of physics-based hydrological models for streamflow prediction, which appear more sensitive to errors in the forcing precipitation data and to the spatial description of landscape attributes. Thus, the long-standing promise that a better and more detailed understanding and description of hydrological processes would yield better predictions of streamflow fluctuations (including floods and droughts) is yet to be fulfilled. In a recently published study by our research group, we proposed and tested a methodology to benchmark ML algorithms using synthetic data generated by physics-based hydrological models under tightly controlled conditions. Our approach combined the hillslope-link distributed hydrological model (HLM), implemented over a 4,500 km² basin, with precipitation fields created using the stochastic storm transposition (SST) framework. We demonstrated that ML algorithms can effectively identify the input-output relations between basin-averaged rainfall and streamflow time series at multiple sub-basin outlets under very general conditions of space-time variability of flood-generating storm systems. This result matches the performance reported for ML algorithms under a wide variety of conditions. We are extending our work to ask a new question: How reliable are trained ML algorithms and calibrated lumped hydrological models at predicting floods that have never been observed in the “historical” record? This question goes to the heart of what these black/grey-box and conceptual tools represent mathematically: a deterministic estimate of the input-output relationship between rainfall and streamflow. Therefore, when any of these black-box models predicts a flood, there are two possible scenarios: (1) interpolation, in which the predicted hydrograph and peak flow lie within the range of floods observed in the past, and (2) extrapolation, in which the predicted event is significantly larger than anything observed in the past. In this study, we will present the results of controlled experiments that investigate this question and show which classes of algorithms are less susceptible to over- or under-estimation when extrapolating beyond the range of the “historical” record. We will present results for hourly and daily prediction timescales. This investigation is highly relevant in the current environment of climate change, where the water-holding capacity of the atmosphere increases with every degree of warming, leading to storms that seem to break record after record in intensity, duration, and spatial coverage.
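To make the interpolation/extrapolation distinction concrete, the following minimal Python sketch labels each "future" flood peak relative to the range of a training record. All names, the Gumbel-distributed toy peaks, and the chronological split are illustrative assumptions, not the study's actual data or code:

```python
import numpy as np

def classify_prediction(peak_flow: float, training_peaks: np.ndarray) -> str:
    """Label a flood peak relative to the training ('historical') record:
    'interpolation' if it lies within the range of past events,
    'extrapolation' if it exceeds the largest event seen in training."""
    return "interpolation" if peak_flow <= training_peaks.max() else "extrapolation"

# Toy record of annual peak discharges (m^3/s); a Gumbel draw stands in for
# synthetic floods such as those generated with HLM + stochastic storm
# transposition in the study described above.
rng = np.random.default_rng(42)
record = rng.gumbel(loc=500.0, scale=150.0, size=200)

# Chronological split into an observed "past" and an unobserved "future".
train, test = record[:150], record[150:]

labels = [classify_prediction(q, train) for q in test]
n_extra = labels.count("extrapolation")
print(f"training-record maximum: {train.max():.1f} m^3/s")
print(f"{n_extra} of {len(test)} future events exceed anything in the record")
```

Any deterministic data-driven model fitted only to `train` must extrapolate for the flagged events; the controlled experiments described above quantify how strongly different classes of models over- or under-estimate in exactly those cases.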
