Flood frequency analysis at large scales, essential for the development of flood risk maps, is hindered by the scarcity of gauge flow data. Suitable methods are thus required to predict flooding in ungauged basins, a notoriously complex problem in hydrology. We develop a Bayesian hierarchical model (BHM) based on the generalized extreme value (GEV) and generalized Pareto distributions for regional flood frequency analysis at high resolution across a large part of North America. Our model leverages annual maximum flow data from ≈20,000 gauged stations and a dataset of 130 static catchment-specific covariates to predict extreme flows, together with their associated statistical uncertainty, at all catchments over the continent. Additionally, we modify the data layer of the BHM to include peaks-over-threshold flow data when available, which improves the precision of the discharge level estimates. We validated the model using a hold-out approach and found that its predictive power is very good for the GEV location and scale parameters but leaves room for improvement for the shape parameter, which is notoriously hard to estimate. The resulting discharge return levels agree satisfactorily with the available design peak discharges from various government sources. The assessment of the covariates’ contributions to the model is also informative with regard to the most relevant underlying factors influencing flood-inducing peak flows. According to the developed aggregate importance score, the key covariates in our model are temperature-related bioindicators, the catchment drainage area and the geographical location.
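As an illustration of the return levels mentioned above, the following minimal sketch evaluates the standard GEV quantile formula for a given return period. It is not the model developed in this work: the function names and the parameter values in the usage note are hypothetical, and the sketch assumes the common convention in which a positive shape parameter gives a heavy upper tail.

```python
import math


def gev_return_level(mu, sigma, xi, T):
    """Level exceeded on average once every T years under GEV(mu, sigma, xi).

    Assumed convention: xi > 0 is heavy-tailed, xi = 0 is the Gumbel limit.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if T <= 1:
        raise ValueError("T must be greater than 1")
    # Reduced variate: -log of the annual non-exceedance probability 1 - 1/T.
    y = -math.log(1.0 - 1.0 / T)
    if abs(xi) < 1e-9:
        # Gumbel limit of the quantile formula as xi -> 0.
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


def gev_cdf(z, mu, sigma, xi):
    """GEV cumulative distribution function, same convention as above."""
    if abs(xi) < 1e-9:
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))
```

For example, with hypothetical parameters `mu=100.0`, `sigma=30.0`, `xi=0.1` (in arbitrary discharge units), `gev_return_level(100.0, 30.0, 0.1, 100)` gives the 100-year level, and passing that value to `gev_cdf` recovers a non-exceedance probability of 0.99.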