Abstract
Estimates of flood quantiles for ungauged watersheds are crucial for water resources management but are challenging to obtain because the hydrological system is complex and nonlinear. For ungauged watersheds, estimating flood quantiles relies on various interdependent physiometeorological variables, many of which are not adequately considered in regional flood frequency analysis (RFFA). In this study, we used the random forest (RF) and support vector regression (SVR) algorithms, which can learn the nonlinear relationship between physiometeorological variables and flood quantiles, for RFFA. Thirteen physiometeorological variables that had not previously been employed together were used to estimate the 10-year, 50-year, and 100-year return period flood quantiles (Q10, Q50, and Q100) for 39 watersheds spread across India. The RF and SVR models were trained on 29 (75%) watersheds to estimate individual flood quantiles and were subsequently tested on the remaining 10 (25%) ungauged watersheds. The R² achieved by RF is 0.862, 0.813, and 0.845, and by SVR is 0.807, 0.793, and 0.789, for Q10, Q50, and Q100, respectively. Overall, the results indicate that RF can effectively learn the nonlinear relationships, while SVR with a linear kernel requires further improvement to estimate reliable flood quantiles. The study demonstrates that machine learning algorithms, with appropriate physiometeorological input datasets, can be used to estimate flood quantiles even in data-sparse regions.
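The train-and-test workflow described above can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic placeholder data in place of the 13 physiometeorological variables, and picks hyperparameters (`n_estimators`, `C`), feature scaling, and random seeds arbitrarily. It fits one model per algorithm for a single stand-in quantile; the study fitted separate models for Q10, Q50, and Q100.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic placeholder data (NOT the study's data):
# 39 watersheds, each described by 13 predictor variables.
n_watersheds, n_features = 39, 13
X = rng.normal(size=(n_watersheds, n_features))

# Hypothetical nonlinear target standing in for one flood quantile (e.g. Q10).
y = np.exp(0.5 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.normal(size=n_watersheds)

# Hold out 10 watersheds as pseudo-ungauged test sites; train on the other 29.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=10, random_state=0
)

# Hyperparameters and the scaling step are illustrative assumptions.
models = {
    "RF": RandomForestRegressor(n_estimators=500, random_state=0),
    "SVR": make_pipeline(StandardScaler(), SVR(kernel="linear", C=1.0)),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Coefficient of determination on the held-out watersheds.
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R2 on held-out watersheds = {scores[name]:.3f}")
```

The scores printed here reflect only the synthetic data and say nothing about the reported results. With so few test sites, a single 29/10 split gives a noisy estimate, so repeating the split with different seeds is a reasonable check on stability.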