Machine learning models (MLMs), including support vector regression (SVR), multivariate adaptive regression splines (MARS), boosted regression trees (BRT), and projection pursuit regression (PPR), are compared with a traditional method, nonlinear regression (NLR), in regional flood frequency analysis (RFFA). The study considers the Karun and Karkheh watersheds, which are located in southwestern Iran and share the same climatic and physiographic conditions. Fifty-four hydrometric stations with a 21-year record (1993–2013), selected according to the instructions of U.S. Federal Agencies Bulletin 17B, are used for RFFA. The generalized normal (GNO) probability distribution function (PDF) is selected by the L-moment method from among five PDFs, namely the GNO, generalized Pareto (GP), generalized logistic (GL), generalized extreme value (GEV), and Pearson type 3 (P III), to estimate flood discharge for the expected return periods. Twenty-five predictor variables, covering physiographic, climatologic, geologic, soil, and land use characteristics, are extracted. Using the gamma test (GT), fallow land, maximum 24-h rainfall, mean watershed slope, compactness coefficient, and mean and maximum watershed elevation are identified as the appropriate combination of input variables. The overall results indicate that the SVR, PPR, and MARS models perform better than the NLR and BRT models in estimating flood discharge for the expected return periods. Furthermore, the SVR model based on the radial basis function (RBF) kernel is chosen as the best model in terms of the mean Nash-Sutcliffe coefficient (M-Ef) and the mean relative root mean squared error (M-RMSEr) (i.e. 0.94 and 63.93, respectively) across the different return periods.
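The two evaluation criteria named above can be illustrated with a minimal Python sketch. It is not the authors' code. The Nash-Sutcliffe coefficient follows its standard definition, but the abstract does not give the exact formula for the relative RMSE, so the percentage form used here (root mean square of the relative errors, times 100) is an assumption. The function names and the sample discharges are hypothetical.

```python
import math


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SS_res / SS_tot (1.0 is a perfect fit)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_rmse(observed, predicted):
    """Relative RMSE in percent.

    Assumed definition: 100 * sqrt(mean(((obs - pred) / obs)^2)).
    The paper may normalize differently.
    """
    rel_sq = [((o - p) / o) ** 2 for o, p in zip(observed, predicted)]
    return 100.0 * math.sqrt(sum(rel_sq) / len(rel_sq))


def mean_over_return_periods(metric, observed_by_period, predicted_by_period):
    """Average a metric over return periods (the 'M-' prefix in M-Ef, M-RMSEr)."""
    scores = [
        metric(observed_by_period[t], predicted_by_period[t])
        for t in observed_by_period
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical flood quantiles (m^3/s) at three stations, two return periods.
    obs = {10: [120.0, 340.0, 560.0], 100: [310.0, 820.0, 1400.0]}
    pred = {10: [130.0, 320.0, 590.0], 100: [280.0, 870.0, 1350.0]}
    print("M-Ef:", mean_over_return_periods(nash_sutcliffe, obs, pred))
    print("M-RMSEr (%):", mean_over_return_periods(relative_rmse, obs, pred))
```

Under these definitions, an efficiency closer to 1 and a relative RMSE closer to 0 both indicate a better fit, which is why the two criteria are reported together.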