Abstract
In regional hydrological regression, a relationship is sought to predict flood statistics and/or flood quantiles from catchment characteristics used as predictor variables. The developed regression relationships often underestimate the true prediction error, which introduces significant uncertainty into the flood estimates. In this paper, a simple but effective procedure, Monte Carlo cross validation (MCCV), is applied to select the best regression model. Compared with the one-at-a-time validation (OAT-CV) and split-sample validation procedures often used for cross validation of regression models, MCCV is shown to have a larger chance of selecting the right model. MCCV can also avoid an unnecessarily large model, which reduces the risk of overfitting. For an established model, however, MCCV may on average perform poorly and may not always provide the most realistic prediction errors. More realistic estimates of prediction error for hydrological regression models can be achieved by a slight modification of the standard MCCV approach, in which a correction term is added to MCCV. The basic methods for carrying out the MCCV analysis are discussed, followed by the derivation of the prediction error for an established model. To determine the true strength of the corrected MCCV (CMCCV) method, numerical experiments are undertaken on a set of simulated data obtained for a specific model, which allows the capabilities of MCCV and CMCCV to be explored in different applications. MCCV and CMCCV are also tested on a real regional flood dataset for the state of New South Wales (NSW), Australia. The results of the numerical experiments show that MCCV has a larger probability than OAT-CV of selecting the model with the optimum prediction ability and the correct variables, and can therefore provide more accurate results than OAT-CV.
The results obtained from the NSW dataset demonstrate that MCCV can select an appropriate model with two variables for the 10-year average recurrence interval (ARI) flood quantile, while at the same time assessing the prediction ability of the selected model with greater accuracy. The use of MCCV helps to overcome some of the difficulties in identifying the most influential variables in hydrological regression models.
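As an illustration of the general idea, the standard MCCV procedure for model selection can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in the paper: it assumes ordinary least squares regression, a 50% validation fraction, an exhaustive search over predictor subsets, and synthetic data in which only the first two of four predictors carry signal. The function names `mccv_error` and `select_by_mccv` are hypothetical, and the correction term used in CMCCV is not reproduced because the abstract does not specify it.

```python
import itertools
import numpy as np


def mccv_error(X, y, n_splits=200, val_frac=0.5, rng=None):
    """Mean squared prediction error averaged over random
    calibration/validation splits (standard MCCV, no correction term)."""
    rng = np.random.default_rng(rng)
    n = len(y)
    n_val = max(1, int(round(val_frac * n)))
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    errors = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        val, cal = perm[:n_val], perm[n_val:]
        # Fit OLS on the calibration part only
        beta, *_ = np.linalg.lstsq(A[cal], y[cal], rcond=None)
        resid = y[val] - A[val] @ beta
        errors.append(np.mean(resid ** 2))
    return float(np.mean(errors))


def select_by_mccv(X, y, **kwargs):
    """Score every non-empty predictor subset and return the one
    with the smallest MCCV prediction error, plus all scores."""
    p = X.shape[1]
    scores = {}
    for k in range(1, p + 1):
        for subset in itertools.combinations(range(p), k):
            scores[subset] = mccv_error(X[:, list(subset)], y, **kwargs)
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic example (assumed data, not the NSW dataset):
# y depends on predictors 0 and 1 only; predictors 2 and 3 are noise.
data_rng = np.random.default_rng(42)
n = 60
X = data_rng.normal(size=(n, 4))
y = 2.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + data_rng.normal(scale=0.5, size=n)

# Passing the same integer seed gives every candidate model the same splits.
best, scores = select_by_mccv(X, y, n_splits=100, val_frac=0.5, rng=0)
print("selected predictors:", best)
```

Leaving out a large fraction of the sites in each split, rather than a single site as in OAT-CV, is what penalises unnecessarily large models in this sketch; the validation fraction and the number of splits are tuning choices and would need to be set for the data at hand.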