Abstract

Data-driven hydrologic modeling and forecasting with machine learning (ML) algorithms suffers from a number of pitfalls and challenges, including variable selection bias, resubstitution validation, inconsistent validation processes across algorithms, and model selection using the test set. These practices lead to incorrect model development and biased, overly optimistic performance estimates, and thus to unreliable models. This study presents a novel model-building and testing workflow that addresses these common ML challenges and pitfalls. The workflow incorporates optional variable transformation and preprocessing techniques, and applies to various ML model types, variable selection algorithms, and resampling techniques. We demonstrate its performance through streamflow forecasting for the Bow River Basin, Alberta, Canada, using four conventional ML algorithms (ANN, SVM, ELM, and RBF networks) driven by local hydrometeorological conditions and large-scale climate indices. Using the cross-validation average out-of-sample results and a separate test set, the prediction accuracy estimate bias (the relative difference between the model performance estimated on the validation sets and on a separate test set) was empirically estimated at 5.6%, 4.4%, 2.5%, and 3.0% for the Seasonal, May, June, and July models, respectively. In addition, the streamflow forecasting models had an average coefficient of determination of 0.85. Preprocessing and dimensionality reduction through principal component analysis (PCA) were detrimental to prediction accuracy. Snow water equivalent from individual snow courses proved the most important predictor of Bow River streamflow, while global climate indices, including the PDO, AMO, and PNA, increased the Nash–Sutcliffe efficiency (NSE) by 6% to 50%.
Finally, although forecasting skill decreased with increasing forecast lead time, satisfactory forecasts (NSE > 0.5) could be obtained two months ahead of the spring melt, at the end of February. Extensions of this study should address the tendency of different variable selection algorithms to pick irrelevant or redundant predictors after changes in the training data.
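The two quantities central to the abstract can be made concrete. The sketch below, in Python, computes the Nash–Sutcliffe efficiency from its standard definition and the prediction accuracy estimate bias as described above (the relative difference between a validation-based score and a separate test-set score). All function names and numerical values here are illustrative assumptions, not taken from the paper.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations
    of the observations from their mean. NSE = 1 is a perfect fit;
    NSE > 0.5 is the 'satisfactory' threshold cited in the abstract."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / ss_obs

def accuracy_estimate_bias(cv_score, test_score):
    """Relative difference between the cross-validation (out-of-sample)
    performance estimate and the separate test-set performance."""
    return abs(cv_score - test_score) / abs(test_score)

# Illustrative, made-up streamflow values (not data from the study):
obs = [10.0, 12.0, 9.0, 15.0, 11.0]
sim = [9.5, 12.5, 9.2, 14.0, 11.3]
print(round(nse(obs, sim), 3))                          # skill of sim vs obs
print(round(accuracy_estimate_bias(0.88, 0.85), 3))     # hypothetical scores
```

A small bias here indicates that the cross-validation estimate generalizes well to unseen data, which is the property the workflow is designed to preserve.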
