Reliable daily-scale estimation of pan evaporation (Epan) is vital for sustainable water and agricultural management, especially for designing water use allocations, managing irrigation systems, and assessing demand and utilization. However, designing a reliable computational methodology for Epan estimation is challenging because of its stochastic nature. In this study, the potential of five machine learning (ML) models, namely Hammerstein-Wiener (HW), Random Forest (RF), Boosted Regression Tree (BRT), Long Short-Term Memory (LSTM), and Stepwise Linear Regression (SWLR), was evaluated for Epan modeling in the Lake Hawassa catchment, Ethiopia. Afterward, hybrid SWLR-LSTM, SWLR-RF, SWLR-BRT, and SWLR-HW models were developed to benefit from the combined strengths of the linear and non-linear models. The estimation results were evaluated using the determination coefficient (R2), Nash-Sutcliffe efficiency (NSE), mean absolute error (MAE), root mean square error (RMSE), index of agreement (IA), and Kling-Gupta efficiency (KGE). Among the individual models, RF gave the most accurate estimates, with NSE = 0.938, IA = 0.956, R2 = 0.94, KGE = 0.852, RMSE = 0.483 mm/day and Pbias = -1.193 mm/day during the validation phase. Among the hybrid models, SWLR-RF provided the best predictive performance, improving on SWLR, HW, LSTM, BRT and RF by 26.563 %, 21.348 %, 15.577 %, 9.091 % and 3.514 %, respectively, based on the validation-phase NSE value. Overall, the results demonstrate the capability of deep learning, ensemble tree and hybrid linear-nonlinear models for Epan estimation in the data-scarce catchment.
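As an illustration of the evaluation criteria named above, the following is a minimal sketch of how they might be computed from paired observed and estimated Epan values. It assumes the commonly used textbook definitions (R2 as the squared Pearson correlation, the Willmott form of IA, and the 2009 form of KGE); the abstract does not state which variants the study used, and the function names and sample values here are hypothetical.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def evaluate(obs, pred):
    """Goodness-of-fit metrics for observed vs. estimated values.

    Assumes standard definitions; the study's exact variants may differ.
    """
    n = len(obs)
    mo, mp = _mean(obs), _mean(pred)
    pairs = list(zip(obs, pred))

    sq_err = sum((o - p) ** 2 for o, p in pairs)       # sum of squared errors
    ss_o = sum((o - mo) ** 2 for o in obs)             # spread of observations
    ss_p = sum((p - mp) ** 2 for p in pred)            # spread of estimates
    cov = sum((o - mo) * (p - mp) for o, p in pairs)

    r = cov / math.sqrt(ss_o * ss_p)                   # Pearson correlation
    alpha = math.sqrt(ss_p / ss_o)                     # variability ratio
    beta = mp / mo                                     # bias ratio
    ia_den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in pairs)

    return {
        "R2": r ** 2,
        "NSE": 1 - sq_err / ss_o,
        "MAE": sum(abs(o - p) for o, p in pairs) / n,
        "RMSE": math.sqrt(sq_err / n),
        "IA": 1 - sq_err / ia_den,
        "KGE": 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2),
    }


def relative_improvement(base, hybrid):
    """Percentage gain of a hybrid model's score over a base model's score."""
    return (hybrid - base) / base * 100


# Made-up values for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [2.0, 3.0, 4.0, 5.0]
metrics = evaluate(observed, estimated)
```

The `relative_improvement` helper reflects one plausible reading of how the percentage gains in NSE were obtained (hybrid score relative to each base model's score); this is an assumption rather than a formula given in the abstract.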