Prediction of river and lake water temperature plays an important role in hydrology, ecology, and water resources planning and management. Recently, machine learning approaches have been widely used for modelling water temperature, and the results obtained vary depending on the kind of model and the selection of appropriate predictors. In the present paper, a new family of machine learning models is proposed and compared with the well-known air2stream model, using a large data set collected at 25 lakes in the northern part of Poland. The proposed models were: (i) the extremely randomized trees (ERT), (ii) the multivariate adaptive regression splines (MARS), (iii) the M5 model tree (M5Tree), (iv) the random forest (RF), and (v) the multilayer perceptron neural network (MLPNN). The models were developed using air temperature and the components of the Gregorian calendar (year, month, and day number) as input variables. The results were evaluated using several statistical indices: the root mean square error (RMSE), the mean absolute error (MAE), the correlation coefficient (R), and the Nash-Sutcliffe efficiency coefficient (NSE). The results reveal that the air2stream model outperformed all the machine learning models and achieved high accuracy at all 25 lakes; none of the ERT, MARS, M5Tree, RF, or MLPNN models improved the water temperature prediction compared with air2stream.
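For readers unfamiliar with the four evaluation indices named above, the following is a minimal sketch of their standard textbook definitions in plain Python. It is not the authors' code: the function names and the sample values are hypothetical and chosen only for illustration, and the study's exact computational details (for example, how missing observations were handled) are not stated in the abstract.

```python
import math
from typing import Sequence


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def pearson_r(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Pearson correlation coefficient (R) between observed and predicted values."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def nse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared error
    to the variance of the observations around their mean."""
    mean_o = sum(obs) / len(obs)
    sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sq_dev = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - sq_err / sq_dev


# Hypothetical water temperatures in degrees Celsius (not data from the study).
observed = [4.1, 9.8, 17.5, 21.3, 12.0]
predicted = [4.6, 9.1, 18.2, 20.7, 12.4]

for name, metric in [("RMSE", rmse), ("MAE", mae), ("R", pearson_r), ("NSE", nse)]:
    print(f"{name}: {metric(observed, predicted):.3f}")
```

Lower RMSE and MAE indicate a better fit, while R and NSE approach 1 for a perfect prediction. An NSE of 0 means the model is no better than predicting the mean of the observations.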