Abstract

This article presents a study of how machine learning methods can be applied to classify and identify river water regimes across a large region, the European territory of Russia. Hydrological observations accumulated over 60–80 years provide the information basis for such studies. The analysis uses mean monthly runoff at 351 hydrological gauges for the period 1945–2018. The methods applied are the most widely used clustering approaches (K-means, the EM algorithm, agglomerative hierarchical clustering, and DBSCAN) as well as gradient boosting (CatBoost). The clustering and classification algorithms were given eight parameters as inputs. The most distinct and stable clusters form with three parameters: the highest silhouette coefficient (SS = 0.3–0.5) is obtained using the month numbers of maximum and minimum runoff and the ratio of maximum to minimum water flow. DBSCAN gives the best result (SS = 0.6–0.7). Supervised classification models also agree closely with the reference classification, with an accuracy of 87%. Both the clustering and the classification methods show a shift of the clusters representing southern water regimes; in the central region these regimes expanded 1000 km to the north. The results further demonstrate that currently available data already make it possible to apply machine learning methods to the analysis of hydrological data, and that clusters corresponding to different types of water regime can be obtained with contemporary clustering algorithms. The study shows that over the past 40 years the southern types of water regime have shifted noticeably to the north.
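The three parameters named above, and the silhouette coefficient used to compare clusterings, can be illustrated with a minimal sketch. It is not the authors' code: the function names `regime_features` and `silhouette_score`, the sample runoff values, and the use of unscaled Euclidean distance are all assumptions made for illustration, and the article does not say how the features were scaled or how the cyclic month numbers were handled.

```python
from math import dist
from statistics import mean


def regime_features(monthly_runoff):
    """Return (month of max runoff, month of min runoff, max/min ratio).

    monthly_runoff: 12 mean monthly flows, January first (assumed layout).
    """
    if len(monthly_runoff) != 12:
        raise ValueError("expected 12 monthly values")
    q_max, q_min = max(monthly_runoff), min(monthly_runoff)
    if q_min <= 0:
        raise ValueError("minimum runoff must be positive to form a ratio")
    month_max = monthly_runoff.index(q_max) + 1  # 1 = January
    month_min = monthly_runoff.index(q_min) + 1
    return (month_max, month_min, q_max / q_min)


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster
        own = [dist(p, q) for j, (q, l) in enumerate(zip(points, labels))
               if l == lab and j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = mean(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(p, q) for q, l in zip(points, labels) if l == other)
            for other in clusters - {lab}
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Hypothetical gauge with a spring snowmelt peak (values are made up).
snowmelt_gauge = [10, 9, 12, 80, 120, 40, 20, 15, 14, 16, 15, 12]
features = regime_features(snowmelt_gauge)
```

In practice the feature vectors of all gauges would be passed, with the cluster labels produced by K-means, DBSCAN, or another algorithm, to `silhouette_score`; values near 1 indicate compact, well-separated clusters, and values near 0 or below indicate overlapping ones.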

