Abstract

Accurate medium- and long-term precipitation forecasting plays a vital role in disaster prevention and mitigation and in the rational allocation of water resources. In recent years, various methods for medium- and long-term precipitation forecasting based on machine learning algorithms have been developed. However, machine learning places high demands on the size of the sample data. This article therefore proposes a data augmentation algorithm based on the K-means clustering algorithm and the synthetic minority oversampling technique (SMOTE), which can effectively enhance sample information. Random forest (RF), extreme gradient boosting (XGB), recurrent neural network (RNN), and long short-term memory (LSTM) models are then constructed to forecast monthly grid precipitation of the Danjiangkou River Basin. This study aims to improve the accuracy of medium- and long-term precipitation forecasting. The main results are twofold: 1) in most years, the anomaly correlation coefficient and Pg score of SMOTE-km-XGB and SMOTE-km-RF exceed those of XGB and RF; furthermore, the SMOTE-km-XGB method is more suitable than the other three methods for precipitation forecasting in the studied basin; and 2) the forecasting results of the two deep learning methods (RNN and LSTM) show that sample data processed by the K-means clustering algorithm and the SMOTE data augmentation algorithm did not yield considerable improvements in deep learning. This study improves the accuracy of precipitation forecasts by expanding and balancing the information in the sample data, and provides a new research idea for improving the accuracy of medium- and long-term hydrological forecasting.
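The abstract does not spell out how K-means and SMOTE are combined, so the following is only a minimal sketch of one plausible reading: cluster the samples with K-means, then grow each smaller cluster to the size of the largest one by SMOTE-style interpolation between a sample and one of its nearest neighbors in the same cluster. The function names (`kmeans`, `smote_cluster`, `kmeans_smote`), the parameters `k` and `k_neighbors`, and the choice to treat each sample as a single tuple of predictors plus target are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means on a list of tuples; returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from k distinct samples
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def smote_cluster(members, n_new, k_neighbors, rng):
    """Create n_new synthetic samples by interpolating within one cluster."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(members)
        # Nearest neighbors of the base sample inside the same cluster.
        neighbors = sorted((m for m in members if m is not base),
                           key=lambda m: math.dist(base, m))[:k_neighbors]
        if not neighbors:  # single-member cluster: fall back to duplication
            synthetic.append(base)
            continue
        other = rng.choice(neighbors)
        gap = rng.random()  # random position on the segment base -> other
        synthetic.append(tuple(b + gap * (o - b)
                               for b, o in zip(base, other)))
    return synthetic


def kmeans_smote(samples, k=3, k_neighbors=5, seed=0):
    """Return the original samples followed by the synthetic ones."""
    rng = random.Random(seed)
    labels = kmeans(samples, k, seed=seed)
    clusters = [[s for s, lab in zip(samples, labels) if lab == c]
                for c in range(k)]
    target = max(len(c) for c in clusters)  # size of the largest cluster
    augmented = list(samples)
    for members in clusters:
        if members:
            augmented += smote_cluster(members, target - len(members),
                                       k_neighbors, rng)
    return augmented
```

Because every synthetic sample is a convex combination of two existing samples from the same cluster, the augmented set stays inside the range of the observed data. Whether the target value should be interpolated together with the predictors, as done here, is a design choice the abstract leaves open.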

Highlights

  • Medium- and long-term precipitation forecasting is an important part of hydrological science, and always plays a key role in flood control, disaster reduction, and the comprehensive utilization of water resources

  • Taking the Danjiangkou River Basin as the study area, this article constructs a data augmentation algorithm based on the K-means clustering algorithm and the synthetic minority oversampling technique (SMOTE) to expand the precipitation series

  • The differences, advantages, and disadvantages of prediction results between shallow machine learning (ML) and deep learning models are discussed in depth


Summary

Introduction

Medium- and long-term precipitation forecasting is an important part of hydrological science, and always plays a key role in flood control, disaster reduction, and the comprehensive utilization of water resources. As the forecast period lengthens, more influencing factors come into play, which introduces more uncertainty and reduces forecasting accuracy. This has always been a difficult point in the field of precipitation forecasting. With the rapid development of computer technology, machine learning (ML) methods based on Big Data mining technology have gradually been applied to medium- and long-term precipitation forecasting because of their high generalization ability and strong robustness.

