Comment on hess-2021-642

Oscar Manuel Baez Villanueva

doi:10.5194/hess-2021-642-rc2

Abstract

Although many multi-source precipitation products (MSPs) with high spatio-temporal resolution have been extensively used in water cycle research, they are still subject to considerable uncertainties due to the spatial variability of terrain. Effective detection of precipitation occurrence is the key to enhancing precipitation accuracy. This study presents a two-step merging strategy to incorporate MSPs (GSMaP, IMERG, TMPA, PERSIANN-CDR, CMORPH, CHIRPS, and ERA-Interim) and rain gauges to improve the precipitation capture capacity and precipitation intensity simultaneously during 2000–2017 over China. Multiple environmental variables and the spatial autocorrelation between precipitation observations are selected as auxiliary variables in the merging process. Three machine learning (ML) classification and regression models, including gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and random forest (RF), are adopted and compared. The strategy first employs classification models to identify wet and dry days in warm and cold seasons, then combines regression models to predict precipitation amounts based on wet days. The results are also compared with those of traditional methods, including multiple linear regression (MLR), ML regression models, and gauge-based Kriging interpolation. A total of 1680 (70 %) rain gauges are randomly chosen for model training and 692 (30 %) for performance evaluation. The results show that: (1) The multi-source merged precipitation products (MSMPs) perform better than the original MSPs in detecting precipitation occurrence under different intensities, followed by Kriging. The average Heidke Skill Score (HSS) of MSPs, Kriging, and MSMPs is 0.30–0.69, 0.71, and 0.79–0.80, respectively. (2) The proposed method significantly alleviates the bias and deviation of the original MSPs in both the temporal and spatial dimensions. The MSMPs strongly correlate with gauge observations, with a correlation coefficient (CC) of 0.85. Moreover, the modified Kling-Gupta efficiency (KGE) improves by 17 %–62 % (MSMPs: 0.74–0.76) compared with MSPs (0.34–0.65). (3) The spatial autocorrelation factor (KP) is the most important variable in the models and contributes considerably to improving model accuracy. (4) The proposed method outperforms MLR and ML regression models, and the XGBoost algorithm is recommended for large-scale data merging owing to its high computational efficiency. This study provides a robust and reliable method to improve the performance of precipitation data with full consideration of multi-source information. This method could be applied globally and produce large-scale precipitation products if rain gauges are available.
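
The two-step idea described in the abstract (classify wet and dry days first, then estimate amounts only for wet days) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function `two_step_merge`, the toy classifier and regressor, and the 0.1 mm threshold are all hypothetical stand-ins for the trained GBDT, XGBoost, or RF models, and the input tuples stand in for per-day estimates from three precipitation products.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]


def two_step_merge(
    samples: Sequence[Features],
    is_wet: Callable[[Features], bool],
    wet_amount: Callable[[Features], float],
) -> List[float]:
    """Step 1: classify each sample as wet or dry.
    Step 2: run the regressor on wet samples only; dry samples get 0."""
    merged = []
    for x in samples:
        if is_wet(x):
            # Clip at zero so a regressor can never return negative rain.
            merged.append(max(0.0, wet_amount(x)))
        else:
            merged.append(0.0)
    return merged


# Hypothetical stand-ins for trained models (assumptions, not from the paper).
def toy_classifier(x: Features) -> bool:
    # Wet if at least two products report more than 0.1 mm.
    return sum(v > 0.1 for v in x) >= 2


def toy_regressor(x: Features) -> float:
    # Plain average of the product estimates.
    return sum(x) / len(x)


samples = [(0.0, 0.2, 0.0), (5.0, 7.0, 6.0), (0.3, 0.0, 0.4)]
merged = two_step_merge(samples, toy_classifier, toy_regressor)
print(merged)
```

Keeping the classifier and regressor as separate callables mirrors the structure in the abstract: the regression model never sees days the classifier labels as dry, so occurrence errors and intensity errors can be assessed separately.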

Highlights

As one of the critical parameters of the natural water cycle, precipitation helps us realistically understand the interaction between hydrological and climate systems.
This study proposes a two-step merging strategy to simultaneously enhance the precipitation discrimination ability and precipitation intensity over China by incorporating multi-source precipitation products (MSPs) and relatively high-density rain gauges based on machine learning (ML) algorithms.
The categorical metrics focus on analyzing the ability of products to capture precipitation events, including the probability of detection (POD), false alarm ratio (FAR), critical success index (CSI), Precision, frequency bias (FB), Heidke Skill Score (HSS), and classification accuracy (Accuracy).
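
The categorical metrics listed above are conventionally computed from a 2x2 contingency table of hits, false alarms, misses, and correct negatives. The sketch below uses the standard textbook definitions; the paper's exact formulas and wet-day threshold are not given here, so the function name `categorical_metrics`, the 0.1 mm threshold, and the sample data are assumptions for illustration only.

```python
from typing import Dict, Sequence


def _ratio(num: float, den: float) -> float:
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def categorical_metrics(
    observed: Sequence[float],
    predicted: Sequence[float],
    threshold: float = 0.1,  # assumed wet-day threshold in mm, not from the paper
) -> Dict[str, float]:
    a = b = c = d = 0  # hits, false alarms, misses, correct negatives
    for o, p in zip(observed, predicted):
        obs_wet, pred_wet = o >= threshold, p >= threshold
        if obs_wet and pred_wet:
            a += 1
        elif pred_wet:
            b += 1
        elif obs_wet:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    return {
        "POD": _ratio(a, a + c),
        "FAR": _ratio(b, a + b),
        "CSI": _ratio(a, a + b + c),
        "Precision": _ratio(a, a + b),
        "FB": _ratio(a + b, a + c),
        "Accuracy": _ratio(a + d, n),
        "HSS": _ratio(2 * (a * d - b * c), (a + c) * (c + d) + (a + b) * (b + d)),
    }


# Made-up gauge observations and product estimates (mm/day).
obs = [0.0, 1.2, 0.0, 3.4, 0.0, 0.0, 5.0, 0.0]
pred = [0.0, 0.8, 0.5, 0.0, 0.0, 0.0, 4.0, 0.0]
scores = categorical_metrics(obs, pred)
print(scores)
```

With these definitions Precision equals 1 minus FAR, and an HSS of 1 indicates perfect detection while 0 indicates no skill beyond random chance.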

Summary

Abstract

Although many multi-source precipitation products (MSPs) with high spatio-temporal resolution have been extensively used in water cycle research, they are still subject to considerable uncertainties due to the spatial variability of terrain. The results are compared with those of traditional methods, including multiple linear regression (MLR), ML regression models, and gauge-based Kriging interpolation. The results show that: (1) The multi-source merged precipitation products (MSMPs) perform better than the original MSPs in detecting precipitation occurrence under different intensities, followed by Kriging. (4) The proposed method outperforms MLR and ML regression models, and the XGBoost algorithm is recommended for large-scale data merging owing to its high computational efficiency. This study provides a robust and reliable method to improve the performance of precipitation data with full consideration of multi-source information. This method could be applied globally and produce large-scale precipitation products if rain gauges are available.

Introduction

Environment variables

Data preprocessing

A two-step merging strategy

XGBoost

Performance evaluation and comparison

Performance assessment for classification results

Performance assessment for regression results

Variable importance of ML models

Comparison of prediction accuracy of various merging approaches

Conclusion

Findings

References


Similar Papers

Reply on RC1
Huajin Lei
06 Apr 2022

Comment on hess-2021-642
07 Mar 2022

A two-step merging strategy for incorporating multi-source precipitation products and gauge observations using machine learning classification and regression over China
Tianqi Ao ... Huajin Lei
Hydrology and Earth System Sciences | VOL. 26
15 Jun 2022

Reply on RC2
Huajin Lei
06 Apr 2022
