Accurate estimation of precipitation at local to global scales can considerably enhance our understanding of climate system dynamics. Although numerous precipitation products are available as indispensable tools for investigating precipitation and its associated processes, none consistently provides the lowest estimation error across environmental conditions. Multiple source precipitation ensemble (MSPE) methods are considered a vital solution to this problem. A new MSPE framework is proposed here that simultaneously uses machine learning (ML) classification and regression techniques within an automatic workflow (MSPEaml). Six precipitation products and their ensembles based on different MSPE strategies were evaluated at 2365 gauged and 800 randomly selected ungauged sites over China. Results revealed significant inconsistencies in precision among the products, primarily due to their different data sources and retrieval algorithms, whereas MSPEaml effectively reduced the random and classification errors of estimated precipitation, as measured by the Kling-Gupta efficiency and the Heidke skill score. The improvements highlight the distinctive features of MSPEaml, particularly the necessity of jointly using ML classifiers and regressors and of assigning spatiotemporally dynamic weights when merging precipitation data. Moreover, the generalizability of MSPEaml can be substantially improved through a simple binning procedure, making it applicable under more complex conditions. The varying contributions of predictor variables (indicated by Shapley values) in different ML models revealed the complexity of the MSPE problem and underscored the importance of designing ML models appropriate to specific targets. The proposed MSPE framework is expected to be a suitable solution for assembling multiple precipitation data sources with different time periods and scales.
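For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how they are commonly computed. It assumes the original (2009) formulation of the Kling-Gupta efficiency and the standard 2x2 contingency-table form of the Heidke skill score; the study may use a modified variant. The function names and the 0.1 mm rain/no-rain threshold are illustrative assumptions, not taken from the study.

```python
import numpy as np

def kling_gupta_efficiency(sim, obs):
    """KGE = 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2); 1 is a perfect match."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def heidke_skill_score(sim, obs, threshold=0.1):
    """HSS for rain/no-rain detection; threshold (mm) is an assumed value."""
    sim_wet = np.asarray(sim, dtype=float) >= threshold
    obs_wet = np.asarray(obs, dtype=float) >= threshold
    hits = int(np.sum(sim_wet & obs_wet))
    false_alarms = int(np.sum(sim_wet & ~obs_wet))
    misses = int(np.sum(~sim_wet & obs_wet))
    correct_neg = int(np.sum(~sim_wet & ~obs_wet))
    numerator = 2.0 * (hits * correct_neg - false_alarms * misses)
    denominator = ((hits + misses) * (misses + correct_neg)
                   + (hits + false_alarms) * (false_alarms + correct_neg))
    # Undefined when one category is entirely absent from both series.
    return numerator / denominator if denominator != 0 else float("nan")
```

The Kling-Gupta efficiency summarizes correlation, variability, and bias errors in the estimated amounts, while the Heidke skill score measures how well wet and dry occurrences are classified relative to chance, which is why the two are reported together.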