Precipitation data merging via machine learning: Revisiting conceptual and technical aspects

Panagiotis Kossieris, Ioannis Tsoukalas, Luca Brocca, Hamidreza Mosaffa, Christos Makropoulos, Anca Anghelea

doi:10.1016/j.jhydrol.2024.131424

Open Access

https://doi.org/10.1016/j.jhydrol.2024.131424

Journal: Journal of Hydrology
Publication Date: May 25, 2024
Citations: 3
License type: CC BY-NC

Abstract

Accurate precipitation products with high spatio-temporal coverage are crucial for a wide range of applications. In this context, precipitation data merging (PDM), i.e., the blending of satellite-based estimates with ground-based measurements, holds a prominent position, and machine learning (ML) algorithms are increasingly deployed for this purpose. In light of recent advances in the field, this work discusses key aspects of the PDM problem associated with: a) the conceptual formulation of the problem, which is closely related to the training of ML models and their predictive capacity; b) the selection of the products to be fused, which is associated with the latency of the final product and the operational applicability of the method; and c) the efficiency of single-step and two-step merging approaches, with the former treating the problem via regression algorithms only, and the latter via the combined use of classification and regression algorithms. By formulating PDM as a spatio-temporal prediction problem, we define and assess two different training strategies for the ML models, termed the full and the per time step strategy, which entail building a single ML model or several ML models, respectively. Furthermore, the performance of the full training strategy, which allows predictions in both the spatial and temporal dimensions, is assessed in the context of single-step and two-step merging. In each of the three scenarios, three popular tree-based ensemble ML algorithms are employed, i.e., random forest, gradient boosting and extreme gradient boosting, resulting in nine merged products.
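
The two training strategies and the two merging approaches described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, scikit-learn's random forest stands in for the three algorithms, the per time step strategy is assumed to mean one model per time step trained across cells, and all function names, covariates, sizes and the wet/dry threshold are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for a datacube (sizes and covariates are hypothetical):
# n_times daily steps, n_cells grid cells, 2 covariates per sample
# (think of a satellite estimate and a reanalysis estimate, in mm/day).
n_times, n_cells = 30, 40
truth = np.where(rng.random((n_times, n_cells)) < 0.6, 0.0,
                 rng.gamma(2.0, 4.0, (n_times, n_cells)))  # "gauge" rainfall
sat = np.clip(truth + rng.normal(0.0, 1.5, truth.shape), 0.0, None)
rea = np.clip(0.8 * truth + rng.normal(0.0, 2.0, truth.shape), 0.0, None)
X = np.stack([sat, rea], axis=-1)        # shape (n_times, n_cells, 2)

train_cells = np.arange(n_cells) < 30    # cells used for training
test_cells = ~train_cells                # held-out cells


def fit_full(X, y, cells):
    """Full strategy: a single model trained on all (time, cell) samples."""
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X[:, cells].reshape(-1, X.shape[-1]), y[:, cells].ravel())
    return model


def fit_per_time_step(X, y, cells):
    """Per time step strategy: one model per time step, trained across cells."""
    return [RandomForestRegressor(n_estimators=50, random_state=0)
            .fit(X[t, cells], y[t, cells]) for t in range(X.shape[0])]


def fit_two_step(X, y, cells, wet_threshold=0.0):
    """Two-step merging: classify wet/dry, then regress amounts on wet samples."""
    Xf, yf = X[:, cells].reshape(-1, X.shape[-1]), y[:, cells].ravel()
    wet = yf > wet_threshold
    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(Xf, wet)
    reg = RandomForestRegressor(n_estimators=50, random_state=0).fit(Xf[wet], yf[wet])
    return clf, reg


def predict_two_step(clf, reg, Xf):
    """Predicted amount is zero wherever the classifier predicts 'dry'."""
    is_wet = clf.predict(Xf).astype(bool)
    out = np.zeros(len(Xf))
    if is_wet.any():
        out[is_wet] = reg.predict(Xf[is_wet])
    return out


full = fit_full(X, truth, train_cells)
per_t = fit_per_time_step(X, truth, train_cells)
clf, reg = fit_two_step(X, truth, train_cells)

X_test = X[:, test_cells].reshape(-1, X.shape[-1])
pred_single = full.predict(X_test)                 # single-step, full strategy
pred_two = predict_two_step(clf, reg, X_test)      # two-step, full strategy
pred_per_t = np.concatenate(
    [m.predict(X[t, test_cells]) for t, m in enumerate(per_t)])
```

In this sketch the full-strategy model can be applied to any (time, cell) sample, including time steps outside the training period, whereas each per time step model only applies to the time step it was trained on. The two-step variant can return exact zeros because the classifier decides occurrence before the regressor estimates the amount.
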
To provide empirical evidence, we employ a datacube composed of ground-based daily precipitation observations, satellite-based and reanalysis estimates, as well as auxiliary covariates, from 1009 uniformly distributed cells (each representative of a sampling area of 25 × 25 km) over four countries around the world (Australia, USA, India and Italy). The large-scale experiment indicates that: (i) the full training strategy is a competitive alternative to the per time step strategy, since it yields methods with improved accuracy, in terms of both performance metrics and the reproduction of statistics, as well as higher predictive capability and operational applicability; (ii) two-step merging enables a much better reproduction of precipitation occurrence characteristics, as reflected in the improvement of the relevant categorical metrics and in the reproduction of probability and the autocorrelation coefficient; and (iii) no significant difference was noticed in the performance of the different ML algorithms.
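
The occurrence-related diagnostics mentioned in finding (ii) can be illustrated with a short, dependency-free sketch. The abstract does not name the specific metrics, so the choices here are assumptions: POD, FAR and CSI as common categorical scores, probability dry as the occurrence probability, and the lag-1 coefficient as the autocorrelation measure. The 0.1 mm wet threshold and the toy series are invented purely to exercise the functions and are not results from the paper.

```python
def contingency(obs, pred, threshold=0.1):
    """Count hits, misses and false alarms for wet (> threshold) events."""
    hits = misses = false_alarms = 0
    for o, p in zip(obs, pred):
        o_wet, p_wet = o > threshold, p > threshold
        if o_wet and p_wet:
            hits += 1
        elif o_wet:
            misses += 1
        elif p_wet:
            false_alarms += 1
    return hits, misses, false_alarms


def categorical_scores(obs, pred, threshold=0.1):
    """Probability of detection, false alarm ratio, critical success index."""
    h, m, f = contingency(obs, pred, threshold)
    nan = float("nan")
    return {
        "POD": h / (h + m) if h + m else nan,
        "FAR": f / (h + f) if h + f else nan,
        "CSI": h / (h + m + f) if h + m + f else nan,
    }


def probability_dry(series, threshold=0.1):
    """Fraction of time steps with precipitation at or below the threshold."""
    return sum(v <= threshold for v in series) / len(series)


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation coefficient (biased estimator, common mean)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return cov / var if var else float("nan")


# Invented toy series (mm/day): a regression-only product that drizzles on
# dry days, and a product that outputs exact zeros on most dry days.
obs = [0.0, 5.2, 0.0, 0.0, 12.1, 3.0, 0.0, 0.0]
drizzly = [0.4, 4.8, 0.6, 0.3, 10.0, 2.5, 0.5, 0.0]
zeroed = [0.0, 4.8, 0.0, 0.0, 10.0, 2.5, 0.5, 0.0]

for name, pred in (("drizzly", drizzly), ("zeroed", zeroed)):
    print(name, categorical_scores(obs, pred), probability_dry(pred))
```

Comparing `probability_dry(pred)` with `probability_dry(obs)` shows why a product that never outputs exact zeros can score well on detection yet still misrepresent how often it is dry.
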
