Estimating Rainfall with Multi-Resource Data over East Asia Based on Machine Learning

Yushan Zhang, Jianyin Zhou, Haixia Xiao, Feng Zhang, Liang Peng, Yi Song, Fuchang Wang, Jinglin Zhang, Kun Wu

doi:10.3390/rs13163332

Abstract

Inaccurate estimation of intense precipitation is a common limitation of precipitation retrieval. We therefore present a new rainfall retrieval technique based on the Random Forest (RF) algorithm, using infrared spectrum data from the Advanced Himawari Imager on Himawari-8 (Himawari-8/AHI) and forecast information from the NCEP operational Global Forecast System (GFS). Gauge-calibrated rainfall estimates from the Global Precipitation Measurement (GPM) product served as the ground truth for training. A two-step RF classification model was established for (1) rain area delineation and (2) precipitation grade estimation, with the aim of improving accuracy for moderate and heavy rain. Because the categories in the datasets are imbalanced, resampling with the Random Under-sampling algorithm and the Synthetic Minority Over-sampling Technique (SMOTE) was applied throughout training so that the model could learn the characteristics of all classes. Among the features used, meteorological variables generally contributed more to the trained models than infrared information; precipitable water contributed the most, indicating the importance of water vapor conditions in rainfall forecasting. The RF model results were compared with the GPM product pixel by pixel. To assess the generality of the model, we used independent validation sets that were not used for training and two independent testing sets from periods different from the training set. In addition, the algorithm was validated against independent rain gauge data and compared with GFS model rainfall. The RF model identified rainfall areas with a Probability Of Detection (POD) of around 0.77 and a False-Alarm Ratio (FAR) of around 0.23 for validation, and a POD of 0.60–0.70 and a FAR of around 0.30 for testing. For precipitation grade estimation, the classification accuracy was 0.70 in validation and 0.60 in testing, despite some overestimation. In summary, performance on the validation and test data indicated the adaptability and strength of the RF algorithm for rainfall retrieval over East Asia. Our study provides a meaningful range division and guidance for quantitative precipitation estimation.
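
The POD and FAR scores quoted above are conventionally derived from a rain/no-rain contingency table. The sketch below uses the standard definitions, POD = hits / (hits + misses) and FAR = false alarms / (hits + false alarms); the function names and the toy labels are illustrative assumptions, not code or data from the study.

```python
def contingency_counts(predicted, observed):
    """Count hits, misses, and false alarms for binary labels (1 = rain, 0 = no rain)."""
    hits = misses = false_alarms = 0
    for p, o in zip(predicted, observed):
        if p and o:
            hits += 1            # rain predicted and observed
        elif o and not p:
            misses += 1          # rain observed but not predicted
        elif p and not o:
            false_alarms += 1    # rain predicted but not observed
    return hits, misses, false_alarms


def pod(hits, misses):
    """Probability Of Detection: fraction of observed rain pixels that were detected."""
    total = hits + misses
    return hits / total if total else float("nan")


def far(hits, false_alarms):
    """False-Alarm Ratio: fraction of predicted rain pixels that were not observed."""
    total = hits + false_alarms
    return false_alarms / total if total else float("nan")


# Toy labels for illustration only; these are not values from the paper.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]

hits, misses, false_alarms = contingency_counts(predicted, observed)
pod_score = pod(hits, misses)
far_score = far(hits, false_alarms)
print(f"POD = {pod_score:.2f}, FAR = {far_score:.2f}")
```

A higher POD and a lower FAR both indicate better rain area delineation, so the two scores are usually read together.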

Highlights

Precipitation is one of the most important indicators that reflect global and regional climate system changes
We removed the clear-sky pixels from the dataset used in the first module; analogously, we developed a second module to estimate precipitation grades, which determines whether each pixel in the geostationary satellite remote sensing images is light rain, moderate rain, or heavy rain
This section summarizes the results of tuning and training the two-class precipitation model that identifies the rain area; the metrics are applied to the validation and testing datasets to assess the model's overall performance
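
The two-module structure described in these highlights can be sketched as a simple cascade: the first classifier separates rain from no-rain pixels, and only the rainy pixels are passed to the second classifier for grading. This is a minimal sketch, not the authors' implementation. The two `toy_*` functions are hypothetical stand-ins for the trained RF models, and the feature names and thresholds are invented purely for illustration.

```python
NO_RAIN = "no rain"


def two_step_classify(pixels, is_rain, rain_grade):
    """Apply a rain/no-rain classifier, then grade only the pixels flagged as rain."""
    labels = []
    for features in pixels:
        if is_rain(features):
            labels.append(rain_grade(features))  # step 2: light / moderate / heavy
        else:
            labels.append(NO_RAIN)               # step 1 rejected the pixel
    return labels


# Hypothetical stand-in for the first RF module (assumed threshold, not from the paper).
def toy_is_rain(features):
    return features["precipitable_water"] > 40.0


# Hypothetical stand-in for the second RF module (assumed thresholds, not from the paper).
def toy_rain_grade(features):
    bt = features["brightness_temp"]
    if bt < 210.0:
        return "heavy rain"
    if bt < 230.0:
        return "moderate rain"
    return "light rain"


# Toy pixels for illustration only.
pixels = [
    {"precipitable_water": 20.0, "brightness_temp": 280.0},
    {"precipitable_water": 50.0, "brightness_temp": 240.0},
    {"precipitable_water": 55.0, "brightness_temp": 220.0},
    {"precipitable_water": 60.0, "brightness_temp": 200.0},
]

labels = two_step_classify(pixels, toy_is_rain, toy_rain_grade)
print(labels)
```

Keeping the two steps as separate callables mirrors the modular design: either classifier can be retrained or swapped without changing the cascade itself.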

Summary

Introduction

Precipitation is one of the most important indicators that reflect global and regional climate system changes. Satellites can make comprehensive observations and provide intuitive remote sensing imagery, and satellite rainfall products show good prospects for improving the monitoring of gridded rainfall [11]. Satellite rainfall products are retrieved by microwave remote sensing, visible/infrared remote sensing, and multisensor rainfall estimation; microwave sensors are mainly carried by polar-orbiting satellites such as NOAA, METOP, and FY-3. Since the launch of the new generation of geostationary meteorological satellites (such as GOES, MSG, Himawari-8, and FY-4), visible/infrared remote sensing has been able to provide rainfall products with high spatiotemporal resolution. However, rainfall accuracy still needs to be improved because the relationship between the infrared signal and precipitation is indirect [12,13,14].

Methods

Results

Conclusion