Accurate estimation of dew point temperature (T<sub>dew</sub>) is important in water resource management, agricultural engineering, climatology and energy utilization. However, few studies have examined the applicability of local T<sub>dew</sub> algorithms at regional scales. This study evaluated the performance of a new machine learning algorithm, gradient boosting on decision trees with categorical feature support (CatBoost), in estimating daily T<sub>dew</sub> from limited local and cross-station meteorological data. The random forests (RF) algorithm was assessed for comparison. Daily meteorological data from 2016 to 2019 from three weather stations in Hunan Province, China, were used to evaluate the CatBoost and RF algorithms. The data comprised maximum, minimum and average temperature (T<sub>max</sub>, T<sub>min</sub> and T<sub>mean</sub>), maximum, minimum and average relative humidity (RH<sub>max</sub>, RH<sub>min</sub> and RH<sub>mean</sub>), and maximum, minimum and average global solar radiation (Rs<sub>max</sub>, Rs<sub>min</sub> and Rs<sub>mean</sub>). The results showed that both algorithms achieved satisfactory estimation accuracy at the target stations (on average RMSE = 1.020°C, R<sup>2</sup> = 0.969, MAE = 0.718°C and NRMSE = 0.087) even without complete meteorological parameters (with only temperature data as input). The CatBoost algorithm (on average RMSE = 1.900°C and R<sup>2</sup> = 0.835) outperformed the RF algorithm (on average RMSE = 2.214°C and R<sup>2</sup> = 0.828). The accuracy and stability of both algorithms increased with the number of input parameters, and the three-parameter algorithms achieved higher estimation accuracy than the two-parameter algorithms. The developed methodology can help predict T<sub>dew</sub> at the regional scale.
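The four accuracy statistics quoted above (RMSE, R<sup>2</sup>, MAE and NRMSE) can be illustrated with a minimal sketch. The abstract does not give their formulas, so the definitions below are assumptions: R<sup>2</sup> is taken as the coefficient of determination, and NRMSE as RMSE divided by the mean of the observed values. The function name `evaluate` and its inputs are hypothetical and are not taken from the study.

```python
import math


def evaluate(observed, predicted):
    """Return RMSE, MAE, R^2 and NRMSE for paired daily values.

    Assumed definitions (not confirmed by the abstract):
      R^2   = 1 - SS_res / SS_tot  (coefficient of determination)
      NRMSE = RMSE / mean(observed)
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)                  # root mean square error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    # Guard against constant observations or a zero mean.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    nrmse = rmse / abs(mean_obs) if mean_obs != 0 else float("nan")

    return {"rmse": rmse, "mae": mae, "r2": r2, "nrmse": nrmse}
```

Under these assumed definitions, RMSE and MAE carry the unit of T<sub>dew</sub> (°C), while R<sup>2</sup> and NRMSE are dimensionless. If the study normalizes RMSE by the observed range or defines R<sup>2</sup> as the squared correlation coefficient, the corresponding lines would need to change.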