Abstract

This study applies statistical methods to interpolate missing values in a data set of radiative energy fluxes at the Earth's surface. We apply Random Forest (RF) and seven conventional spatial interpolation models to a global Surface Solar Radiation (SSR) data set, using three categories of predictors: climatic, spatial, and time series variables. Although the first category is the most common in research, our study shows that the last two categories are best suited to predicting the response. In fact, the best spatial variable is almost 40 times more important than the best climatic variable in predicting SSR. Furthermore, 10-fold cross-validation shows that RF has a Mean Absolute Error (MAE) of 10.2 W m⁻² with a standard deviation of 1.5 W m⁻². By contrast, the conventional interpolation methods have an average MAE of 21.3 W m⁻², more than twice that of RF, and an average standard deviation of 6.4 W m⁻², more than four times that of RF. This highlights the benefits of using machine learning in environmental research.
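
To make the evaluation protocol concrete, the following is a minimal sketch of k-fold cross-validation that reports the mean and standard deviation of the per-fold MAE. It is not the authors' code. It assumes the reported standard deviation is taken across folds, it uses inverse distance weighting as a stand-in for a conventional spatial interpolator (the abstract does not name the seven methods), and it runs on synthetic points rather than the SSR data set. The names `idw_predict` and `kfold_mae` are hypothetical.

```python
import math
import random
import statistics


def idw_predict(train, query, power=2.0):
    """Inverse-distance-weighted estimate at `query` from (x, y, value) triples."""
    num = den = 0.0
    for x, y, value in train:
        dist = math.hypot(x - query[0], y - query[1])
        if dist == 0.0:
            return value  # query coincides with a training location
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


def kfold_mae(points, k=10, seed=0):
    """Return the list of per-fold MAE values for k-fold cross-validation."""
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k disjoint, near-equal folds
    maes = []
    for i, test in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors = [abs(idw_predict(train, (x, y)) - v) for x, y, v in test]
        maes.append(sum(errors) / len(errors))
    return maes


# Synthetic stand-in data (an assumption, NOT the SSR data set):
# a smooth field plus Gaussian noise at random locations.
rng = random.Random(42)
points = []
for _ in range(200):
    x, y = rng.uniform(0, 10), rng.uniform(0, 10)
    points.append((x, y, 150 + 10 * math.sin(x) + 5 * y + rng.gauss(0, 2)))

fold_maes = kfold_mae(points, k=10)
print(f"mean MAE = {statistics.mean(fold_maes):.2f}, "
      f"std = {statistics.stdev(fold_maes):.2f}")
```

The ratios quoted in the abstract follow directly from the reported figures: 21.3 / 10.2 ≈ 2.09 for the MAE and 6.4 / 1.5 ≈ 4.27 for the standard deviation.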

