Soil moisture (SM) plays a critical role in the growth and management of grain in semi-humid regions. However, little is known about how to integrate satellite data with machine learning to accurately retrieve SM information in these areas. This study compares the capability of three machine learning algorithms, Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN), to retrieve SM over the Northwest Shandong Plain using multi-phase dual-polarized Sentinel-1A satellite data. The backscattering coefficients were obtained through standard intensity and phase processing and used to calculate the SAR indices, and several characteristic parameters were extracted as impact factors using the Cloude-Pottier decomposition. The importance of these factors was analyzed, and the performance of each machine learning algorithm was evaluated using K-fold cross-validation. The best-performing model was then used to retrieve the spatio-temporal changes in SM in the study area. The findings indicate the following: (1) The first eigenvalue has the greatest impact on retrieval accuracy, followed by entropy, where the intensity component of Shannon's entropy is more important than its polarization component; (2) Adding more impact factors does not continuously improve model performance, and the optimal factor combination differs among the machine learning retrieval models; (3) The RF model trained using the IM12 combination performs better than SVM and ANN in retrieving SM, with a coefficient of determination (R²) of 0.55 and a root mean square error (RMSE) of 6.12 vol% on the validation set. SM in the Yellow River National Wetland Park is higher than in the surrounding areas and shows substantial seasonal changes. Precipitation, temperature, and vegetation significantly influence the regional variations in SM at the macroscopic level.
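The evaluation described above (K-fold cross-validation scored with R² and RMSE) can be illustrated with a minimal sketch. This is not the study's code: the function names are hypothetical, a one-variable least-squares line stands in for the RF, SVM, and ANN models, and the metrics are computed on pooled out-of-fold predictions, which is an assumption since the abstract does not say whether scores were pooled or averaged per fold.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target (e.g. vol%)."""
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq_err / len(y_true))


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """Ordinary least squares for y = a * x + b (stand-in model only)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def cross_validate(x, y, k=5, seed=0):
    """Return (R2, RMSE) computed on pooled out-of-fold predictions."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            y_true.append(y[i])
            y_pred.append(a * x[i] + b)
    return r2_score(y_true, y_pred), rmse(y_true, y_pred)
```

Swapping `fit_line` for any regressor that is fitted on the training folds and predicts on the held-out fold leaves the rest of the loop unchanged, which is what makes the same procedure usable for comparing several models on equal terms.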