Abstract
Understanding and predicting global soil moisture (SM) is crucial for water resource management and agricultural production. While deep learning (DL) methods have shown strong performance in SM prediction, imbalance among training samples with different characteristics poses a significant challenge. We hypothesize that improving the diversity and balance of the samples in each training batch during gradient descent can help address this issue. To test this hypothesis, we developed a Cluster-Averaged Sampling (CAS) strategy based on unsupervised learning. CAS trains the model on data sampled evenly from different clusters, ensuring both sample diversity and numerical consistency within each cluster. This prevents the model from overemphasizing specific sample characteristics and leads to more balanced feature learning. Experiments on the LandBench1.0 dataset, run with five different random seeds for 1-day lead-time global predictions, show that CAS outperforms several Long Short-Term Memory (LSTM)-based models that do not employ this strategy. The median coefficient of determination (R2) improved by 2.36 % to 4.31 %, while the median Kling-Gupta Efficiency (KGE) improved by 1.95 % to 3.16 %. In specific high-latitude regions, R2 improvements exceeded 40 %. To further validate CAS under realistic conditions, we tested it on Soil Moisture Active Passive Level 3 (SMAP-L3) satellite data for 1- to 3-day lead-time global predictions, which confirmed its efficacy. The study substantiates the CAS strategy and introduces a novel training method for enhancing the generalization of DL models.
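The core idea of drawing each training batch evenly from clusters can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes cluster labels have already been produced by some unsupervised method (the abstract does not name one), and the function name `cluster_balanced_batches`, the equal per-cluster quota, and the fallback to sampling with replacement for small clusters are all illustrative assumptions.

```python
import numpy as np

def cluster_balanced_batches(labels, batch_size, n_batches, seed=0):
    """Yield index arrays that draw equally from every cluster.

    `labels` holds one precomputed cluster id per training sample
    (assumed to come from an unsupervised clustering step).
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    per_cluster = batch_size // len(clusters)
    if per_cluster == 0:
        raise ValueError("batch_size must be at least the number of clusters")
    # Indices of the samples belonging to each cluster.
    members = [np.flatnonzero(labels == c) for c in clusters]
    for _ in range(n_batches):
        picks = [
            # Sample with replacement only when a cluster is too small.
            rng.choice(idx, size=per_cluster, replace=len(idx) < per_cluster)
            for idx in members
        ]
        batch = np.concatenate(picks)
        rng.shuffle(batch)  # avoid cluster-ordered samples within a batch
        yield batch

# Hypothetical, heavily imbalanced cluster assignment: 900 / 90 / 10 samples.
labels = np.repeat([0, 1, 2], [900, 90, 10])
batches = list(cluster_balanced_batches(labels, batch_size=30, n_batches=4))
counts = np.bincount(labels[batches[0]], minlength=3)
```

In this toy setup each batch contains the same number of samples from every cluster, even though the smallest cluster holds about 1 % of the data. Under uniform random sampling that cluster would rarely appear in a batch at all.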