The thermal band of a satellite platform enables measurement of land surface temperature (LST), which captures the spatiotemporal distribution of energy exchange between the Earth's surface and the atmosphere. LST plays a critical role in simulation models, enhancing our understanding of physical and biochemical processes in nature. However, limitations in swath width and orbit altitude prevent a single sensor from providing LST data with both high spatial and high temporal resolution. To tackle this challenge, unmixing-based spatiotemporal fusion models (STFMs) offer a promising solution by integrating data from multiple sensors. In these models, surface reflectance is decomposed from coarse pixels to fine pixels using a linear unmixing function combined with fractional coverage. However, when downscaling LST through an STFM, the linear mixing hypothesis fails to adequately represent the nonlinear energy mixing process of LST. Additionally, the original weighting function is sensitive to noise: small errors in the unmixing step lead to unreliable predictions of the final LST. To overcome these issues, we selected U-STFM as the baseline model and introduced an updated version, the nonlinear U-STFM. The new model incorporates two deep learning components, the Dynamic Net (DyNet) and the Change Ratio Net (RatioNet), which enable training on a small dataset while maintaining high generalization capability over time. MODIS Terra daytime LST products were downscaled from 1000 m to 30 m and compared against Landsat 7 LST products. Our results demonstrate that the new model surpasses STARFM, ESTARFM, and the original U-STFM in both prediction accuracy and noise robustness. With minor modifications, these two deep learning components can replace the linear unmixing and weighting functions in other STFMs.
As a deep learning-based model, it can be pretrained and deployed for online prediction.
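The linear unmixing step that the nonlinear U-STFM replaces can be sketched as a least-squares problem: each coarse pixel is modeled as the fractional-coverage-weighted sum of per-class (endmember) values, which are then assigned to fine pixels by class. The following is a minimal illustration on synthetic data, not the paper's implementation; all variable names and values are our own assumptions.

```python
import numpy as np

# Linear unmixing sketch: each coarse pixel's value is modeled as the
# fractional-coverage-weighted sum of per-class (endmember) values:
#   coarse[i] = sum_k frac[i, k] * endmember[k]

rng = np.random.default_rng(0)
n_coarse, n_classes = 50, 3

# Fractional coverage of each land-cover class within each coarse pixel
# (rows sum to 1); in practice derived from a fine-resolution classification.
frac = rng.dirichlet(np.ones(n_classes), size=n_coarse)

# Hypothetical "true" per-class LST values (kelvin) and noisy coarse observations.
true_endmembers = np.array([290.0, 300.0, 310.0])
coarse = frac @ true_endmembers + rng.normal(0.0, 0.1, n_coarse)

# Solve for per-class values by ordinary least squares; these values would
# then be mapped back to fine pixels according to class membership.
endmembers, *_ = np.linalg.lstsq(frac, coarse, rcond=None)
print(np.round(endmembers, 1))  # close to [290. 300. 310.]
```

The noise sensitivity noted above follows directly from this formulation: the recovered endmembers shift with perturbations in `coarse`, and the original weighting function propagates those shifts into the final fine-scale LST.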