Cloud image prediction is a spatio-temporal sequence prediction task similar to video prediction: a model learns from historical frames and uses the learned features to generate future images. In this process, capturing changes in both time and space is crucial. However, most existing models rely on stacked convolutional layers and therefore capture only local spatial features. Because cloud position and shape change in complex ways, the prediction module must be able to extract both global and local spatial features from cloud images. Moreover, for irregular cloud motion, the temporal prediction module should attend to the spatio-temporal features across input frames and extract temporal features with long-range dependencies, so that the network can learn cloud motion trends more accurately. To address these issues, we introduce a model called SAM-Net. Its self-attention module extracts intra-frame spatial features with both global and local dependencies, and a memory mechanism added to the self-attention module extracts inter-frame features with long-range temporal and spatial dependencies. Our method outperforms PredRNN-v2 on the publicly available MovingMNIST and KTH datasets, and it achieves the best performance on both 4-time-step and 10-time-step typhoon cloud image prediction. On a cloud dataset with 10 time steps, compared with PredRNN-v2, MSE decreases by 180.58, LPIPS decreases by 0.064, SSIM increases by 0.351, and PSNR improves significantly by 5.56 dB.
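The abstract does not specify SAM-Net's implementation, but the idea of combining per-frame self-attention with a running memory can be illustrated with a minimal sketch. Here, queries come from the current frame's spatial positions while keys and values come from both the frame and a memory tensor, so each position can attend globally within the frame and to accumulated inter-frame context. All names, shapes, and the exponential memory-update rule are assumptions for illustration, not the paper's actual design:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_self_attention(feat, memory, Wq, Wk, Wv):
    """feat: (H*W, C) current-frame features; memory: (H*W, C) running
    inter-frame memory. Keys/values are drawn from the concatenation of
    the frame and the memory, so attention mixes global intra-frame
    structure with long-range temporal context."""
    q = feat @ Wq
    kv_src = np.concatenate([feat, memory], axis=0)  # (2*H*W, C)
    k = kv_src @ Wk
    v = kv_src @ Wv
    # Scaled dot-product attention over all spatial + memory positions.
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]), axis=-1)
    out = attn @ v
    # Hypothetical exponential update keeps long-term dependencies.
    new_memory = 0.9 * memory + 0.1 * out
    return out, new_memory

rng = np.random.default_rng(0)
C, HW = 8, 16  # toy channel count and flattened spatial size
Wq, Wk, Wv = (rng.standard_normal((C, C)) * 0.1 for _ in range(3))
feat = rng.standard_normal((HW, C))
memory = np.zeros((HW, C))
out, memory = spatial_self_attention(feat, memory, Wq, Wk, Wv)
print(out.shape)  # (16, 8)
```

In a real spatio-temporal network this block would sit inside a recurrent prediction cell, with the memory carried across time steps instead of being reset per sequence.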