Abstract

Spatial-temporal sequence prediction is challenging for neural networks because it requires capturing temporal and spatial changes simultaneously. Recent research has focused on modifying the internal elements of ConvLSTM, which improves prediction ability to some extent but introduces a large number of parameters. We observe that although the result generated by the raw ConvLSTM is not very accurate in position and has a blurry appearance, it contains enough of the elements required to reconstruct the prediction. Our motivation is to further refine these feature maps. We therefore propose a multi-attention LSTM (MA-LSTM) based on dimensional decoupling attention to alleviate the problems of existing ConvLSTM-based methods, namely inaccurate position prediction and blurry generated results. The proposed model includes the Dimensionality Decoupling Module (D2M) and the Channel Attention Module (CAM). D2M transfers motion features over time: it squeezes the features along each dimension and calculates the motion distribution of that dimension. CAM preserves the predicted texture information by integrating latent information across channels and weighting the effective channels. Through the coordination of these modules, the model gains long-term predictive capability and the accuracy of its predictions improves. Experimental results show that, with fewer parameters, the proposed method obtains competitive results with less inference time than state-of-the-art (SOTA) methods.
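To make the two ideas named in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a feature map stored as nested lists of shape channels x height x width, and it replaces the learned layers of MA-LSTM with fixed averaging, softmax, and sigmoid operations. The function names (`squeeze_by_dimension`, `decoupled_attention`, `channel_attention`) are hypothetical and chosen only for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def squeeze_by_dimension(feat):
    """Squeeze each channel along one spatial dimension at a time.

    feat is a list of C channels, each an H x W nested list.
    Returns (rows, cols): rows[c] has H values (averaged over width),
    cols[c] has W values (averaged over height).
    """
    rows = [[sum(r) / len(r) for r in ch] for ch in feat]
    cols = [[sum(col) / len(col) for col in zip(*ch)] for ch in feat]
    return rows, cols


def decoupled_attention(feat):
    """Reweight every position by per-dimension distributions.

    Assumption: a softmax over each squeezed profile stands in for the
    learned "motion distribution of each dimension".
    """
    rows, cols = squeeze_by_dimension(feat)
    out = []
    for ch, r, c in zip(feat, rows, cols):
        a_h, a_w = softmax(r), softmax(c)
        out.append([[v * a_h[i] * a_w[j] for j, v in enumerate(row)]
                    for i, row in enumerate(ch)])
    return out


def channel_attention(feat):
    """Gate each channel by a sigmoid of its global average.

    Assumption: a real channel attention module would pass the pooled
    vector through learned layers before the sigmoid.
    """
    pooled = [sum(sum(r) for r in ch) / (len(ch) * len(ch[0])) for ch in feat]
    gates = [1.0 / (1.0 + math.exp(-p)) for p in pooled]
    return [[[v * g for v in row] for row in ch] for ch, g in zip(feat, gates)]


# Toy feature map: 2 channels, each 2 x 2.
feat = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 0.0]]]
refined = channel_attention(decoupled_attention(feat))
```

The point of the sketch is the cost argument: squeezing an H x W map into one profile of length H and one of length W means attention is computed over H + W values per channel instead of H x W, which is one plausible reading of how a decoupled design keeps the parameter count and inference time low.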
