Abstract

This article proposes a video prediction network called STMP-Net that addresses the problem of the inability of Recurrent Neural Networks (RNNs) to fully extract spatiotemporal information and motion change features during video prediction. STMP-Net combines spatiotemporal memory and motion perception to make more accurate predictions. Firstly, a spatiotemporal attention fusion unit (STAFU) is proposed as the basic module of the prediction network, which learns and transfers spatiotemporal features in both horizontal and vertical directions based on spatiotemporal feature information and contextual attention mechanism. Additionally, a contextual attention mechanism is introduced in the hidden state to focus attention on more important details and improve the capture of detailed features, thus greatly reducing the computational load of the network. Secondly, a motion gradient highway unit (MGHU) is proposed by combining motion perception modules and adding them between adjacent layers, which can adaptively learn the important information of input features and fuse motion change features to significantly improve the predictive performance of the model. Finally, a high-speed channel is provided between layers to quickly transmit important features and alleviate the gradient vanishing problem caused by back-propagation. The experimental results show that compared with mainstream video prediction networks, the proposed method can achieve better prediction results in long-term video prediction, especially in motion scenes.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.