Abstract

With the continuous development of deep learning, video frame prediction has become a hotspot in computer vision due to its wide range of applications in anomaly detection, robot decision-making, weather forecasting, and autonomous driving. Although current video frame prediction methods have made remarkable progress, the majority of them directly generate predicted frames by extracting latent spatial distribution patterns from the video data. They lack spatiotemporal information modeling, which leads to high latency and to ambiguous, unrealistic results. In this work, we propose an end-to-end video prediction network (Generative Differential-Assisted Discriminative Network, abbreviated as GDDNet). It combines the advantages of a difference generation method, which extracts short-term variations between frames, with attention mechanisms that recall global contextual motion information. Furthermore, a differential attention mechanism (DAM) module guides the model to allocate attention resources more efficiently. These strategies considerably improve the model's ability to represent motion features in video frames. To further optimize the prediction results, we introduce adversarial training to enhance the clarity and realism of the predicted frames. To ensure the consistency of the spatiotemporal distribution between predicted and real frames, we introduce a sequential frame discriminator. Experimental results on the KITTI, UCF-101, and Caltech pedestrian datasets demonstrate the effectiveness of GDDNet in comparison with state-of-the-art models. Multi-frame prediction and ablation experiments show that our proposed model not only improves prediction quality but also provides a more flexible prediction framework.
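The abstract does not spell out the architecture, but the core idea of differential attention, using frame-to-frame differences to focus attention on regions with short-term motion, can be illustrated with a minimal, hypothetical PyTorch sketch. The module name, layer sizes, and residual gating below are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class DifferentialAttention(nn.Module):
    """Hypothetical sketch: frame differences highlight short-term motion
    and are used to re-weight spatial attention over the current frame's
    features (an assumed simplification of the DAM idea)."""

    def __init__(self, channels: int):
        super().__init__()
        # Project the frame-difference features to a single-channel attention map.
        self.diff_proj = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
        )

    def forward(self, feat_t: torch.Tensor, feat_prev: torch.Tensor) -> torch.Tensor:
        # Short-term variation between consecutive frame features.
        diff = feat_t - feat_prev
        # Spatial attention map in [0, 1] derived from the difference.
        attn = torch.sigmoid(self.diff_proj(diff))
        # Emphasize moving regions while keeping a residual path.
        return feat_t * attn + feat_t


if __name__ == "__main__":
    dam = DifferentialAttention(channels=64)
    f_t = torch.randn(2, 64, 32, 32)     # features of the current frame
    f_prev = torch.randn(2, 64, 32, 32)  # features of the previous frame
    out = dam(f_t, f_prev)
    print(out.shape)  # torch.Size([2, 64, 32, 32])
```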

