Abstract

Video frame prediction is both challenging and critical for computer vision. Although research on video frame prediction has gradually shifted from pixel-level methods to motion-based ones, existing predictors often generate ambiguous future frames, especially for long-term predictions. This paper proposes a composite model that generates future frames with more detail. First, to further exploit motion information, we design a single motion decoder that improves the efficiency of the motion encoder in the original motion-content network (MCnet). Second, to alleviate prediction ambiguity, we use edges both with and without semantic meaning from the holistically-nested edge detection (HED) module as content details. Third, based on the conclusion that the mean squared error (MSE) loss and the traditional generative adversarial learning framework cause MCnet's unsatisfactory predictions, we design a composite loss function that guides our model to focus simultaneously on motions and content details. Finally, based on the same conclusion, we embed our model in an improved generative adversarial network, which further enhances its performance. Experimental results on the benchmark KTH and UCF101 datasets show that our model outperforms state-of-the-art predictors, such as the basic MCnet, the predictive neural network (PredNet), and PredNet with a reduced-gate convolutional network (rgc-PredNet), in terms of peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), especially for long-term video frame prediction.
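The abstract does not specify how the composite loss is formed, but a loss that balances pixel reconstruction, edge-based content detail, and an adversarial term could look like the following minimal PyTorch-style sketch. The weight names (lambda_mse, lambda_edge, lambda_adv), the edge-map inputs, and the choice of an L1 penalty on edges are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def composite_loss(pred_frames, true_frames, pred_edges, true_edges,
                   disc_logits_fake, lambda_mse=1.0, lambda_edge=0.5,
                   lambda_adv=0.05):
    """Hypothetical composite loss: pixel reconstruction + edge (content-detail)
    term + generator-side adversarial term. Weights are placeholders."""
    # Pixel-level reconstruction (the MSE term mentioned in the abstract)
    l_mse = F.mse_loss(pred_frames, true_frames)
    # Content-detail term on HED-style edge maps (assumed L1 penalty)
    l_edge = F.l1_loss(pred_edges, true_edges)
    # Adversarial term: push discriminator logits for generated frames toward "real"
    l_adv = F.binary_cross_entropy_with_logits(
        disc_logits_fake, torch.ones_like(disc_logits_fake))
    return lambda_mse * l_mse + lambda_edge * l_edge + lambda_adv * l_adv
```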

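For reference, the reported PSNR and SSIM metrics can be computed per predicted frame as in the sketch below (using scikit-image; the function name and the assumption of 8-bit color frames are illustrative, not from the paper's code).

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_prediction(pred, target):
    """Per-frame PSNR/SSIM for uint8 frames, as commonly reported on
    frame-prediction benchmarks such as KTH and UCF101."""
    psnr = peak_signal_noise_ratio(target, pred, data_range=255)
    # channel_axis=-1 for HxWxC color frames; omit it for grayscale inputs
    ssim = structural_similarity(target, pred, data_range=255, channel_axis=-1)
    return psnr, ssim
```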