Abstract

A novel optical flow prediction model, built on an adaptable deep neural network architecture, is presented for blind and non-blind error concealment of videos degraded by transmission loss. The two-stream network is trained by separating the horizontal and vertical motion fields, which are passed through two structurally similar parallel pipelines comprising traditional convolutional (Conv) and convolutional long short-term memory (ConvLSTM) layers. The ConvLSTM layers extract temporally correlated motion information, while the Conv layers correlate motion spatially. The optical flows used as input to the two-pipeline prediction network are obtained from a flow generation network that can be easily interchanged, increasing the adaptability of the overall end-to-end architecture. The performance of the proposed model is evaluated under real-world packet-loss scenarios. Standard video quality metrics are used to compare frames reconstructed using predicted optical flows with frames reconstructed using "ground-truth" flows obtained directly from the generator.
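
To make the two-stream design concrete, the following is a minimal PyTorch sketch of the architecture as the abstract describes it: each motion component (horizontal u, vertical v) is routed through its own pipeline of Conv layers (spatial correlation) followed by a ConvLSTM layer (temporal correlation). All class names (TwoStreamFlowPredictor, FlowStream, ConvLSTMCell), layer counts, channel widths, and the single-cell recurrence are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: LSTM gates computed with 2-D convolutions,
    so the hidden state keeps its spatial layout."""
    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        self.hid_ch = hid_ch
        # One convolution emits all four gates: input, forget, cell, output.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, h, c):
        i, f, g, o = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c


class FlowStream(nn.Module):
    """One pipeline: Conv layers correlate motion spatially, a ConvLSTM
    layer correlates it temporally across the input flow sequence."""
    def __init__(self, hid_ch: int = 32):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(1, hid_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hid_ch, hid_ch, 3, padding=1), nn.ReLU(),
        )
        self.temporal = ConvLSTMCell(hid_ch, hid_ch)
        self.head = nn.Conv2d(hid_ch, 1, 3, padding=1)

    def forward(self, flows):
        # flows: (B, T, 1, H, W) -- one motion component over T past frames.
        B, T, _, H, W = flows.shape
        h = flows.new_zeros(B, self.temporal.hid_ch, H, W)
        c = torch.zeros_like(h)
        for t in range(T):
            h, c = self.temporal(self.spatial(flows[:, t]), h, c)
        return self.head(h)  # (B, 1, H, W): predicted flow component


class TwoStreamFlowPredictor(nn.Module):
    """Two structurally identical pipelines, one per motion component."""
    def __init__(self):
        super().__init__()
        self.u_stream = FlowStream()  # horizontal motion field
        self.v_stream = FlowStream()  # vertical motion field

    def forward(self, flow_seq):
        # flow_seq: (B, T, 2, H, W) -- flows produced by an interchangeable
        # flow generation network, as described in the abstract.
        u = self.u_stream(flow_seq[:, :, 0:1])
        v = self.v_stream(flow_seq[:, :, 1:2])
        return torch.cat([u, v], dim=1)  # (B, 2, H, W) predicted flow


# Example: predict the flow of a lost frame from 4 preceding flow fields.
pred = TwoStreamFlowPredictor()(torch.randn(1, 4, 2, 64, 64))
print(pred.shape)  # torch.Size([1, 2, 64, 64])
```

In the error-concealment setting the abstract describes, the predicted flow would then be used to reconstruct the lost frame, for example by warping the last correctly received frame.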
