Abstract

The distributed video coding (DVC) paradigm is based on two well-known information theory results: the Slepian-Wolf and Wyner-Ziv theorems. In a DVC codec, the video signal correlation is mostly exploited at the decoder, providing a flexible distribution of the computational complexity between the encoder and the decoder as well as robustness to channel errors. To exploit the temporal correlation, an estimate of the original frame to code, known as side information, is typically created at the decoder. One popular approach to side information creation is to perform frame interpolation using a translational motion model derived from already decoded frames. However, this translational model fails to estimate complex camera motions, such as zooms and rotations, and is not accurate enough to estimate the true trajectories of scene objects. In this paper, a new side information creation framework integrating perspective transform motion modeling is proposed. This solution is able to better locally track the trajectories and deformations of each object and increase the accuracy of the overall side information estimation process. Experimental results show peak signal-to-noise ratio gains of up to 1 dB in side information quality and up to 0.5 dB in rate-distortion performance for some video sequences, compared with state-of-the-art alternative solutions.

Highlights

  • Nowadays, image, video, and audio digital coding technologies are widely used by a significant share of the world's population

  • Another possible use is frame rate up-conversion, notably when the encoder drops some frames to save bitrate and meet the constraints of a bandwidth-limited channel; in this case, the decoder can still obtain a reliable estimate of the lost frame, minimizing the error propagation in a predictive group of pictures (GOP) structure

  • This paper proposes a novel side information creation framework that exploits a perspective transform motion model to more accurately represent the temporal correlation between the video frames, obtaining better SI quality when compared with the typical translational motion model SI creation solutions


Summary

Introduction

Image, video, and audio digital coding technologies are widely used by a significant share of the world's population.

Perspective transform initialization: in this step, the perspective transform estimation is initialized by assigning to the four corners of each block in the forward reference frame the (same) motion vector calculated with the translational backward motion estimation algorithm, obtaining the initial perspective transform vectors. In this deformation, the warped quadrilateral corresponds to a square block (see Figure 4a) in the same position and with the same size as the displaced block calculated by (translational) motion estimation.

The major difference is that this module receives as input a perspective transform for each SI block, and no longer translational motion vectors (as in the forward and backward perspective transform estimation modules), and refines the initial transforms to obtain better SI quality. This module is able to correct some of the errors and inaccuracies made by the algorithm presented in the previous section, which selects and creates perspective transforms for the SI blocks based on the perspective transforms obtained between the reference frames.

The motion modeling mode is selected per SI block as

  φ_si = P, if MAD_P < MAD_T − α; φ_si = T, if MAD_P ≥ MAD_T − α   (15)

where α is the penalty offset, φ_si represents the selected motion modeling mode for each SI block, and T and P represent the translational and perspective motion modeling modes, respectively.
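The two steps above can be sketched in a few lines: the corner initialization simply replicates the translational motion vector at the four block corners, and the mode selection applies the MAD comparison of Eq. (15) with the penalty offset α. This is a minimal illustrative sketch, not the paper's implementation; the function names, block representation, and the specific values used are assumptions.

```python
import numpy as np

def init_perspective_from_mv(block_x, block_y, block_size, mv):
    """Perspective transform initialization (illustrative): assign the same
    translational motion vector to all four corners of the block, so the
    initial warped quadrilateral equals the displaced square block."""
    corners = np.array([[block_x,              block_y],
                        [block_x + block_size, block_y],
                        [block_x + block_size, block_y + block_size],
                        [block_x,              block_y + block_size]], dtype=float)
    return corners + np.asarray(mv, dtype=float)  # initial perspective transform vectors

def select_motion_mode(mad_p, mad_t, alpha):
    """Mode decision of Eq. (15): choose the perspective mode (P) only when its
    MAD beats the translational MAD (T) by more than the penalty offset alpha."""
    return 'P' if mad_p < mad_t - alpha else 'T'

# Hypothetical usage: an 8x8 block at (0, 0) with motion vector (2, 1)
quad = init_perspective_from_mv(0, 0, 8, (2, 1))
mode = select_motion_mode(mad_p=10.0, mad_t=12.0, alpha=1.0)  # perspective wins here
```

The penalty offset α biases the decision toward the simpler translational mode, so the perspective transform is only used when it yields a clearly lower matching error.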

Motion compensation and warping
Findings
Conclusions