Abstract

Efficient encoding of motion residuals is essential for low-delay video applications in which videos are encoded with a hybrid structure of motion compensation and residual coding. Current standard video coding systems combine motion compensation with the discrete cosine transform (DCT), where the number of bases needed to encode a residual block equals the number of pixels in the block. A frame is predicted from the previous reconstructed frame, and the resulting residual image is then encoded by a non-redundant transform, such as the DCT or an approximation of the DCT using integer coefficients. As an alternative to non-redundant transforms, a frame-based technique called matching pursuit (MP) has been proposed to encode motion residual images. Mallat and Zhang (Mallat & Zhang, 1993) were the first to propose a matching pursuit algorithm, which decomposes a signal into a linear combination of bases selected from an overcomplete dictionary. Vetterli and Kalker recast hybrid motion compensation and DCT video coding as a matching pursuit technique, encoding frames with the matching pursuit algorithm and a dictionary composed of motion blocks and DCT bases (Vetterli & Kalker, 1994). Neff and Zakhor (Neff & Zakhor, 1997) show that using a matching pursuit algorithm to encode motion residual images achieves better performance than a DCT-based algorithm in terms of PSNR and perceptual quality at very low bit rates. The results in (Lin et al., 2005) also demonstrate that the matching pursuit FGS coding scheme performs better than MPEG-4 FGS at very low bit rates. Unlike a transform-based decoder, a matching pursuit decoder does not require an inverse transform and is therefore less complex.
In a transform-based decoder, loop filtering and post-processing are usually applied at very low bit rates to remove blocking and ringing artifacts, whereas a matching pursuit decoder can achieve comparable quality without such filtering and processing (Neff et al., 1998). Because matching pursuit is a data-dependent, frame-based representation, a matching pursuit video coding technique cannot be translated directly from conventional transform-based approaches. New matching pursuit video coding techniques must therefore be developed to handle quantization noise in the matching pursuit algorithm (Neff & Zakhor, Sept. 2000; Vleeschouwer & Zakhor, 2002), multiple description coding for reliable transmission (Tang & Zakhor, 2002), scalable bitstream generation (Al-Shaykh et al., 1999; Vleeschouwer & Macq, 2000; Rose & Regunathan, 2001), and dictionary learning and adaptation (Engan et al., 2000).
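
To make the decomposition described above concrete, the following is a minimal sketch of the greedy matching pursuit iteration in the sense of Mallat and Zhang: at each step the atom with the largest inner product with the current residual is selected and its contribution is subtracted. The function names (`matching_pursuit`, `normalize`), the six-atom toy dictionary, and the four-sample signal are illustrative assumptions and are not taken from any of the cited coders; quantization and entropy coding of the selected atoms are omitted.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy matching pursuit over unit-norm atoms (illustrative helper).

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    """
    residual = list(signal)
    chosen = []
    for _ in range(n_atoms):
        # Inner product of the current residual with every atom.
        products = [
            sum(r * a for r, a in zip(residual, atom)) for atom in dictionary
        ]
        # Select the atom that captures the most residual energy.
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        coeff = products[best]
        chosen.append((best, coeff))
        # Subtract the selected atom's contribution from the residual.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return chosen, residual


# Toy overcomplete dictionary (assumed): 6 atoms for a 4-sample signal.
dictionary = [
    normalize(v)
    for v in [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
    ]
]

signal = [3.0, 3.0, 0.0, 1.0]
atoms, residual = matching_pursuit(signal, dictionary, n_atoms=2)
```

In this toy case two atoms suffice to represent the signal, whereas the four-atom standard basis contained in the same dictionary would need three coefficients. A decoder only has to sum the selected atoms weighted by their coefficients, which is why no inverse transform is involved.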
