Abstract
Alternative fully DCT-based video codec architectures have been proposed in the past to address the shortcomings of the conventional hybrid motion-compensated DCT video codec structures traditionally chosen as the basis of implementation of standard-compliant codecs. However, no prior effort has been made to ensure interoperability of these two drastically different architectures so that fully DCT-based video codecs are fully compatible with the existing video coding standards. In this paper, we establish the criteria for matching conventional codecs with fully DCT-based codecs. We find that the key to this interoperability lies at the heart of how the motion compensation modules are implemented in the spatial and transform domains at both the encoder and the decoder. Specifically, if the spatial-domain motion compensation is compatible with the transform-domain motion compensation, then the states of the coder and the decoder track each other even after a long series of P-frames. Otherwise, the states diverge in proportion to the number of P-frames between two I-frames. This sets an important criterion for the development of any DCT-based motion compensation scheme. We also discuss and develop several DCT-based motion compensation schemes as important building blocks of fully DCT-based codecs. For subpixel motion compensation, DCT-based approaches allow more accurate interpolation without any increase in computation. Furthermore, the sparse set of DCT coefficients that remains after quantization significantly decreases the number of calculations required for motion compensation. Coupled with DCT-based motion estimation algorithms, these schemes make it possible to realize fully DCT-based codecs that overcome the disadvantages of conventional hybrid codecs.
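To make the divergence criterion concrete, the toy simulation below (not taken from the paper; all function names are hypothetical and the prediction residual is taken as zero for simplicity) lets the encoder and the decoder start from the same I-frame and apply half-pel motion compensation over a run of P-frames. When both ends use the same interpolation, the reconstruction states stay identical; when the decoder uses a different interpolation filter, standing in for a mismatched transform-domain implementation, the gap between the two reference frames grows with every additional P-frame.

```python
# Toy illustration (not from the paper) of the state-divergence criterion:
# encoder and decoder start from the same I-frame, and every P-frame is
# motion compensated with a half-pel shift and (for simplicity) a zero
# residual, so any difference comes purely from mismatched interpolation.
import numpy as np

rng = np.random.default_rng(0)

def half_pel_bilinear(frame):
    """Half-pel horizontal shift by averaging each pixel with its right
    neighbour (bilinear interpolation, wrap-around at the border)."""
    return 0.5 * (frame + np.roll(frame, -1, axis=1))

def half_pel_sixtap(frame):
    """A different half-pel interpolator (6-tap filter), standing in for a
    mismatched implementation, e.g. a DCT-based one, at the other end."""
    taps = np.array([1, -5, 20, 20, -5, 1]) / 32.0
    out = np.zeros_like(frame)
    for i, t in enumerate(taps):
        out += t * np.roll(frame, 2 - i, axis=1)
    return out

def drift_after_p_frames(num_p_frames, decoder_interp):
    encoder_ref = rng.standard_normal((16, 16))
    decoder_ref = encoder_ref.copy()
    drift = []
    for _ in range(num_p_frames):
        encoder_ref = half_pel_bilinear(encoder_ref)  # encoder's MC
        decoder_ref = decoder_interp(decoder_ref)     # decoder's MC
        drift.append(float(np.abs(encoder_ref - decoder_ref).max()))
    return drift

print("matched MC:   ", drift_after_p_frames(8, half_pel_bilinear))
print("mismatched MC:", drift_after_p_frames(8, half_pel_sixtap))
```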
Highlights
In most international video coding standards, such as CCITT H.261 [1], MPEG-1 [2], and MPEG-2 [3], as well as the proposed HDTV standard, the Discrete Cosine Transform (DCT) and block-based motion estimation are the essential elements for achieving spatial and temporal compression, respectively.
The feedback loop in the coder for temporal prediction consists of a DCT, an Inverse DCT (IDCT), a spatial-domain motion compensator (SD-MC), and a spatial-domain motion estimator (SD-ME), which is usually realized by the full-search block matching approach (BKM); a rough sketch of this loop follows these highlights.
The presence of the IDCT block inside the feedback loop of the conventional video coder design comes from the fact that currently available motion estimation algorithms can only estimate motion in the spatial domain rather than directly in the DCT domain.
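As a rough illustration of the loop just described, here is a hedged Python sketch, assuming 8x8 blocks, a flat quantization step, and float-valued frames; none of it is taken from the paper. The forward path applies the DCT and a quantizer to the motion-compensated residual, while the feedback path dequantizes, applies the IDCT, and adds back the spatial-domain prediction, so the encoder keeps the same reconstructed reference the decoder will have.

```python
# Hedged structural sketch (not the paper's code) of the conventional hybrid
# encoder loop: DCT and quantization of the motion-compensated residual in
# the forward path, with dequantization, IDCT and spatial-domain motion
# compensation (SD-MC) inside the feedback loop, so the encoder keeps the
# same reconstructed reference frame the decoder will have.
import numpy as np

N = 8
# Orthonormal 8x8 DCT-II matrix: dct2(X) = T @ X @ T.T, idct2(Y) = T.T @ Y @ T
k = np.arange(N)[:, None]
n = np.arange(N)[None, :]
T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
T[0, :] = np.sqrt(1.0 / N)

def block_match(cur_blk, ref, top, left, search=4):
    """Full-search block matching (SD-ME) around the co-located position."""
    best, best_sad = (0, 0), np.inf
    H, W = ref.shape
    for dv in range(-search, search + 1):
        for dh in range(-search, search + 1):
            r, c = top + dv, left + dh
            if 0 <= r <= H - N and 0 <= c <= W - N:
                sad = np.abs(cur_blk - ref[r:r + N, c:c + N]).sum()
                if sad < best_sad:
                    best_sad, best = sad, (dv, dh)
    return best

def encode_p_frame(cur, ref, qstep=16.0):
    """Push one P-frame through the loop; the returned reconstruction
    becomes the reference frame for coding the next incoming frame."""
    recon = np.zeros_like(cur, dtype=float)
    H, W = cur.shape
    for top in range(0, H, N):
        for left in range(0, W, N):
            blk = cur[top:top + N, left:left + N]
            dv, dh = block_match(blk, ref, top, left)        # SD-ME (BKM)
            pred = ref[top + dv:top + dv + N,
                       left + dh:left + dh + N]              # SD-MC
            coeff = T @ (blk - pred) @ T.T                   # DCT of residual
            q = np.round(coeff / qstep)                      # quantized levels (transmitted)
            rec_res = T.T @ (q * qstep) @ T                  # IDCT inside the loop
            recon[top:top + N, left:left + N] = pred + rec_res
    return recon
```

A decoder would repeat only the dequantize, IDCT, and SD-MC steps of `encode_p_frame`, which is why the IDCT also has to sit inside the encoder's feedback loop.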
Summary
In most international video coding standards, such as CCITT H.261 [1], MPEG-1 [2], and MPEG-2 [3], as well as the proposed HDTV standard, the Discrete Cosine Transform (DCT) and block-based motion estimation are the essential elements for achieving spatial and temporal compression, respectively. In the conventional decoder, motion compensation is carried out in the spatial domain after the IDCT block converts the compressed bit stream back into reconstructed pixels. This means that such a decoder must handle every image pixel in real time, even when the sequence has been encoded at a very high compression rate. The encoder also faces a higher throughput requirement: its feedback loop must run at the frame rate so that the previous frame can be reconstructed, stored in the frame memory, and made available for coding the incoming frame. This loop has four components plus the spatial-domain motion estimation and compensation unit, and it creates the bottleneck for encoding large frame sizes in real time.
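This bottleneck is what motivates performing motion compensation directly on DCT coefficients. The sketch below illustrates one common matrix formulation of transform-domain motion compensation under stated assumptions (8x8 blocks, integer-pel displacement); it is not necessarily the exact scheme developed in the paper. Because the orthonormal 2-D DCT distributes over matrix products (dct2(AB) = dct2(A) dct2(B)), a displaced block can be assembled from the stored DCT coefficients of the four reference blocks it overlaps, using precomputed DCT-domain windowing matrices.

```python
# Hedged sketch of DCT-domain motion compensation via precomputed windowing
# matrices; illustrative only, not necessarily the paper's scheme.
import numpy as np

N = 8
k = np.arange(N)[:, None]
n = np.arange(N)[None, :]
T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
T[0, :] = np.sqrt(1.0 / N)                      # orthonormal DCT-II matrix
dct2 = lambda X: T @ X @ T.T
idct2 = lambda Y: T.T @ Y @ T

def window_matrices(v, h):
    """0/1 windowing matrices selecting rows from the upper (A_u) / lower
    (A_l) reference blocks and columns from the left (C_l) / right (C_r)."""
    A_u = np.zeros((N, N))
    A_u[:N - v, v:] = np.eye(N - v)
    A_l = np.zeros((N, N))
    A_l[N - v:, :v] = np.eye(v)
    C_l = np.zeros((N, N))
    C_l[h:, :N - h] = np.eye(N - h)
    C_r = np.zeros((N, N))
    C_r[:h, N - h:] = np.eye(h)
    return A_u, A_l, C_l, C_r

rng = np.random.default_rng(1)
ref = rng.standard_normal((2 * N, 2 * N))       # four adjacent reference blocks
v, h = 3, 5                                     # integer-pel displacement

# Spatial-domain extraction of the displaced block (for comparison only).
spatial_pred = ref[v:v + N, h:h + N]

# DCT-domain assembly from the four stored coefficient blocks B1..B4.
B = [dct2(ref[r:r + N, c:c + N]) for r in (0, N) for c in (0, N)]
A_u, A_l, C_l, C_r = window_matrices(v, h)
pred_dct = (dct2(A_u) @ B[0] @ dct2(C_l) + dct2(A_u) @ B[1] @ dct2(C_r) +
            dct2(A_l) @ B[2] @ dct2(C_l) + dct2(A_l) @ B[3] @ dct2(C_r))

print(np.allclose(idct2(pred_dct), spatial_pred))   # True: both paths agree
```

The DCT-domain windowing matrices dct2(A_u), dct2(C_l), and so on depend only on the displacement (v, h) and can be precomputed; when the stored coefficient blocks are sparse after quantization, the four matrix products involve far fewer operations, which is the computational saving the abstract points to.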