Abstract

This paper presents a highly flexible video coding scheme based on a redundant dictionary of spatio-temporal three-dimensional (3-D) functions. Directionality and anisotropic scaling are key ingredients of the spatial components, which form a rich collection of two-dimensional (2-D) visual primitives. The temporal component is tuned to capture most of the energy in the temporal signal evolution along motion trajectories in the video sequence. The video coding scheme (MP3D) first computes motion trajectories, which are eventually entropy-coded and sent as side information to the decoder. It then applies a spatio-temporal decomposition along the motion trajectories, using an adaptive approximation algorithm based on matching pursuit (MP). Quantized coefficients and basis function parameters are entropy-coded in an embedded stream that is constructed to respect multiple rate constraints. The geometric properties of the 2-D primitive dictionary allow for flexible spatial resolution adaptation, so that the MP3D stream can be decoded at different spatio-temporal resolutions and at multiple rates. The MP3D scheme is shown to provide rate-distortion performance comparable to that of state-of-the-art schemes, such as H.264 and MPEG-4, at low and medium bit rates. However, the use of a redundant dictionary is penalizing at high coding rates, which makes the MP3D algorithm mostly interesting for low-rate applications, or as a flexible base layer in hierarchical coding schemes.
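
As an illustration of the greedy approximation step that matching pursuit performs, the following is a minimal sketch on a one-dimensional toy signal. It is not the MP3D coder: the function names (`matching_pursuit`, `reconstruct`), the three-atom dictionary, and the fixed iteration count are assumptions made for this example only, and motion trajectories, 3-D atoms, quantization, and entropy coding are all omitted.

```python
import math

def matching_pursuit(signal, dictionary, n_iter):
    """Generic greedy MP over unit-norm atoms (illustrative, not MP3D itself).

    Returns the list of (atom index, coefficient) picks and the final residual.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        best_idx, best_coef = 0, 0.0
        # Find the atom with the largest absolute inner product with the residual.
        for idx, atom in enumerate(dictionary):
            coef = sum(r * a for r, a in zip(residual, atom))
            if abs(coef) > abs(best_coef):
                best_idx, best_coef = idx, coef
        if best_coef == 0.0:
            break  # residual is orthogonal to every atom
        picks.append((best_idx, best_coef))
        # Subtract the projection onto the selected atom.
        atom = dictionary[best_idx]
        residual = [r - best_coef * a for r, a in zip(residual, atom)]
    return picks, residual

def reconstruct(picks, dictionary, length):
    """Rebuild the approximation as a weighted sum of the selected atoms."""
    out = [0.0] * length
    for idx, coef in picks:
        for i, a in enumerate(dictionary[idx]):
            out[i] += coef * a
    return out

# Hypothetical redundant dictionary: three unit-norm atoms in a 2-D space.
s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0], [0.0, 1.0], [s, s]]

signal = [2.0, 1.0]
picks, residual = matching_pursuit(signal, atoms, n_iter=3)
approx = reconstruct(picks, atoms, len(signal))
```

Because each step removes the projection onto a unit-norm atom, the signal energy equals the sum of the squared coefficients plus the residual energy. Truncating the list of picks gives a coarser approximation, which is the property an embedded, multi-rate stream can exploit.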
