Abstract

Shape coding in MPEG-4 is intended to improve coding efficiency and to facilitate object-oriented applications, such as shape-based object recognition and retrieval. These applications require both efficient shape compression and effective shape description. Although the two issues have been investigated intensively, but separately, in the data compression and pattern recognition fields, considering both objectives together remains an open problem. To achieve high coding gain, the operational rate-distortion optimal framework can be applied, but the direction restriction of the traditional eight-direction edge encoding structure reduces its compression efficiency and description effectiveness. We present two arbitrary direction edge encoding structures to relax this direction restriction. Each consists of a sector number, a short component, and a long component, which together represent both the direction and the magnitude of an encoding edge. Experiments on both shape coding and hand gesture recognition show that our structures substantially reduce the number of encoding vertices and save up to 48.9% of the bits. In addition, object contours are described effectively, which makes the structures suitable for object-oriented applications. © The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI. (DOI: 10.1117/1.JEI.23.4.043009)

Highlights

  • To facilitate object-oriented storage, retrieval, editing, and interaction, modern multimedia communications require that video content be accessible on an object basis

  • To analyze the performance of the operational rate-distortion (ORD) optimal framework with our two arbitrary direction edge encoding structures in both data compression and pattern recognition fields, various ORD optimal shape coding algorithms with different parameter configurations are applied to both shape coding and hand gesture recognition

  • Admissible vertex band (AVB) type refers to whether an AVB of width 1 pel is used; Edge encoding structure type refers to the choice among the 8-direction, 8-sector, and 16-sector structures; and Code table type refers to the choice between run length codes (RLC) and variable length codes (VLC)
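
To make the idea of a sector-based edge encoding structure concrete, the following is a minimal sketch of how an edge could be split into a sector number, a short component, and a long component. The sector labeling used here (three bits built from the two signs and a "steep" flag) and the function names `encode_edge` and `decode_edge` are assumptions made for illustration; the page does not give the paper's exact definition, and the 16-sector variant is not covered.

```python
def encode_edge(dx, dy):
    """Split an edge (dx, dy) into (sector, short, long).

    Hypothetical 8-sector labeling, not taken from the paper:
      bit 2 = dx is negative, bit 1 = dy is negative,
      bit 0 = the edge is "steep" (|dy| > |dx|).
    The long component is the larger absolute offset, the short
    component the smaller one.
    """
    if dx == 0 and dy == 0:
        raise ValueError("an encoding edge must have non-zero length")
    ax, ay = abs(dx), abs(dy)
    steep = ay > ax
    sector = (int(dx < 0) << 2) | (int(dy < 0) << 1) | int(steep)
    return sector, min(ax, ay), max(ax, ay)


def decode_edge(sector, short, long_):
    """Invert encode_edge and recover (dx, dy)."""
    steep = bool(sector & 1)
    ax, ay = (short, long_) if steep else (long_, short)
    dx = -ax if sector & 4 else ax
    dy = -ay if sector & 2 else ay
    return dx, dy


if __name__ == "__main__":
    # An edge such as (5, 2) is not a multiple of 45 degrees, so a purely
    # eight-direction scheme could not represent it as a single edge.
    print(encode_edge(5, 2))
    print(decode_edge(*encode_edge(-2, 7)))
```

Under this labeling, any non-zero integer offset round-trips through `encode_edge` and `decode_edge`, which is the property that lets a single edge point in an arbitrary direction rather than one of eight fixed ones.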

Summary

Introduction

To facilitate object-oriented storage, retrieval, editing, and interaction, modern multimedia communications require that video content be accessible on an object basis. Because of harsh mobile environments and massive demand for image and video retrieval, a good shape coding scheme has to provide both efficient compression and effective description.[1] These two requirements have been investigated extensively, but separately, in the data compression[2,3,4] and pattern recognition[5,6,7] fields over the past two decades. Considering both objectives jointly in one framework remains an open problem. One candidate approach is the operational rate-distortion (ORD) optimal framework.[1,8,9] It treats vertex selection and encoding jointly as a shortest path problem in a directed acyclic graph (DAG), which guarantees optimality in the rate-distortion (RD) sense.
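
The shortest-path formulation can be illustrated with a small sketch. It assumes an open contour given as a list of integer points, a maximum-distance distortion bound `d_max`, and a made-up rate model `edge_bits`; none of these details come from the paper, and the function names are hypothetical. Each contour point is a DAG node, an edge from point `i` to a later point `j` is admissible when every skipped point lies within `d_max` of the straight segment, and the path with the fewest total bits is found by dynamic programming in index order.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def edge_bits(a, b):
    """Hypothetical rate model: 3 bits for direction plus magnitude bits."""
    length = max(abs(b[0] - a[0]), abs(b[1] - a[1]))
    return 3 + length.bit_length()


def min_rate_vertices(contour, d_max, rate=edge_bits):
    """Return (vertex indices, total bits) of the cheapest admissible path."""
    n = len(contour)
    best = [math.inf] * n  # best[j]: fewest bits needed to reach point j
    prev = [-1] * n        # prev[j]: predecessor of j on that path
    best[0] = 0
    for j in range(1, n):
        for i in range(j):
            # Edge (i, j) is admissible if all skipped points stay within d_max.
            admissible = all(
                point_segment_distance(contour[k], contour[i], contour[j]) <= d_max
                for k in range(i + 1, j)
            )
            if admissible and best[i] + rate(contour[i], contour[j]) < best[j]:
                best[j] = best[i] + rate(contour[i], contour[j])
                prev[j] = i
    path, j = [], n - 1
    while j != -1:
        path.append(j)
        j = prev[j]
    return path[::-1], best[-1]


if __name__ == "__main__":
    l_shape = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
    print(min_rate_vertices(l_shape, d_max=0.25))
```

Because edges only run from lower to higher contour indices, the graph is acyclic and a single pass in index order suffices; no general-purpose shortest-path algorithm is needed. Swapping in a different `rate` function is where an edge encoding structure and code table would enter such a sketch.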
