Arabic-document compression: A close look at group 3 international digital facsimile coding standards

A Kh Al Jabri

doi:10.1016/0920-5489(95)00045-3

Abstract

Efficient bit representation, or compression, of documents is an important issue in many applications. The achievable compression depends on the document contents (written script, diagrams, tables, etc.), which also determine its limit. In CCITT Recommendation T.4, ‘Standardization of group 3 apparatus for document transmission’, a modified Huffman code was chosen as the standard compression technique [1]. The selection was based on examining documents with contents of different natures. Given the cursive nature of printed Arabic and the dominance of certain shapes in it, it is natural to ask how efficiently the chosen standard compresses documents with printed Arabic contents. To this end, more than ten documents containing printed Arabic script were scanned and analyzed in this paper. Both the entropy, based on the Capon model [5], and the compression rates obtained with the modified Huffman code are calculated. Our results suggest that the CCITT coding standard is robust for documents with printed Arabic script.
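The entropy calculation mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes a run-length model in which white and black run lengths are treated as independent sources, so that an upper bound on the compression ratio is (mean white run + mean black run) / (white-run entropy + black-run entropy). The function names (`run_lengths`, `entropy`, `max_compression`) and the 0 = white, 1 = black pel convention are hypothetical choices made for this example.

```python
from collections import Counter
from math import log2


def run_lengths(line):
    """Split one scan line (0 = white, 1 = black) into (color, length) runs."""
    runs = []
    if not line:
        return runs
    current, length = line[0], 1
    for pel in line[1:]:
        if pel == current:
            length += 1
        else:
            runs.append((current, length))
            current, length = pel, 1
    runs.append((current, length))
    return runs


def entropy(counts):
    """Shannon entropy in bits per run of an empirical run-length histogram."""
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


def mean_length(counts):
    """Average run length, in pels, of a run-length histogram."""
    total = sum(counts.values())
    return sum(length * n for length, n in counts.items()) / total


def max_compression(lines):
    """Estimate the compression bound under the assumed run-length model."""
    white, black = Counter(), Counter()
    for line in lines:
        for color, length in run_lengths(line):
            (black if color else white)[length] += 1
    if not white or not black:
        raise ValueError("need at least one white run and one black run")
    bits = entropy(white) + entropy(black)
    if bits == 0:
        raise ValueError("all runs have the same length; bound is unbounded")
    # Pels covered by an average white/black run pair, per bit needed to code it.
    return (mean_length(white) + mean_length(black)) / bits


if __name__ == "__main__":
    # Toy scan line, not data from the paper.
    page = [[0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0]]
    print(run_lengths(page[0]))
    print(round(max_compression(page), 3))
```

Comparing this bound with the ratio actually achieved by the modified Huffman code on the same scanned pages gives a rough measure of how close the standard comes to the model's limit for a given script.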
