Abstract

In this paper we develop a new, reliable, and efficient compression algorithm for Arabic text. The proposed algorithm combines features of the Huffman and Lempel-Ziv algorithms and is expected to reduce the overall compression ratio. Our approach differs from the Huffman algorithm in that it assigns codes to n-gram symbols, where n is a positive integer greater than or equal to one. Whereas the Huffman algorithm assigns a code to each symbol individually, our approach assigns codes to groups of symbols, reducing the average code length per symbol. Our approach differs from the Lempel-Ziv algorithm in that the dictionary we build does not grow in an uncontrolled manner: the length of every dictionary entry is fixed at n, so the size of the dictionary is bounded and can be estimated before the text files to be compressed are processed. For example, if the text file at hand contains m distinct symbols and n is 2, the dictionary we propose to build holds at most m*m entries in the worst case.
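As a rough illustration of the idea, and not the authors' actual implementation, the following Python sketch splits a text into bigrams (n = 2) and assigns standard Huffman codes to those bigrams. With m distinct symbols, the resulting code table can never exceed m*m entries, matching the bound stated above. The function name build_bigram_codes and the space-padding choice are hypothetical.

```python
import heapq
import itertools
from collections import Counter


def build_bigram_codes(text):
    """Assign Huffman codes to 2-gram symbols (bigrams) of `text`.

    The code table is bounded: with m distinct symbols it holds at
    most m*m entries, no matter how long the text is.
    """
    # Pad so the text splits evenly into bigrams (hypothetical choice).
    if len(text) % 2:
        text += " "
    bigrams = [text[i:i + 2] for i in range(0, len(text), 2)]
    freqs = Counter(bigrams)

    # Standard Huffman construction, run over bigram frequencies.
    order = itertools.count()            # tie-breaker for equal weights
    heap = [[w, next(order), [g, ""]] for g, w in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                   # degenerate single-entry case
        return {heap[0][2][0]: "0"}
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        for pair in lo[2:]:
            pair[1] = "0" + pair[1]      # prefix left-branch codes with 0
        for pair in hi[2:]:
            pair[1] = "1" + pair[1]      # prefix right-branch codes with 1
        heapq.heappush(heap, [lo[0] + hi[0], next(order)] + lo[2:] + hi[2:])
    return dict(heap[0][2:])


if __name__ == "__main__":
    codes = build_bigram_codes("ababbaabba")
    encoded = "".join(codes[g] for g in ["ab", "ab", "ba", "ab", "ba"])
    print(codes, encoded)
```

Because each code covers two symbols, frequent symbol pairs receive a single short codeword, which is the source of the expected reduction in average code length per symbol.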
