Abstract

The amount of digital content grows at an ever-increasing rate, and with it the demand to transmit it; storage capacity and bandwidth, by contrast, increase more slowly. Powerful and efficient compression methods are therefore required. The repetition of words and phrases makes the reordered text much more compressible than the original text. In this study a novel, fast, dictionary-based text compression technique, MBRH (Multidictionary with Burrows-Wheeler Transform, Run Length coding and Huffman coding), is proposed to obtain improved performance across various document sizes. The MBRH algorithm comprises two stages: the first converts the input text into a dictionary-based representation; the second reduces the redundancy in the multidictionary output using BWT, RLE and Huffman coding. On the whole, the system is fast and achieves results close to the best on the test files. On the bib test file (input size 111,261 bytes), MBRH achieves a compression ratio of 0.192 and a bit rate of 1.538 at high speed. The algorithm attains a good compression ratio, a reduced bit rate and increased execution speed.

Highlights

  • Data compression is the method of representing information in a compact form

  • The MBRH algorithm comprises two stages: the first converts the input text into a dictionary-based representation; the second reduces the redundancy in the multidictionary output using the Burrows-Wheeler Transform (BWT), Run Length Encoding (RLE) and Huffman coding

  • Compression algorithms require long execution times and large memory because of the large number of symbols in the original source (Carus and Mesut, 2010). Text compression coding can be categorized into two groups: statistical coding and dictionary-based coding


Summary

INTRODUCTION

Data compression is the method of representing information in a compact form. It decreases the number of bits required to represent data. Dictionary-based methods are popular in the data compression domain (Begum and Venkataramani, 2012; Mohan and Govindan, 2005; Sun et al., 2003). Arithmetic coding and PPM are examples of statistical coding (Sayood, 2012); in this scheme, symbols are coded with variable-length codes. Highly redundant texts increase the performance of some text compression algorithms. In multidictionary-based text compression, the words in the input text are transformed into highly redundant codes. The performance in terms of compression ratio is satisfactory; a more efficient algorithm would give still better results.
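The back-end stages named above (BWT to cluster similar symbols, RLE to collapse the resulting runs, and Huffman coding to assign short codes to frequent symbols) can be illustrated with a minimal sketch. This is not the authors' MBRH implementation, only a toy Python version of the three transforms applied to a plain string:

```python
from collections import Counter
import heapq


def bwt(s, sentinel="\0"):
    # Append a sentinel so the transform is invertible, then sort all
    # rotations and keep the last column. Similar characters cluster together.
    s += sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)


def rle(s):
    # Encode the string as a list of (character, run length) pairs.
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        out.append((s[i], j - i))
        i = j
    return out


def huffman_code(freqs):
    # Build a Huffman tree from a symbol -> frequency mapping and return
    # a dict of symbol -> bit string (frequent symbols get shorter codes).
    heap = [[w, i, sym] for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate single-symbol input
        return {heap[0][2]: "0"}
    counter = len(heap)                      # tie-breaker so symbols never compare
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        heapq.heappush(heap, [lo[0] + hi[0], counter, [lo, hi]])
        counter += 1
    codes = {}

    def walk(node, prefix):
        payload = node[2]
        if isinstance(payload, list):        # internal node: recurse
            walk(payload[0], prefix + "0")
            walk(payload[1], prefix + "1")
        else:                                # leaf: record the code
            codes[payload] = prefix
    walk(heap[0], "")
    return codes


text = "banana"
transformed = bwt(text)                      # "annb\x00aa": like characters cluster
runs = rle(transformed)                      # runs of repeated characters shrink
codes = huffman_code(Counter(transformed))   # shorter codes for frequent symbols
```

In MBRH the input to these stages would be the multidictionary codes rather than raw text, but the principle is the same: the BWT reordering creates runs that RLE and Huffman coding then exploit.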

Dictionary Formation
Multidictionary Generation
Run Length Coding
Huffman Coding
Decoding Algorithm
DISCUSSION
CONCLUSION
