Abstract

The amount of digital content grows at an ever-increasing rate, and with it the demand to transmit it; storage capacity and bandwidth, by contrast, increase more slowly. Powerful and efficient compression methods are therefore required. The repetition of words and phrases makes the reordered text much more compressible than the original text. In this study a novel, fast, dictionary-based text compression technique, MBRH (Multidictionary with Burrows-Wheeler Transform, Run Length coding and Huffman coding), is proposed to obtain improved performance across various document sizes. The MBRH algorithm comprises two stages: the first converts the input text into a dictionary-based representation; the second reduces the redundancy in the multidictionary output using BWT, RLE and Huffman coding. On the whole, the system is fast and achieves results close to the best on the test files. On the bib test file (input size 111,261 bytes), MBRH achieves a compression ratio of 0.192 and a bit rate of 1.538 at high speed. The algorithm attains a good compression ratio, a reduced bit rate and increased execution speed.

Highlights

  • Data compression is the method of representing information in a compact form

  • The MBRH algorithm comprises two stages: the first converts the input text into a dictionary-based representation; the second reduces the redundancy in the multidictionary output using the Burrows-Wheeler Transform (BWT), Run Length Encoding (RLE) and Huffman coding

  • Compression algorithms require long execution times and large memory because of the large number of symbols in the original source (Carus and Mesut, 2010). Text compression coding can be categorized into two groups: statistical coding and dictionary-based coding


Summary

INTRODUCTION

Data compression is the method of representing information in a compact form. It decreases the number of bits required to represent data. Dictionary-based methods are popular in the data compression domain (Begum and Venkataramani, 2012; Mohan and Govindan, 2005; Sun et al., 2003). Arithmetic coding and PPM are examples of statistical coding (Sayood, 2012); in this scheme, symbols are coded with variable-length codes. Highly redundant texts increase the performance of some text compression algorithms. In multidictionary-based text compression, the words in the input text are transformed into highly redundant codes. The performance in terms of compression ratio is satisfactory; a more efficient algorithm would give still better results.
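The back-end stages named above (BWT to cluster similar symbols, RLE to collapse the resulting runs, and Huffman coding to assign short codes to frequent symbols) can be illustrated with a minimal sketch. This is not the authors' MBRH implementation, only a toy Python version of the three transforms applied to a plain string:

```python
from collections import Counter
import heapq


def bwt(s, sentinel="\0"):
    # Append a sentinel so the transform is invertible, then sort all
    # rotations and keep the last column. Similar characters cluster together.
    s += sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)


def rle(s):
    # Encode the string as a list of (character, run length) pairs.
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        out.append((s[i], j - i))
        i = j
    return out


def huffman_code(freqs):
    # Build a Huffman tree from a symbol -> frequency mapping and return
    # a dict of symbol -> bit string (frequent symbols get shorter codes).
    heap = [[w, i, sym] for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate single-symbol input
        return {heap[0][2]: "0"}
    counter = len(heap)                      # tie-breaker so symbols never compare
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        heapq.heappush(heap, [lo[0] + hi[0], counter, [lo, hi]])
        counter += 1
    codes = {}

    def walk(node, prefix):
        payload = node[2]
        if isinstance(payload, list):        # internal node: recurse
            walk(payload[0], prefix + "0")
            walk(payload[1], prefix + "1")
        else:                                # leaf: record the code
            codes[payload] = prefix
    walk(heap[0], "")
    return codes


text = "banana"
transformed = bwt(text)                      # "annb\x00aa": like characters cluster
runs = rle(transformed)                      # runs of repeated characters shrink
codes = huffman_code(Counter(transformed))   # shorter codes for frequent symbols
```

In MBRH the input to these stages would be the multidictionary codes rather than raw text, but the principle is the same: the BWT reordering creates runs that RLE and Huffman coding then exploit.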

Dictionary Formation
Multidictionary Generation
Run Length Coding
Huffman Coding
Decoding Algorithm
DISCUSSION
CONCLUSION
