Abstract
Data compression is widely used in Information Retrieval applications such as web search engines and digital libraries to speed up data retrieval. In these applications, universal codes (Elias codes (EC), Fibonacci code (FC), Rice code (RC), Extended Golomb code (EGC), Fast Extended Golomb code (FEGC), etc.) have been preferred over statistical codes (Huffman codes, arithmetic codes, etc.), because universal codes are easier to construct and decode. In this paper, the authors propose two methods for constructing universal codes, based on the ideas used in Rice code and Fast Extended Golomb code. The first method, Re-ordered FEGC (RFEGC), is suitable for representing small, middle and large range integers, whereas Rice code works well for small and middle range integers. It is competitive with FC, EGC and FEGC in representing small, middle and large range integers, and it could be faster in decoding than FC, EGC and FEGC. The second coder, Block-based RFEGC, uses a local divisor rather than a global divisor to improve the performance (both compression and decompression) of RFEGC. To evaluate their coders, the authors applied them to compress the integer values of inverted files constructed from the TREC, Wikipedia and FIRE collections. Experimental results show that the coders achieve better performance (both compression and decompression) for files that contain a significant proportion of middle and large range integers.
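For readers unfamiliar with the Rice code mentioned above, the following is a minimal sketch of the standard Rice code with a single global divisor 2^k. It is not the authors' RFEGC or Block-based RFEGC, whose details the abstract does not give. The function names `rice_encode` and `rice_decode`, the bit-string representation, and the unary convention (q ones followed by a zero) are assumptions made for illustration.

```python
def rice_encode(n: int, k: int) -> str:
    """Encode a non-negative integer n with Rice parameter k (divisor 2**k).

    Illustrative sketch only: returns a string of '0'/'1' characters rather
    than packed bits. The quotient is written in unary (q ones, then a zero),
    followed by the remainder in exactly k binary digits.
    """
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    q = n >> k                      # quotient n // 2**k
    r = n & ((1 << k) - 1)          # remainder n % 2**k
    unary = "1" * q + "0"
    binary = format(r, "b").zfill(k) if k > 0 else ""
    return unary + binary


def rice_decode(bits: str, k: int) -> int:
    """Decode one Rice codeword found at the start of the bit string."""
    q = bits.index("0")             # number of leading ones = quotient
    r = int(bits[q + 1:q + 1 + k], 2) if k > 0 else 0
    return (q << k) | r


if __name__ == "__main__":
    print(rice_encode(9, 2))             # quotient 2, remainder 1
    print(len(rice_encode(1000, 2)))     # unary part dominates for large n
```

With a small k, the unary part grows linearly with the integer: under these conventions, 9 with k = 2 takes 5 bits, while 1000 with k = 2 takes 253 bits. This illustrates why a fixed global divisor suits small and middle range integers better than large ones.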
More From: International Journal of Information Retrieval Research