Abstract

Compression techniques are essential for applications that require many disk accesses. Because uncompressed data needs more disk accesses than compressed data, compression reduces the cost of disk access and makes data retrieval faster. Variable-length codes such as Elias codes, Rice code, and fast extended Golomb code have been used in many applications to compress data. In particular, these codes have been used in information-retrieval applications to compress integers, which form the basis of the indexes used to resolve queries. This paper proposes a new method for representing non-negative integers, based on the ideas used in Rice code and fast extended Golomb code. The variable-length codes produced by the proposed method are suitable for representing small, middle, and large ranges of integers together, whereas Rice code is well suited to only one of these ranges (small, middle, or large). The method also gives a better representation than fast extended Golomb code for most integers from the small to the large range. In this paper, the proposed method is applied to compress the coordinates (integers) used in the R-tree structure, which indexes spatial data. TIGER data collections and synthetic data collections are used in the experiments to evaluate compression performance. The experimental results show that the proposed code achieves a better bit rate than other existing codes on spatial data files that contain a significant distribution of small, middle, and large integers.
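
To illustrate why a single Rice parameter fits only one range of integers, the following is a minimal sketch of the standard Rice code, not the method proposed in this paper. The function names `rice_encode` and `rice_decode`, the bit-string output, and the unary convention (q ones followed by a terminating zero) are assumptions made for illustration only.

```python
def rice_encode(n: int, k: int) -> str:
    """Encode a non-negative integer n with Rice parameter k as a bit string.

    Assumed convention: the quotient n >> k is written in unary as q ones
    followed by a single zero, then the remainder is written in k binary bits.
    """
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    q = n >> k                    # quotient, written in unary
    r = n & ((1 << k) - 1)        # remainder, written in k bits
    unary = "1" * q + "0"
    binary = format(r, f"0{k}b") if k > 0 else ""
    return unary + binary


def rice_decode(bits: str, k: int) -> int:
    """Decode one Rice codeword produced by rice_encode with the same k."""
    q = bits.index("0")           # number of leading ones is the quotient
    r = int(bits[q + 1:q + 1 + k], 2) if k > 0 else 0
    return (q << k) | r


if __name__ == "__main__":
    # Codeword length for a small and a large integer under a small and a large k.
    for n in (5, 1000):
        for k in (2, 8):
            print(n, k, len(rice_encode(n, k)))
```

With a small parameter the unary part grows linearly for large integers, while a large parameter spends its full k remainder bits even on small integers, which is the trade-off the abstract refers to.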
