A new indexing method with high storage utilization and retrieval efficiency for large spatial databases

Hung-Yi Lin,Po-Whei Huang,Kuang-Hua Hsu

doi:10.1016/j.infsof.2006.09.005

Abstract

Storing and querying high-dimensional data are important problems in designing an information retrieval system. Two crucial issues, time and space efficiencies, must be considered when evaluating the performance of such a system. The KDB-tree and its variants have been reported to have good performance by using them as the index structure for retrieving multidimensional data. However, they all suffer from low storage utilization problem caused by imperfect “splitting policies.” Unnecessary splits increase the size of the index structure and deteriorate the performance of the system. In this paper, a new data insertion algorithm with a better splitting policy was proposed, which arranges data entries in the leaf nodes as many as possible. Our new index scheme can increase the storage utilization up to nearly 100% and reduce the index size to a smaller scale. As a result, both time and space efficiencies are significantly improved. Analytical and experimental results show that our indexing method outperforms the traditional KDB-tree and its variants.

Full Text