Abstract

In high-dimensional data domains, the performance of conventional tree-based access structures is occasionally outperformed by simple sequential scans. To this end, the introduction of approximation-based methods helped speed-up queries by providing compact representations of stored data. Approximation methods exploit vector quantization to index data mainly presumed to follow a uniform distribution. In real-world environments however, we mostly encounter both skewed data and query distributions. To address this dual challenge, we propose DiVA that combines the selective use of an approximation approach with an indexing mechanism to organize data subspaces in a high fan-out hierarchical structure. Moreover, DiVA reorganizes its own elements after receiving application hints regarding data access patterns. These hints or policies trigger the restructuring and possible expansion of DiVA so as to offer finer indexing granularity and improved access times in subspaces emerging as ‘hot-spots’. The novelty of our approach lies in the self-organizing nature of DiVA driven by application-provided policies; the latter effectively guide the refinement of DiVA’s elements as new data arrive, existing data are updated and the nature of query workloads continually changes. An extensive experimental evaluation using real data shows that DiVA reduces up-to 64% of the total number of I/Os if compared with state-of-art methods including the VA-file, GC-tree and A-tree.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.