Abstract

In modern database applications, the similarity or dissimilarity of complex objects is examined by performing distance-based queries (DBQs) on data of high dimensionality. The R-tree and its variations are commonly cited multidimensional access methods that can be used for answering such queries. Although the related algorithms work well for low-dimensional data spaces, their performance degrades as the number of dimensions increases (the curse of dimensionality). To obtain acceptable response times in high-dimensional data spaces, algorithms that obtain approximate solutions can be used. Three approximation techniques (α-allowance, N-consider and M-consider) and the respective recursive branch-and-bound algorithms for DBQs are presented and studied in this paper. We investigate the performance of these algorithms for the most representative DBQs (the K-nearest neighbors query and the K-closest pairs query) in high-dimensional data spaces, where the point data sets are indexed by tree-like structures belonging to the R-tree family: R*-trees and X-trees. The search strategy is tuned according to several parameters in order to examine the trade-off between cost (I/O activity and response time) and accuracy of the result. The experimental evaluation identifies the best-performing approximate DBQ algorithm for large high-dimensional point data sets.
