Abstract

Metric trees are designed to improve the efficiency of similarity search in high-dimensional data. Searching a metric tree always terminates at leaf nodes, which restricts the hierarchical search to a fixed leaf level and the linear search within leaves to a fixed leaf size. This may force additional, unneeded distance comparisons, whose cost in search time grows with the number of dimensions. It is proposed that the search adapt itself to the query at hand so as to avoid these unneeded comparisons. The hierarchical search should therefore terminate as soon as further search-space pruning is no longer useful, and the search then continues linearly through the unpruned data space. Hierarchical search can thus stop anywhere above or below the original leaf level of a metric tree. Search rules are proposed that adapt the search to the query parameters, and a modification to the metric tree is suggested to support these rules. High-dimensional gene expression data sets are used to evaluate the new algorithm, showing speedups of 55% compared to traditional search.
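The general idea of stopping the hierarchical descent early can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the search rules or tree modification proposed in the paper, which the abstract does not spell out. The stopping rule used here (stop descending when the query ball overlaps every child, so that nothing can be pruned at that level) is a simple stand-in, the names `Node`, `adaptive_range_search`, and `leaf_size` are hypothetical, and keeping the point list at every node is an assumed way of letting a linear scan start above the leaf level.

```python
import math


class Node:
    """Ball-style metric tree node (hypothetical layout for illustration).

    Assumption: every node keeps its own point list, so a linear scan
    can start at any level, not only at the leaves.
    """

    def __init__(self, points, leaf_size=8):
        self.points = points
        # Centroid and covering radius of the points under this node.
        self.center = tuple(sum(c) / len(points) for c in zip(*points))
        self.radius = max(math.dist(self.center, p) for p in points)
        self.children = []
        if len(points) > leaf_size:
            # Pick two far-apart pivots and assign each point to the nearer one.
            a = max(points, key=lambda p: math.dist(points[0], p))
            b = max(points, key=lambda p: math.dist(a, p))
            left = [p for p in points if math.dist(p, a) <= math.dist(p, b)]
            right = [p for p in points if math.dist(p, a) > math.dist(p, b)]
            if left and right:  # skip the split if all points coincide
                self.children = [Node(left, leaf_size), Node(right, leaf_size)]


def linear_scan(points, q, r):
    """Return all points within distance r of the query q."""
    return [p for p in points if math.dist(q, p) <= r]


def adaptive_range_search(node, q, r):
    """Range search that may stop descending above the leaf level.

    Assumed stopping rule (not the paper's): if the query ball overlaps
    every child, no pruning is possible here, so scan this node linearly.
    """
    if not node.children:
        return linear_scan(node.points, q, r)
    # A child is pruned when its covering ball cannot intersect the query ball.
    survivors = [c for c in node.children
                 if math.dist(q, c.center) <= r + c.radius]
    if len(survivors) == len(node.children):
        return linear_scan(node.points, q, r)
    results = []
    for child in survivors:
        results.extend(adaptive_range_search(child, q, r))
    return results
```

Calling `adaptive_range_search(Node(points), query, radius)` on a list of equal-length coordinate tuples returns the same points as a brute-force scan; only the number of distance computations differs. A large query radius makes the search fall back to a linear scan near the root, while a small radius lets it prune and descend further.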
