An encoding-based dual distance tree high-dimensional index

Yi Zhuang, Fei Wu, Yueting Zhuang

doi:10.1007/s11432-008-0104-3

Abstract

The paper proposes a novel symmetrical encoding-based index structure, called the EDD-tree (encoding-based dual distance tree), to support fast k-nearest neighbor (k-NN) search in high-dimensional spaces. In the EDD-tree, all data points are first grouped into clusters by a k-means clustering algorithm. The uniform ID number of each data point is then obtained by a dual-distance-driven encoding scheme, in which each cluster sphere is partitioned twice according to the dual distances: the start-distance and the centroid-distance. Finally, the uniform ID number and the centroid-distance of each data point are combined into a uniform index key, which is indexed through a partition-based B+-tree. Thus, given a query point, its k-NN search in a high-dimensional space is transformed into a search in a one-dimensional space with the aid of the EDD-tree index. Extensive performance studies evaluate the effectiveness and efficiency of the proposed scheme, and the results demonstrate that the method outperforms state-of-the-art high-dimensional search techniques such as the X-tree, VA-file, iDistance and NB-tree, especially when the query radius is not very large.
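The pipeline described in the abstract (cluster, encode by two distances, build a one-dimensional key, search that key space) can be illustrated with a minimal Python sketch. The abstract does not give the exact encoding, so everything concrete here is an assumption made for illustration: the class name `DualDistanceIndex`, the `slices` and `max_dist` parameters, the key formula, the use of the origin as the "start" reference point, and the use of a sorted list with `bisect` as a stand-in for the partition-based B+-tree. Centroids are assumed to be precomputed (for example by k-means), and the sketch answers a range query rather than a full k-NN query.

```python
import bisect
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class DualDistanceIndex:
    """Toy 1-D index; NOT the paper's exact EDD-tree encoding (hypothetical scheme)."""

    def __init__(self, points, centroids, start, slices=4, max_dist=2.0):
        self.points, self.centroids, self.start = points, centroids, start
        self.slices = slices
        self.max_dist = max_dist        # assumed upper bound on every distance
        self.width = max_dist / slices  # width of one distance slice
        self.scale = 2.0 * max_dist     # gap that keeps different IDs disjoint
        entries = []
        for i, p in enumerate(points):
            # Assign the point to its nearest (precomputed) centroid.
            j = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            dc = dist(p, centroids[j])  # centroid-distance
            ds = dist(p, start)         # start-distance
            entries.append((self._key(j, self._slice(ds), self._slice(dc), dc), i))
        entries.sort()                  # sorted list stands in for a B+-tree
        self.keys = [k for k, _ in entries]
        self.ids = [i for _, i in entries]

    def _slice(self, d):
        """Slice number of a distance, clamped to the last slice."""
        return min(int(d / self.width), self.slices - 1)

    def _key(self, j, a, b, dc):
        """Combine (cluster, start-slice, centroid-slice) into an ID, then add dc."""
        uid = (j * self.slices + a) * self.slices + b
        return uid * self.scale + dc

    def range_query(self, q, r):
        """Return sorted indices of all points within distance r of q."""
        eps = 1e-9  # guards against floating-point boundary effects
        dqs = dist(q, self.start)
        # Triangle inequality: a hit's start-distance lies in [dqs - r, dqs + r].
        a_lo = self._slice(max(0.0, dqs - r - eps))
        a_hi = self._slice(dqs + r + eps)
        hits = []
        for j, c in enumerate(self.centroids):
            dqc = dist(q, c)
            # Likewise, a hit's centroid-distance lies in [dqc - r, dqc + r].
            lo = max(0.0, dqc - r - eps)
            hi = min(self.max_dist, dqc + r + eps)
            for a in range(a_lo, a_hi + 1):
                for b in range(self._slice(lo), self._slice(hi) + 1):
                    left = bisect.bisect_left(self.keys, self._key(j, a, b, lo))
                    right = bisect.bisect_right(self.keys, self._key(j, a, b, hi))
                    for i in self.ids[left:right]:
                        # Candidates are verified with the real distance.
                        if dist(q, self.points[i]) <= r:
                            hits.append(i)
        return sorted(hits)

# Usage on synthetic data: 200 random points in the 4-D unit cube.
random.seed(0)
data = [[random.random() for _ in range(4)] for _ in range(200)]
index = DualDistanceIndex(data, centroids=data[:3], start=[0.0] * 4)
print(index.range_query([0.5, 0.5, 0.5, 0.5], 0.3))
```

The point of the sketch is that both distances prune candidates before any high-dimensional distance is computed: the start-distance and centroid-distance slices select which key ranges to scan, and the centroid-distance narrows each scan further. A k-NN search could be layered on top by growing the radius until k verified hits are found, though whether the EDD-tree does exactly this is not stated in the abstract.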

More From: Science in China Series F: Information Sciences

Similar Papers

Indexing high-dimensional data in dual distance spaces
Yi Zhuang ... Qing Li
25 Mar 2008

Composite Distance Transformation for Indexing and k-Nearest-Neighbor Searching in High-Dimensional Spaces
Yi Zhuang ... Yue-Ting Zhuang
Journal of Computer Science and Technology | VOL. 22
01 Mar 2007

IPoc: A Polar Coordinate Based Indexing Method for Nearest Neighbor Search in High Dimensional Space
Zhang Liu ... Jianmin Wang
01 Jan 2009

Novel approach for nearest neighbor search in high dimensional space
Ming Zhang ... Reda Alhajj
01 Sep 2008
