Modeling high-dimensional index structures using sampling

Christian A Lang, Ambuj K Singh

doi:10.1145/376284.375716

Abstract

A large number of index structures for high-dimensional data have been proposed previously. In order to tune and compare such index structures, it is vital to have efficient cost prediction techniques for these structures. Previous techniques either assume uniformity of the data or are not applicable to high-dimensional data. We propose the use of sampling to predict the number of index pages accessed during query execution. Sampling is independent of the dimensionality and preserves clusters, which is important for representing skewed data. We present a general model for estimating the index page layout using sampling and show how to compensate for errors. We then give an implementation of our model under restricted memory assumptions and show that it performs well even under these constraints. Errors are minimal and the overall prediction time is up to two orders of magnitude below the time for building and probing the full index without sampling.
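The core idea of the abstract, predicting page accesses from a sample rather than from the full index, can be illustrated with a minimal sketch. This is not the authors' model: the sort-based page packing, the function names (`build_pages`, `count_accessed`, `predict_accesses`), and the scaling of page capacity by the sampling ratio are all assumptions made for illustration, and the sketch omits the error compensation the abstract mentions.

```python
import random

def build_pages(points, capacity):
    """Pack points into leaf pages of at most `capacity` points.

    Hypothetical layout: points are sorted lexicographically and cut into
    consecutive chunks; each page is represented by its bounding rectangle
    as a (lows, highs) pair of coordinate tuples.
    """
    pts = sorted(points)
    pages = []
    for i in range(0, len(pts), capacity):
        chunk = pts[i:i + capacity]
        lows = tuple(min(c) for c in zip(*chunk))
        highs = tuple(max(c) for c in zip(*chunk))
        pages.append((lows, highs))
    return pages

def count_accessed(pages, q_low, q_high):
    """Count pages whose bounding rectangle intersects the range query."""
    return sum(
        all(lo <= qh and hi >= ql
            for lo, hi, ql, qh in zip(lows, highs, q_low, q_high))
        for lows, highs in pages
    )

def predict_accesses(points, capacity, ratio, q_low, q_high, seed=0):
    """Estimate page accesses from a random sample of the data.

    Assumption: page capacity is scaled by the sampling ratio so that the
    sampled layout has roughly as many pages as the full layout. No
    compensation is applied for bounding rectangles that shrink because
    they are built from fewer points.
    """
    rng = random.Random(seed)
    k = max(1, int(len(points) * ratio))
    sample = rng.sample(points, k)
    scaled_capacity = max(1, int(capacity * ratio))
    pages = build_pages(sample, scaled_capacity)
    return count_accessed(pages, q_low, q_high)
```

Comparing `predict_accesses(points, capacity, 0.1, q_low, q_high)` with `count_accessed(build_pages(points, capacity), q_low, q_high)` on the same list of coordinate tuples gives a rough sense of the prediction error that an uncompensated sample introduces.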

Journal: ACM SIGMOD Record
Publication Date: May 1, 2001
Citations: 4

Similar Papers

Modeling high-dimensional index structures using sampling
Christian A Lang ... Ambuj K Singh
01 May 2001

A new index structure combines a cluster algorithm with block distance
Lifang Yang ... Fengfeng Duan
01 Oct 2015

Efficient indexing structures for fast media search and browsing
Marco Teixeira ... Joao Magalhaes
01 Jun 2011

PK-Tree: A Spatial Index Structure for High Dimensional Point Data
Wei Wang ... Richard Muntz
01 Jan 1999
