Abstract

k-closest pair (KCP for short) search is a fundamental problem in database research. Given a set of d-dimensional streaming data S, KCP search aims to retrieve the k pairs with the shortest distances between them. While existing works have studied the continuous 1-closest pair query (i.e., k=1) over dynamic data environments, which allow for object insertions/deletions, they incur high computational costs and cannot easily support KCP search with k>1. This paper investigates the problem of KCP search over data streams, aiming to incrementally maintain as few pairs as possible to support KCP search with arbitrary k. To achieve this, we introduce the concept of NNS (short for Nearest Neighbour pair-Set), which consists of all the nearest neighbour pairs and allows us to support KCP search by accessing only O(k) objects. We further observe that in most cases, we only need to use a small portion of NNS to answer KCP search, as typically k ≪ n. Based on this observation, we propose TNNS (short for Threshold-based NN pair Set), which contains a small number of high-quality NN pairs, and a partition named τ-DLBP (short for τ-Distance Lower-Bound based Partition) to organize objects, with τ being an integer significantly smaller than n. τ-DLBP organizes objects using up to O(log n / τ) partitions and is able to support the construction and update of TNNS efficiently.
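To make the two central notions concrete, the following is a minimal brute-force sketch in Python of a static KCP query and of a nearest neighbour pair set. It is not the paper's method: it ignores streaming, TNNS, and τ-DLBP entirely, and the function names `k_closest_pairs` and `nn_pairs`, the Euclidean metric, and the sample points are all assumptions made for illustration.

```python
import heapq
import math
from itertools import combinations


def k_closest_pairs(points, k):
    """Brute-force KCP: return the k tuples (distance, i, j) with the
    smallest Euclidean distances, checking all O(n^2) pairs."""
    pairs = (
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    return heapq.nsmallest(k, pairs)


def nn_pairs(points):
    """Hypothetical stand-in for an NN pair set: pair every object with
    its nearest neighbour (brute force) and merge mutual duplicates."""
    result = set()
    for i, p in enumerate(points):
        d, j = min(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        result.add((d, min(i, j), max(i, j)))
    return sorted(result)


# Illustrative data only, not taken from the paper.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.5), (10.0, 0.0)]

top2 = k_closest_pairs(points, 2)
nns = nn_pairs(points)
```

On this sample, the NN pair set holds three pairs, far fewer than the ten candidate pairs the brute-force query examines, and the single closest pair is one of them. That is the intuition behind maintaining NN pairs rather than all pairs; how the paper answers queries with k>1 from them is not reproduced here.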
