TJJE: An efficient algorithm for top-k join on massive data

Xixian Han, Jianzhong Li, Jinbao Wang, Donghua Yang

doi:10.1016/j.ins.2012.08.013

Abstract

In many applications, top-k join is an important operation: it returns the k most important join tuples from a potentially huge answer space according to a given ranking function. PBRJ is an algorithm template that generalizes previous top-k join algorithms. The analysis in this paper shows that PBRJ must maintain a large number of candidate tuples on massive data. Based on this analysis, the paper proposes TJJE, a novel top-k join algorithm suited to massive data. Using pre-computed information, TJJE first estimates an upper bound on the scan depth of each joined table. It then determines the file that contains the join positional index pairs of the top-k join results. A novel algorithm retrieves the required join tuples with a single sequential and selective scan of the joined tables. Finally, the top-k join results are obtained by a single scan of the retrieved join tuples. A correctness proof and a cost analysis of TJJE are presented. Extensive experiments show that TJJE maintains up to three orders of magnitude fewer candidate tuples than PBRJ and achieves up to one order of magnitude speedup.
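To make the top-k join operation itself concrete, here is a minimal Python sketch of a naive baseline. It is neither TJJE nor PBRJ: it enumerates every join tuple and keeps only the k best in a bounded heap. The function name `naive_top_k_join`, the `(join_key, local_score)` row layout, and the additive ranking function are assumptions made for illustration, not details taken from the paper.

```python
import heapq
from collections import defaultdict


def naive_top_k_join(left, right, k, score=lambda ls, rs: ls + rs):
    """Return the k highest-scoring join tuples, best first.

    Hypothetical row layout: each row is a (join_key, local_score) pair.
    The default ranking function (sum of local scores) is an assumption.
    """
    if k <= 0:
        return []

    # Index the right table by join key (a plain hash join).
    right_by_key = defaultdict(list)
    for key, right_score in right:
        right_by_key[key].append(right_score)

    # Min-heap holding at most k candidates; heap[0] is the weakest one.
    heap = []
    for key, left_score in left:
        for right_score in right_by_key.get(key, ()):
            candidate = (score(left_score, right_score), key, left_score, right_score)
            if len(heap) < k:
                heapq.heappush(heap, candidate)
            elif candidate > heap[0]:
                # The new join tuple beats the current weakest candidate.
                heapq.heapreplace(heap, candidate)

    return sorted(heap, reverse=True)
```

This baseline stores only k candidates but still visits the entire join result, which is the cost that rank-aware algorithms aim to avoid by stopping their scans early.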

Journal: Information Sciences
Publication Date: Aug 21, 2012
Citations: 4
