Layered Solution for SLCA Problem in XML Information Retrieval

Ling-Bo Kong

doi:10.1360/jos180919

Abstract

The SLCA (smallest lowest common ancestor) problem is a basic task of keyword search in XML information retrieval. The goal is to find all nodes that root the tightest subtrees of the XML data containing all the given keywords. Xu et al. present three algorithms: indexed lookup eager (ILE), the stack algorithm, and scan eager (SE), and show that ILE has the best performance. Unlike the ILE algorithm, which relies on a complicated B+-tree, this paper proposes a layered solution to the SLCA problem named LISA (layered intersection scan algorithm). It exploits the distribution rule of SLCA nodes in the XML tree and computes the SLCA nodes level by level, starting from the deepest level. That is, given the Dewey codes retrieved for the given keywords, the Dewey codes of the SLCA nodes are obtained by intersecting the prefix Dewey codes at each level. Compared with the ILE algorithm, LISA needs no sophisticated data structures and has comparable runtime performance. Two instances follow the LISA idea, called LISA I and LISA II. They differ in whether the computation keeps the Dewey codes as they are or transforms them into integer sequences. Extensive experiments evaluate the performance of the algorithms and demonstrate the efficiency of LISA II.
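The level-by-level intersection described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's LISA I or LISA II implementation, whose details the abstract does not give. It assumes Dewey codes are written as dot-separated integers and that each keyword comes with the set of Dewey codes of the nodes containing it. The names `parse_dewey` and `slca_layered` are hypothetical.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Dewey = Tuple[int, ...]


def parse_dewey(code: str) -> Dewey:
    """Turn a dot-separated Dewey code such as '1.2.1' into a tuple (1, 2, 1)."""
    return tuple(int(part) for part in code.split("."))


def slca_layered(keyword_lists: Sequence[Iterable[Dewey]]) -> List[Dewey]:
    """Illustrative layered intersection (an assumption, not the paper's code).

    keyword_lists holds, for every keyword, the Dewey codes of the nodes
    that directly contain that keyword.
    """
    lists: List[Set[Dewey]] = [set(codes) for codes in keyword_lists]
    if not lists or any(not codes for codes in lists):
        return []

    # A node at a given level can cover all keywords only if every keyword
    # has at least one code that is that long.
    max_depth = min(max(len(c) for c in codes) for codes in lists)

    result: List[Dewey] = []
    deeper: Set[Dewey] = set()  # nodes one level down that cover all keywords
    for level in range(max_depth, 0, -1):  # deepest level runs first
        # Prefixes of length `level` for each keyword, then their intersection.
        layers = [{c[:level] for c in codes if len(c) >= level} for codes in lists]
        common = set.intersection(*layers)
        # A node is not "smallest" if one of its children already covers all keywords.
        blocked = {c[:level] for c in deeper}
        result.extend(common - blocked)
        deeper = common
    return sorted(result)


xml_nodes = [parse_dewey(c) for c in ("1.1.1", "1.2.1", "1.3")]
search_nodes = [parse_dewey(c) for c in ("1.1.2", "1.2.2.1", "1.4")]
print(slca_layered([xml_nodes, search_nodes]))
```

In this made-up example the nodes 1.1 and 1.2 each contain both keywords, while the root 1 is excluded because its children already do. The sketch recomputes prefix sets at every level for clarity; a tuned version would presumably avoid that repeated work.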

Journal: Journal of Software
Publication Date: Jan 1, 2007
Citations: 27

Similar Papers

Effective XML keyword query processing
Prashant R Lambole ... Prashant N Chatur
01 Apr 2017

IMBBTC: XML Document Indexing Model Based on Binary Tree Coding
Zhixin Hu
01 Jan 2015

Efficient SLCA-Based Keyword Search on XML Databases: An Iterative-Skip Approach
Jiaqi Zhang ... Wei Wang
01 Jan 2009

Searching XML data by SLCA on a MapReduce cluster
Mengjie Zhou ... Haoji Hu
01 Oct 2010
