Ext-LOUDS: A Space Efficient Extended LOUDS Index for Superset Query

Lianyin Jia, Runxin Li, Jiaman Ding, Jinguo You, Yuna Zhang, Yinong Chen

doi:10.3390/app10238530

Abstract

Superset query is widely used in object-oriented databases, data mining, and many other fields. A trie is an efficient index for superset query, but most existing trie indexes aim at improving query performance while ignoring storage overhead. To solve this problem, we propose an efficient extended Level-Ordered Unary Degree Sequence (LOUDS) index: Ext-LOUDS. Ext-LOUDS represents a trie with one integer vector and three bit vectors and directly maps each NodeID to its corresponding position, thus accelerating some key operations needed for superset query. Based on Ext-LOUDS, an efficient superset query algorithm, ELOUDS-Super, is designed. Experimental results on both real and synthetic datasets show that Ext-LOUDS reduces space overhead by 50%–60% compared with a trie while maintaining relatively good query performance.

Highlights

With the rapid development of e-commerce, the Internet of Things, and many other fields, both the scale and complexity of data are increasing.
We focus on superset query: given a query set Q, retrieve all sets in a dataset D that are subsets of Q (that is, Q is a superset of each returned set).
The superset query algorithm proceeds as follows: perform a SELECT operation on the IsFirstChild vector to obtain the starting position pstart and the ending position pend of the child nodes of the node indicated by node_num; perform a binary search to obtain the position p of the current query element Q[level] in the Elems vector; if the node corresponding to p is an end node, perform a RANK operation on the IsEnd vector to obtain the qualifying sets for that node and merge them into the result set; if the node corresponding to p has child nodes, obtain its internalID and execute the algorithm recursively.
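The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation. The vector names Elems, IsFirstChild, and IsEnd come from the text; the HasChild vector, the level-order layout, the end_ids lookup table, and the function names build_ext_louds, rank1, select1, and superset_query are hypothetical choices made for this example. RANK and SELECT are implemented as linear scans for clarity, whereas a succinct index would answer them in constant time. The sketch also assumes the sets in the dataset are distinct and non-empty.

```python
from bisect import bisect_left
from collections import deque


def build_ext_louds(sets):
    """Build one integer vector and three bit vectors (assumed layout)."""
    # Temporary pointer trie over sorted elements: node = [children, set_index].
    root = [{}, None]
    for idx, s in enumerate(sets):
        node = root
        for e in sorted(s):
            node = node[0].setdefault(e, [{}, None])
        node[1] = idx
    elems, is_first, has_child, is_end, end_ids = [], [], [], [], []
    queue = deque([root])  # holds only nodes that have children
    while queue:  # level-order traversal; the root itself is not stored
        node = queue.popleft()
        for k, label in enumerate(sorted(node[0])):
            child = node[0][label]
            elems.append(label)
            is_first.append(1 if k == 0 else 0)
            has_child.append(1 if child[0] else 0)
            is_end.append(1 if child[1] is not None else 0)
            if child[1] is not None:
                end_ids.append(child[1])  # i-th end node -> original set index
            if child[0]:
                queue.append(child)
    return elems, is_first, has_child, is_end, end_ids


def rank1(bits, p):
    """Number of 1 bits in bits[0..p], inclusive (naive scan)."""
    return sum(bits[: p + 1])


def select1(bits, k):
    """Position of the k-th 1 bit (k >= 1), or len(bits) if there is none."""
    seen = 0
    for i, b in enumerate(bits):
        seen += b
        if seen == k:
            return i
    return len(bits)


def superset_query(index, query):
    """Return the indices of all indexed sets that are subsets of query."""
    elems, is_first, has_child, is_end, end_ids = index
    q = sorted(set(query))
    result = []

    def search(node_num, level):
        # Children of internal node node_num occupy elems[pstart:pend].
        pstart = select1(is_first, node_num + 1)
        pend = select1(is_first, node_num + 2)
        for i in range(level, len(q)):
            # Binary search for q[i] among the sorted child labels.
            p = bisect_left(elems, q[i], pstart, pend)
            if p == pend or elems[p] != q[i]:
                continue
            if is_end[p]:
                result.append(end_ids[rank1(is_end, p) - 1])
            if has_child[p]:
                # The root has internal ID 0, so RANK gives the child's ID.
                search(rank1(has_child, p), i + 1)

    if elems:
        search(0, 0)
    return sorted(result)
```

For the dataset [{1, 2}, {1, 3}, {2}, {2, 3, 4}, {5}] and the query {1, 2, 3}, superset_query returns [0, 1, 2], the indices of {1, 2}, {1, 3}, and {2}. Numbering internal nodes in level order is what lets a single RANK on HasChild followed by a SELECT on IsFirstChild locate a node's children without storing any pointers.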

Summary

Introduction

With the rapid development of e-commerce, the Internet of Things, and many other fields, both the scale and complexity of data are increasing. To support set queries effectively, a trie often needs to be extended with additional attributes (e.g., the prefix set of the current node [10] or a link to the node with the same label [9]), which are usually byte or integer types. These pointers and extensions inevitably increase the space overhead of the trie, limiting its scalability, especially on large datasets. Some recent works [18,19,20] have studied efficient RANK and SELECT operations to improve the retrieval performance of LOUDS. He et al. [21] designed a succinct structure that supports mapping between the preorder ranks and level-order ranks of nodes in constant time. Experimental results on two real datasets show that Ext-LOUDS can reduce space overhead by up to 50%–60% without significantly reducing query performance.

Set Superset Query

Ext-LOUDS

ELOUDS-Super Algorithm

Algorithm Complexity Analysis

Experimental Environment and Datasets

Real Datasets

Synthetic Datasets

Experiments on Real Datasets

4.2.2. Experiments

Experiments on Synthetic Datasets

Findings

Conclusions and Future

Journal: Applied Sciences	Publication Date: Nov 28, 2020
Citations: 1	License type: CC BY 4.0
