Abstract

Reachability queries play a crucial role in accessing relationships between nodes in tree-structured data. Previous studies have proposed prime number labeling schemes that answer reachability queries using arithmetic operations. However, the prime numbers in these schemes can become very large when a tree contains a considerable number of nodes; thus, these schemes do not scale. Recently, a repetitive prime number labeling scheme that reduces space requirements was proposed. Unfortunately, it suffers from slow query processing, owing to the complexity of its reachability test. In this paper, we propose a more efficient method for answering reachability queries in a repetitive prime number labeling scheme. The results of experiments using real-world XML datasets show that our approach reduces reachability query processing times.

Highlights

  • A vast amount of tree-structured data on diverse domains is available in eXtensible Markup Language (XML) files on the Web, such as SwissProt, DBLP, and Treebank

  • We focus on a prime number labeling scheme that does not require all the nodes to be re-labeled when some nodes are updated

  • A drawback of these approaches is that their inefficient method for performing reachability tests significantly reduces their usability in the case of large datasets

Summary

Introduction

A vast amount of tree-structured data on diverse domains is available in eXtensible Markup Language (XML) files on the Web. A query that determines whether a path exists between the two nodes of a given pair (source and target) is important for trees. It can be regarded as a reachability or ancestor–descendant query. An approach has been proposed that uses the MapReduce framework to perform prime number labeling of massive XML data [7]. The shortcomings of these schemes become apparent when the number of nodes is very large, because the size of the self labels increases as well. They did not consider the performance of answering reachability queries. We propose an efficient method for answering reachability queries in REP
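To make the arithmetic reachability test concrete, the following is a minimal sketch of the classic product-of-primes idea behind prime number labeling. It is not the repetitive scheme or the method proposed in this paper. It assumes that each non-root node receives a distinct prime as its self label, that a node's label is its parent's label multiplied by its self label, and that the root is labeled 1. The names `primes`, `label_tree`, and `is_ancestor`, and the toy tree, are hypothetical and chosen only for illustration.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small demo)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def label_tree(children, root):
    """Label each node with (parent's label) * (a fresh prime self label)."""
    gen = primes()
    labels = {root: 1}  # assumption: the root carries label 1
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            labels[child] = labels[node] * next(gen)
            stack.append(child)
    return labels


def is_ancestor(labels, u, v):
    """u is a proper ancestor of v iff u's label divides v's label."""
    return u != v and labels[v] % labels[u] == 0


# Toy tree: a is the root, b and c are its children, d is a child of b.
children = {"a": ["b", "c"], "b": ["d"]}
labels = label_tree(children, "a")
print(is_ancestor(labels, "b", "d"))  # b lies on the path from the root to d
print(is_ancestor(labels, "c", "d"))  # c does not
```

Because each label is a product of the primes along the path from the root, labels grow quickly with tree size and depth, which is the scalability concern raised above.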

Repetitive Prime Number Labeling Scheme
Improving on Answering Reachability Queries
Evaluation
Labeling Time
Space Requirements
Query Processing Time
Conclusions
