Abstract

The growing importance of XML and the lack of efficient solutions for managing and querying XML data have led to the development of hybrid systems. We present a hybrid system, TwigX-Guide, which extends the well-known DataGuide index and region encoding labeling to support twig query processing. With TwigX-Guide, a complex query is decomposed into a set of path queries. Each path query is evaluated individually by retrieving its path or node matches from the DataGuide index table, and the results are then joined using the holistic twig join algorithm TwigStack. TwigX-Guide improves the performance of TwigStack for queries with parent-child relationships and mixed relationships by reducing the number of joins needed to evaluate a query. Experimental results indicate that, in terms of execution time for path and twig queries, TwigX-Guide outperforms TwigStack by 38%, TwigINLAB by 29%, and TwigStackList by 11% on average.

Highlights

  • With the rapid emergence of XML as an enabler for data exchange and data transfer over the Web, querying XML data has become a major concern

  • DataGuide [1] indexes each distinct raw data path to facilitate the evaluation of simple path expressions

  • We propose the TwigX-Guide system architecture, which extends the existing DataGuide and TwigStack to accelerate twig query processing

Summary

Introduction

With the rapid emergence of XML as an enabler for data exchange and data transfer over the Web, querying XML data has become a major concern. Since XML is semi-structured data, there are typically two types of user queries: full-text queries (keyword-based search) and structural queries (complex queries specified as tree-like structures). To address the performance degradation caused by excessive traversal of the XML tree, many researchers have complemented query processing with indexes. These indexes reduce the search scope by creating and traversing a path summary of the XML tree instead of the tree itself. DataGuide [1] indexes each distinct raw data path to facilitate the evaluation of simple path expressions. Although path indexing greatly speeds up the evaluation of path queries with parent-child (P-C) edges, it needs expensive join operations to process queries with multiple branches and queries with ancestor-descendant (A-D) edges.
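The idea of indexing each distinct path and joining branch results can be illustrated with a small Python sketch. It is not the TwigX-Guide implementation: the `(start, end, level)` region labels, the function names `build_path_index` and `match_twig`, and the sample document are assumptions made for illustration, and the join is a naive nested loop rather than TwigStack.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document, used only for illustration.
DOC = "<bib><book><title/><author/></book><book><title/></book></bib>"


def build_path_index(xml_text):
    """Map each distinct root-to-node label path to the region labels
    (start, end, level) of the elements reached by that path."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    counter = 0

    def visit(node, parent_path, level):
        nonlocal counter
        counter += 1
        start = counter              # position assigned on entering the element
        path = parent_path + "/" + node.tag
        for child in node:
            visit(child, path, level + 1)
        counter += 1                 # position assigned on leaving the element
        index[path].append((start, counter, level))

    visit(root, "", 1)
    return dict(index)


def is_ancestor(a, d):
    """A-D test: region a strictly contains region d."""
    return a[0] < d[0] and d[1] < a[1]


def is_parent(a, d):
    """P-C test: containment plus a level difference of exactly one."""
    return is_ancestor(a, d) and a[2] + 1 == d[2]


def match_twig(index, branch_path, child_tags, return_tag):
    """Naive join (not TwigStack): keep branch nodes that have a child for
    every tag in child_tags, and return their return_tag children."""
    hits = []
    for b in index.get(branch_path, []):
        has_all = all(
            any(is_parent(b, c) for c in index.get(f"{branch_path}/{tag}", []))
            for tag in child_tags
        )
        if has_all:
            hits.extend(
                r for r in index.get(f"{branch_path}/{return_tag}", [])
                if is_parent(b, r)
            )
    return hits


if __name__ == "__main__":
    idx = build_path_index(DOC)
    # A simple P-C path query is a single dictionary lookup, with no join.
    print(idx["/bib/book/title"])
    # A twig such as book[author]/title needs one join at the branch node.
    print(match_twig(idx, "/bib/book", ["author", "title"], "title"))
```

In this sketch a path query with only P-C edges costs one lookup, while the branching query still needs a structural join at the `book` node, which is the cost the text attributes to path indexes on their own.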

