Abstract

XML (Extensible Markup Language) has emerged as one of the most popular data representation standards for information storage and exchange. In this paper, we propose an extended INLAB architecture, INLAB2, which focuses on preprocessing the XML document for fast native storage and accurate query retrieval. First, we propose our xParse parser to check the well-formedness of an XML document. Next, we use a (self-end) labeling scheme to encode each element in the XML database by its positional information, so that parent-child (P-C) and ancestor-descendant (A-D) relationships between nodes can be established from the labels. Subsequently, our TwigINLAB2 algorithm is used to optimize query retrieval. TwigINLAB2 is a generalization of TwigStack, a stack-based algorithm for matching twig queries. However, TwigStack is efficient only for queries with A-D relationships. To overcome this limitation, we enhance query retrieval by using indices to speed up the matching and merging phases. Experimental results indicate that, on average, TwigINLAB2 processes twig queries 23% better than the TwigStack algorithm and 10% better than TwigINLAB1 in terms of execution time.
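To illustrate how positional labels can decide P-C and A-D relationships without traversing the tree, the following is a minimal sketch in Python. It is not the INLAB2 implementation: the abstract does not define the (self-end) label, so the sketch assumes that `self_pos` is an element's preorder number, `end_pos` is the largest preorder number in its subtree, and a `level` field records depth. The names `Label`, `label_document`, `is_ancestor`, and `is_parent` are hypothetical, and the standard library parser stands in for xParse.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    tag: str
    self_pos: int  # assumed: preorder number of the element
    end_pos: int   # assumed: largest preorder number in its subtree
    level: int     # assumed: depth in the tree, root = 1


def label_document(xml_text):
    """Parse the document and return one Label per element, in document order."""
    # ET.fromstring raises ParseError if the document is not well-formed.
    root = ET.fromstring(xml_text)
    labels = []
    counter = 0

    def visit(elem, level):
        nonlocal counter
        counter += 1
        self_pos = counter
        slot = len(labels)
        labels.append(None)  # reserve the slot to keep document order
        for child in elem:
            visit(child, level + 1)
        # After visiting all descendants, counter is the last position in the subtree.
        labels[slot] = Label(elem.tag, self_pos, counter, level)

    visit(root, 1)
    return labels


def is_ancestor(a, d):
    """A-D test: d lies strictly inside the positional range of a."""
    return a.self_pos < d.self_pos <= a.end_pos


def is_parent(p, c):
    """P-C test: an A-D pair whose levels differ by exactly one."""
    return is_ancestor(p, c) and c.level == p.level + 1


book, title, author, name = label_document(
    "<book><title/><author><name/></author></book>"
)
```

Under these assumptions, `is_ancestor(book, name)` holds while `is_parent(book, name)` does not, because `name` is two levels below `book`. Both checks are constant-time comparisons on the labels, which is the property a twig-matching algorithm relies on when merging element streams.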

