Abstract

Several applications based on XML stream processing have recently emerged, such as those for air traffic control and the selective dissemination of information (SDI). Their common need is to process a large number of XPath expressions in continuous XML streams at high throughput.This paper proposes four techniques for XPath expression processing based on Deterministic Finite Automata (DFA) for two purposes: to improve the memory usage efficiency of the automata and to support the processing of branching XPath expressions. The first technique, called n-DFA, clusters the given XPath expressions into n clusters to reduce the number of DFA states. The second, called shared NFA state table, lets the Non-Deterministic Finite Automata (NFA) state set be shared among the DFA states. Our experiments show that memory usage in an 8-DFA can, with the shared NFA state table, be reduced to 1/40th that of the original 1-DFA. The optimized NFA conversion and general XPath expression processing algorithm techniques contribute to the processing of branching XPath expressions efficiently; overall performance is better than is possible with earlier approaches.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.