Abstract
Because of the high volume and unpredictable arrival rate, stream processing systems may not always be able to keep up with the input data streams - resulting in buffer overflow and uncontrolled loss of data. Load shedding, the prevalent strategy for solving this overflow problem, has so far only been considered for relational stream processing, but not for XML. Shedding applied to XML stream processing brings new opportunities and challenges due to complex nested nature of XML structures. In this paper, we tackle this unsolved XML shedding problem using a three-pronged approach. First, we develop an XQuery preference model that enables users to specify the relative importance of preserving different subpatterns in the XML result structure. This transforms shedding into the problem of rewriting the user query into shed queries that return approximate query answers with utility as measured by the given user preference model. Second, we develop a cost model to compare the performance of alternate shed queries. Third, we develop two shedding algorithms, OptShed and FastShed. OptShed guarantees to find an optimal solution however at the cost of exponential complexity. FastShed, as confirmed by our experiments, achieves a close-to-optimal result in a wide range of test cases. Finally we describe the in-automaton shedding mechanism for XQuery stream engines. The experiments show that our proposed utility-driven shedding solutions consistently achieve higher utility results compared to the existing relational shedding techniques.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.