Abstract

One of the most important uses of aggregate queries over data streams is sampling. Typically, aggregation is performed over sliding windows, and a query returns a new result whenever the window contents change; such a query is referred to as a continuous query. Existing data models and query languages for streams cannot express many practical user-defined sampling queries over streams. To this end, we propose a new data stream model, referred to as the sequence model, and a query language for specifying aggregate queries over data streams. We show that the sequence model can readily express a superset of the aggregate queries expressible in the previously proposed time-based data stream model, thus providing a declarative and formal semantics for understanding and reasoning about continuous aggregate queries. Defined on top of the sequence model, our query language supports existing sliding window operators and a novel frequency operator. The frequency operator makes it possible to express useful sampling queries, such as queries with user-defined group-based sampling and nested aggregation over either the input stream or the result stream. Such capabilities are beyond those of previously proposed query languages over streams. Finally, we conduct a preliminary experimental study which shows that our language is effective and efficient in practice.
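
As an illustration of the continuous-query idea described above, the following is a minimal sketch of a sliding-window aggregate that emits a new result each time the window contents change. It assumes a count-based window and a running average; the function name `sliding_average` and its parameters are hypothetical and chosen for illustration only. It does not reproduce the sequence model, the proposed query language, or the frequency operator.

```python
from collections import deque
from typing import Iterable, Iterator


def sliding_average(stream: Iterable[float], window_size: int) -> Iterator[float]:
    """Yield the mean of the most recent `window_size` items after each arrival.

    Assumption: a count-based window (last N items), not a time-based one.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")

    window = deque()  # current window contents, oldest item on the left
    total = 0.0       # running sum of the items in the window

    for value in stream:
        window.append(value)
        total += value
        # Evict the oldest item once the window exceeds its capacity.
        if len(window) > window_size:
            total -= window.popleft()
        # The window contents changed, so the continuous query emits a result.
        yield total / len(window)


if __name__ == "__main__":
    for result in sliding_average([1, 2, 3, 4], window_size=2):
        print(result)
```

Each arriving item changes the window and therefore produces one output, which is the behaviour the abstract attributes to continuous queries. Sampling the results instead, for example reporting only some of them, would need additional logic that this sketch does not attempt.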
