Abstract

Interest in algorithms for processing and querying continuous data streams has grown in recent years. This paper introduces the problem of sampling from landmark windows over data streams and presents a weighted stratified multistage sampling (WSMS) algorithm for it. The algorithm extends the classic reservoir-sampling algorithm and the weighted sampling algorithm with a reservoir by using the basic window technique, and it works well even when the number of data items in the landmark window varies dramatically over time. Theoretical analysis and experiments show that the algorithm is effective and efficient for processing continuous data streams.
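
For orientation, the classic reservoir-sampling algorithm that the abstract names as a starting point can be sketched as follows. This is a minimal illustration of the well-known uniform reservoir technique only, not the WSMS algorithm itself, whose weighting, stratification, and basic-window details are not given in the abstract. The function name `reservoir_sample` and its parameters are hypothetical choices made for this sketch.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of up to k items from a stream of unknown length.

    Classic reservoir sampling (often called Algorithm R). This is NOT the
    paper's WSMS algorithm; it only shows the baseline that WSMS is said to extend.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Sample 5 items from a simulated stream of 10,000 integers.
    print(reservoir_sample(range(10_000), 5, random.Random(42)))
```

The sketch needs only one pass and O(k) memory, which is why the reservoir approach suits streams whose length is not known in advance, such as a landmark window that keeps growing.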
