Abstract

Sampling is the most versatile approximation technique available and remains one of the most powerful methods for building a one-pass synopsis of a data set in a streaming environment. This review presents a taxonomy of sampling algorithms and discusses and compares representative algorithms. Because uniform sampling has limitations in some applications, we discuss in detail the importance of biased sampling methods in those scenarios. We then survey the application and development of sampling techniques, especially traditional sampling techniques, in the data stream model. Finally, we discuss the research challenges and future directions of the sampling problem in the context of data streams.
