Abstract

The distinct elements problem is one of the fundamental problems in streaming algorithms: given a stream of integers in the range $\{1, \ldots, n\}$, we wish to provide a $(1+\varepsilon)$ approximation to the number of distinct elements in the input. After a long line of research, an optimal solution for this problem with constant probability of success, using $O(1/\varepsilon^2 + \lg n)$ bits of space, was given by Kane, Nelson, and Woodruff in 2010. The standard approach used to achieve low failure probability $\delta$ is to take the median of $\lg \delta^{-1}$ parallel repetitions of the original algorithm. We show that such a multiplicative space blow-up is unnecessary: we provide an optimal algorithm using $O(\lg \delta^{-1}/\varepsilon^2 + \lg n)$ bits of space, matching known lower bounds for this problem. That is, the $\lg \delta^{-1}$ factor does not multiply the $\lg n$ term. This settles completely the space complexity of the distinct elements problem with respect to all standard parameters. We also consider the strong tracking (or continuous monitoring) variant of the distinct elements problem, where we want an algorithm that provides an approximation of the number of distinct elements seen so far, at all times of the stream. We show that this variant can be solved using $O\left(\frac{\lg \lg n + \lg \delta^{-1}}{\varepsilon^2} + \lg n\right)$ bits of space, which we show to be optimal.
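To make the baseline concrete, the following is a minimal sketch of the standard median-of-repetitions approach that the abstract describes as unnecessary. It is not the algorithm of this paper, nor that of Kane, Nelson, and Woodruff: as an assumption for illustration, each repetition uses a simple k-minimum-values (KMV) estimator, and the names `_hash01`, `kmv_estimate`, and `median_of_repetitions` are hypothetical. Each repetition keeps its own sketch, so the total space grows by a factor equal to the number of repetitions, which is exactly the multiplicative blow-up under discussion.

```python
import hashlib
import statistics


def _hash01(x: int, seed: int) -> float:
    """Map (seed, x) deterministically to a pseudo-random float in (0, 1]."""
    digest = hashlib.blake2b(f"{seed}:{x}".encode(), digest_size=8).digest()
    return (int.from_bytes(digest, "big") + 1) / 2**64


def kmv_estimate(stream, k: int, seed: int) -> float:
    """Estimate the distinct count from the k smallest hash values seen.

    Assumption: KMV stands in for a constant-success-probability estimator.
    """
    smallest = set()
    for x in stream:
        smallest.add(_hash01(x, seed))  # duplicates hash to the same value
        if len(smallest) > k:
            smallest.remove(max(smallest))  # keep only the k smallest
    if len(smallest) < k:
        # Fewer than k distinct hashes: the sketch holds all of them.
        return float(len(smallest))
    return (k - 1) / max(smallest)


def median_of_repetitions(items, k: int, reps: int) -> float:
    """Run `reps` independently seeded estimators and return their median.

    `items` is a list here for simplicity; in a true streaming setting the
    repetitions would run in parallel over a single pass.
    """
    return statistics.median(kmv_estimate(items, k, seed) for seed in range(reps))
```

For example, `median_of_repetitions([i % 1000 for i in range(5000)], k=256, reps=9)` stores nine separate sketches of 256 hash values each in order to estimate a stream with 1000 distinct elements.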
