Abstract

A sketch is a data structure that records the frequencies of items in a multiset. Sketches are widely used in processing data streams, graph data, and distributed datasets. They use little memory and run at high speed, at the cost of a slight loss of accuracy. In practice, item frequencies in many datasets are non-uniformly distributed. Unfortunately, existing sketches perform poorly on such datasets. To address this issue, we propose a new sketch framework, the ABC framework, which can be applied to most existing sketches and significantly improves their accuracy on non-uniform datasets. The key idea is that when a counter overflows, it uses space from its adjacent counters through bit-borrowing and combination operations. Extensive experimental results show that our ABC framework improves accuracy by 4.10 times and 4.49 times on average, respectively. A demo and all related source code are available on our homepage [1].
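
The overflow handling described above can be illustrated with a toy model of two adjacent counters that share a fixed bit budget. This is only a minimal sketch under stated assumptions, not the encoding used by the ABC framework: the class name `CounterPair`, the one-bit-at-a-time borrowing rule, and the choice to store the sum of both values after combination are all hypothetical, and the flag bits a real layout would need are ignored.

```python
class CounterPair:
    """Toy model: two adjacent counters sharing 2 * bits bits (hypothetical layout)."""

    def __init__(self, bits=4):
        self.width = [bits, bits]      # bits currently owned by each counter
        self.value = [0, 0]            # current count held by each counter
        self.combined = False          # True once the pair acts as one counter
        self.total = 0                 # value of the combined counter
        self.capacity = (1 << (2 * bits)) - 1  # largest value the pair can hold

    def increment(self, side):
        """Add 1 to counter `side` (0 = left, 1 = right)."""
        if self.combined:
            # Combined counter saturates at the capacity of the whole pair.
            self.total = min(self.total + 1, self.capacity)
            return
        other = 1 - side
        if self.value[side] + 1 < (1 << self.width[side]):
            self.value[side] += 1      # still fits in the bits it owns
            return
        # Overflow. Assumption: borrow one bit if the neighbour's top bit is unused.
        spare = self.width[other] > 1 and \
            self.value[other] < (1 << (self.width[other] - 1))
        if spare:
            self.width[other] -= 1
            self.width[side] += 1
            self.value[side] += 1
            return
        # Assumption: otherwise combine both counters and keep the sum,
        # so that neither item is ever underestimated.
        self.combined = True
        self.total = min(self.value[0] + self.value[1] + 1, self.capacity)

    def query(self, side):
        """Return the estimate for counter `side`."""
        return self.total if self.combined else self.value[side]


pair = CounterPair(bits=4)
for _ in range(20):
    pair.increment(0)
print(pair.query(0), pair.query(1))  # prints "20 0"
```

In this model, a plain 4-bit counter would saturate at 15, whereas the left counter reaches 20 by taking one unused bit from its neighbour. If the right counter later overflows its remaining 3 bits, the pair falls back to a single 8-bit counter, which trades per-item precision for range.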
