Abstract

Recording the frequency of items in highly skewed data streams is a fundamental and hot problem in recent years. The literature demonstrates that sketch is the most promising solution. The typical metrics to measure a sketch are accuracy and speed, but existing sketches make only trade-offs between the two dimensions. Our proposed solution is a new sketch framework called Stingy sketch with two key techniques: Bit-pinching Counter Tree ( BCTree ) and Prophet Queue ( PQueue ) which optimizes both the accuracy and speed. The key idea of BCTree is to split a large fixed-size counter into many small nodes of a tree structure, and to use a precise encoding to perform carry-in operations with low processing overhead. The key idea of PQueue is to use pipelined prefetch technique to make most memory accesses happen in L2 cache without losing precision. Importantly, the two techniques are cooperative so that Stingy sketch can improve accuracy and speed simultaneously. Extensive experimental results show that Stingy sketch is up to 50% more accurate than the SOTA of accuracy-oriented sketches and is up to 33% faster than the SOTA of speed-oriented sketches.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call