A General Method for Estimating Correlated Aggregates over a Data Stream

Srikanta Tirthapura, David P Woodruff

doi:10.1109/icde.2012.62

Abstract

On a stream of two-dimensional data items $(x,y)$, where $x$ is an item identifier and $y$ is a numerical attribute, a correlated aggregate query requires us to first apply a selection predicate along the second ($y$) dimension, followed by an aggregation along the first ($x$) dimension. For selection predicates of the form $(y < c)$ or $(y > c)$, where the parameter $c$ is provided at query time, we present new streaming algorithms and lower bounds for estimating statistics of the resulting substream of elements that satisfy the predicate. We provide the first sublinear-space algorithms for a large family of statistics in this model, including frequency moments. We experimentally validate our algorithms, showing that their memory requirements are significantly smaller than existing linear storage schemes for large datasets, while simultaneously achieving fast per-record processing time. We also study the problem when the items have weights. Allowing negative weights makes it possible to analyze values that occur in the symmetric difference of two datasets. We give a strong space lower bound which holds even if the algorithm is allowed up to a logarithmic number of passes over the data (before the query is presented). We complement this with a small-space algorithm which uses a logarithmic number of passes.
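To make the query model concrete, the following is a minimal sketch of an exact, linear-storage baseline of the kind the abstract compares against. It is not the sublinear-space algorithm of the paper. The function name `correlated_moment`, the toy stream, and the choice of the predicate $y \le c$ are assumptions made for illustration only.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple


def correlated_moment(
    stream: Iterable[Tuple[Hashable, float]], c: float, k: int = 2
) -> int:
    """Exact k-th frequency moment of the substream {(x, y) : y <= c}.

    Hypothetical baseline: it stores one counter per distinct identifier
    that passes the predicate, so its memory grows linearly with the data.
    """
    # Step 1: selection along the y dimension (assumed predicate: y <= c).
    # Step 2: aggregation along the x dimension (frequency of each x).
    counts = Counter(x for x, y in stream if y <= c)
    # F_k = sum over identifiers of (frequency ** k).
    return sum(f ** k for f in counts.values())


# Toy stream of (identifier, attribute) pairs, invented for this example.
stream = [("a", 1), ("b", 5), ("a", 3), ("c", 9), ("a", 7), ("b", 2)]

print(correlated_moment(stream, c=5))        # second moment of items with y <= 5
print(correlated_moment(stream, c=5, k=0))   # number of distinct identifiers with y <= 5
```

Because $c$ is only known at query time, this baseline has to retain the whole stream (or an equivalent index) until the query arrives, which is the linear storage cost the streaming algorithms aim to avoid.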

Similar Papers

A General Method for Estimating Correlated Aggregates Over a Data Stream
Srikanta Tirthapura ... David P Woodruff
Algorithmica | VOL. 73
06 Aug 2014

An Attribute-Based Approach for Processing Multiple Continuous Queries over Data Streams
Hyun-Ho Lee ... Won-Suk Lee
The KIPS Transactions:PartD | VOL. 14D
31 Aug 2007

Current Summary of the Practical Using of Optical Correlators
Tomáš Harasthy ... Ján Turán
Acta Electrotechnica et Informatica | VOL. 12
01 Jan 2012

Adaptive Language Processing Unit for Malaysian Sign Language Synthesizer
Haris Al Qodri Maarif
IAES International Journal of Robotics and Automation (IJRA) | VOL. 10
01 Dec 2021
