A parallel space saving algorithm for frequent items and the Hurwitz zeta distribution

Massimo Cafaro, Marco Pulimeno, Piergiulio Tempesta

doi:10.1016/j.ins.2015.09.003

Open Access

Journal: Information Sciences
Publication Date: Sep 10, 2015
Citations: 94
License type: other-oa

Affiliation: University of Salento, Institute of Mathematical Sciences

#Space Saving Algorithm #Frequent Items

Abstract

We present a message-passing parallel version of the Space Saving algorithm designed to solve the k-majority problem. The algorithm determines frequent items in parallel, i.e., those whose frequency exceeds a given threshold, and is therefore useful for iceberg queries and in many other contexts. We apply the algorithm to the detection of frequent items in real and synthetic datasets whose probability distribution functions are a Hurwitz and a Zipf distribution, respectively. We also compare its parallel performance and accuracy against a parallel algorithm recently proposed for merging summaries derived by the Space Saving or Frequent algorithms.
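
As an illustration of the ideas named in the abstract, the following is a minimal Python sketch of the sequential Space Saving update, followed by one plausible way to combine two per-block summaries. It is not taken from the paper: the function names (`space_saving`, `merge`, `candidates`), the block split, and the merge rule (adding the other summary's minimum counter for items it does not monitor) are assumptions made for illustration, and the paper's actual message-passing reduction may differ. The threshold used is n/k, where n is the stream length and k the number of counters.

```python
def space_saving(stream, k):
    """Sequential Space Saving: keep at most k counters that overestimate frequencies."""
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k:
            counters[x] = 1
        else:
            # Evict an item with the minimum counter; the newcomer inherits min + 1.
            victim = min(counters, key=counters.get)
            counters[x] = counters.pop(victim) + 1
    return counters


def merge(s1, s2, k):
    """Combine two summaries built with the same k (illustrative assumption, see lead-in)."""
    # If a summary is not full, every item of its block is counted exactly,
    # so an absent item has frequency 0 there; otherwise it is at most the minimum.
    m1 = min(s1.values()) if len(s1) == k else 0
    m2 = min(s2.values()) if len(s2) == k else 0
    merged = {x: s1.get(x, m1) + s2.get(x, m2) for x in set(s1) | set(s2)}
    # Keep only the k largest counters.
    top = sorted(merged.items(), key=lambda kv: kv[1], reverse=True)[:k]
    return dict(top)


def candidates(summary, n, k):
    """Items whose estimated count exceeds the threshold n/k."""
    return {x: c for x, c in summary.items() if c * k > n}


# Toy stream: 'a' occurs 6 times out of n = 12, so it exceeds n/k = 4 for k = 3.
stream = list("abacadaeabab")
k = 3
seq = space_saving(stream, k)

# Simulate two workers, each summarising half of the stream, then merge.
half = len(stream) // 2
merged = merge(space_saving(stream[:half], k), space_saving(stream[half:], k), k)

print(candidates(seq, len(stream), k))
print(candidates(merged, len(stream), k))
```

Every counter in a Space Saving summary is an upper bound on the item's true frequency, so the reported candidates can include false positives but, for the sequential summary, never miss an item whose frequency exceeds n/k. In a real message-passing setting the `merge` step would be applied pairwise across processes, for example as a user-defined reduction operator.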
