Abstract

With the rapid advances in hardware technology, data streams are being generated daily in large volumes, enabling a wide range of real-time analytical tasks. Yet data streams from many sources are inherently sensitive, and thus providing continuous privacy protection in data streams has been a growing demand. In this paper, we consider the problem of private analysis of infinite data streams under differential privacy. We propose a novel data stream sanitization framework that periodically releases histograms summarizing the event distributions over sliding windows to support diverse data analysis tasks. Our framework consists of two modules, a sampling-based change monitoring module and a continuous histogram publication module. The monitoring module features an adaptive Bernoulli sampling process to accurately track the evolution of a data stream. We for the first time conduct error analysis of sampling under differential privacy, which allows to select the best sampling rate. The publication module features three different publishing strategies, including a novel technique called retroactive grouping to enjoy reduced noise. We provide theoretical analysis of the utility, privacy and complexity of our framework. Extensive experiments over real datasets demonstrate that our solution substantially outperforms the state-of-the-art competitors.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call