Whenever randomness is involved in query processing, confidence intervals are commonly returned to the user to indicate the statistical significance of the query answer. However, this problem has not been explicitly addressed under differential privacy, which by definition must use randomness. For classical mechanisms whose noise distribution does not depend on the input, such as the Laplace and Gaussian mechanisms, deriving confidence intervals is easy. But the problem becomes nontrivial for queries whose global sensitivity is large or unbounded, for which these classical mechanisms cannot be applied. There are three main techniques in the literature for dealing with such queries: the exponential mechanism, the sparse vector technique, and smooth sensitivity. In this paper, for each of the three techniques, we design mechanisms that produce confidence intervals that (1) are differentially private; (2) are correct, i.e., the interval contains the true query answer with the specified confidence level; and (3) have a utility guarantee matching that of the original mechanism, up to constant factors. We then show how to apply our techniques to a variety of problems, ranging from simple statistics (e.g., mean, median, maximum) to graph pattern counting and conjunctive queries.
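To illustrate why the input-independent case is considered easy, here is a minimal Python sketch for the Laplace mechanism. It is not taken from the paper: the function name `laplace_mechanism_with_ci` and the parameters `sensitivity`, `epsilon`, and `beta` are assumptions made for this example. It relies only on the standard tail bound for Laplace noise with scale b, namely P(|noise| > t) = exp(-t/b), so a half-width of b·ln(1/beta) gives a (1 - beta) confidence interval. Because the half-width depends only on public parameters, publishing it alongside the noisy answer should incur no additional privacy cost.

```python
import math
import random


def laplace_mechanism_with_ci(true_answer, sensitivity, epsilon, beta, rng=random):
    """Hypothetical helper: noisy answer plus a (1 - beta) confidence interval.

    Assumes the query has bounded global sensitivity `sensitivity`,
    which is exactly the case the abstract calls easy.
    """
    scale = sensitivity / epsilon  # Laplace scale b
    # The difference of two i.i.d. exponentials with mean b is Laplace(0, b).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    noisy = true_answer + noise
    # P(|Lap(b)| > t) = exp(-t / b); setting this equal to beta gives t.
    half_width = scale * math.log(1.0 / beta)
    return noisy, (noisy - half_width, noisy + half_width)


if __name__ == "__main__":
    answer, (low, high) = laplace_mechanism_with_ci(
        true_answer=100.0, sensitivity=1.0, epsilon=0.5, beta=0.05
    )
    print(f"noisy answer: {answer:.2f}, 95% CI: [{low:.2f}, {high:.2f}]")
```

The interval width here is fixed in advance by `sensitivity`, `epsilon`, and `beta`. For queries with large or unbounded global sensitivity this approach does not apply, which is the setting the paper targets.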