Privacy Amplification by Sampling under User-level Differential Privacy

Juanru Fang,Ke Yi

doi:10.1145/3639289

Abstract

Random sampling is an effective tool for reducing the computational costs of query processing in large databases. It has also been used frequently for private data analysis, in particular, under differential privacy (DP). An interesting phenomenon that the literature has identified, is that sampling can amplify the privacy guarantee of a mechanism, which in turn leads to reduced noise scales that have to be injected. All existing privacy amplification results only hold in the standard, record-level DP model. Recently, user-level differential privacy (user-DP) has gained a lot of attention as it protects all data records contributed by any particular user, thus offering stronger privacy protection. Sampling-based mechanisms under user-DP have not been explored so far, except naively running the mechanism on a sample without privacy amplification, which results in large DP noises. In fact, sampling is in even more demand under user-DP, since all state-of-the-art user-DP mechanisms have high computational costs due to the complex relationships between users and records. In this paper, we take the first step towards the study of privacy amplification by sampling under user-DP, and give the amplification results for two common user-DP sampling strategies: simple sampling and sample-and-explore. The experimental results show that these sampling-based mechanisms can be a useful tool to obtain some quick and reasonably accurate estimates on large private datasets.

Full Text