Abstract

ABSTRACTWorking under a model of privacy in which data remain private even from the statistician, we study the tradeoff between privacy guarantees and the risk of the resulting statistical estimators. We develop private versions of classical information-theoretical bounds, in particular those due to Le Cam, Fano, and Assouad. These inequalities allow for a precise characterization of statistical rates under local privacy constraints and the development of provably (minimax) optimal estimation procedures. We provide a treatment of several canonical families of problems: mean estimation and median estimation, generalized linear models, and nonparametric density estimation. For all of these families, we provide lower and upper bounds that match up to constant factors, and exhibit new (optimal) privacy-preserving mechanisms and computationally efficient estimators that achieve the bounds. Additionally, we present a variety of experimental results for estimation problems involving sensitive data, including salaries, censored blog posts and articles, and drug abuse; these experiments demonstrate the importance of deriving optimal procedures. Supplementary materials for this article are available online.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call