Abstract

As computation over sensitive data has become an important goal in recent years, privacy-preserving data analysis has attracted increasing attention. Among various mechanisms, differential privacy has been widely studied due to its formal privacy guarantees for data analysis. A central issue is the trade-off between the strength of the privacy guarantee and the accuracy of the analysis. Existing theory for this issue assumes that the analyst first chooses a privacy requirement and then attempts to maximize utility. However, as differential privacy is increasingly deployed in practice, a gap between theory and practice has emerged: in practice, product requirements often impose hard accuracy constraints, and privacy, while desirable, may not be the overriding concern. The privacy requirement is therefore usually adjusted to meet the utility expectation, not the other way around. This gap raises the question of how to provide the maximum privacy guarantee under a given accuracy requirement. In this paper, we focus on private Empirical Risk Minimization (ERM), one of the most commonly used data analysis methods. We take a first step toward solving this problem by theoretically exploring the effect of ϵ (the differential privacy parameter that determines the strength of the privacy guarantee) on the utility of the learned model. We trace how utility changes as ϵ varies and reveal a consistent relationship between ϵ and utility. We then formalize this relationship and propose a practical approach for estimating the utility under an arbitrary value of ϵ. Both theoretical analysis and experimental results demonstrate the high estimation accuracy and broad applicability of our approach in practical applications. As providing algorithms with strong utility guarantees that also preserve privacy when possible becomes more widely accepted, our approach has high practical value and is likely to be adopted by companies and organizations that wish to preserve privacy but are unwilling to compromise on utility.
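The ϵ–utility trade-off the abstract describes can be illustrated with a standard DP-ERM mechanism. The sketch below uses output perturbation for L2-regularized logistic regression (in the style of Chaudhuri et al.): noise with L2 norm drawn from a Gamma distribution scaled by 2/(nλϵ) is added to the non-private minimizer. This is only an illustration of how utility varies with ϵ, not the estimation approach proposed in the paper; the data, hyperparameters, and helper names are assumptions made for the example.

```python
# Illustrative sketch: output perturbation for L2-regularized logistic
# regression, showing how model utility changes with the privacy budget eps.
# NOT the paper's estimation method; all settings below are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def train_logreg(X, y, lam, iters=500, lr=0.5):
    """Plain (non-private) L2-regularized logistic regression via gradient descent."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        w -= lr * (X.T @ (p - y) / n + lam * w)
    return w

def output_perturbation(w, n, lam, eps):
    """Add high-dimensional Laplace noise whose L2 norm ~ Gamma(d, 2/(n*lam*eps)).

    For 1-Lipschitz losses on unit-norm rows, the minimizer's L2 sensitivity
    is 2/(n*lam), so this release satisfies eps-differential privacy.
    """
    d = w.shape[0]
    direction = rng.standard_normal(d)
    direction /= np.linalg.norm(direction)           # uniform direction on the sphere
    norm = rng.gamma(shape=d, scale=2.0 / (n * lam * eps))
    return w + norm * direction

# Synthetic data with rows scaled into the unit L2 ball (required for the
# sensitivity bound above).
n, d = 2000, 5
X = rng.standard_normal((n, d))
X /= np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1.0)
w_true = rng.standard_normal(d)
y = (X @ w_true + 0.1 * rng.standard_normal(n) > 0).astype(float)

lam = 0.01
w = train_logreg(X, y, lam)
for eps in [0.1, 0.5, 1.0, 5.0]:
    w_priv = output_perturbation(w, n, lam, eps)
    acc = np.mean((X @ w_priv > 0) == y)
    print(f"eps={eps:4.1f}  training accuracy={acc:.3f}")
```

Since the expected noise norm scales as 1/ϵ, running the loop typically shows accuracy approaching the non-private baseline as ϵ grows: precisely the kind of ϵ–utility relationship the paper formalizes and estimates.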
