Abstract

Randomized response (RR) methods have long been suggested for protecting respondents’ privacy in statistical surveys. However, how to set and achieve privacy protection goals has received little attention. We give a full development and analysis of the view that a privacy mechanism should ensure that no intruder would gain much new information about any respondent from his response. Formally, we say that a privacy breach occurs when an intruder’s prior and posterior probabilities about a property of a respondent, denoted $p$ and $p_{*}$, respectively, satisfy $p_{*} < h_{l}(p)$ or $p_{*} > h_{u}(p)$, where $h_{l}$ and $h_{u}$ are two given functions. An RR procedure protects privacy if it does not permit any privacy breach. We explore the effects of $(h_{l},h_{u})$ on the resultant privacy demand, and prove that it is precisely attainable only for certain $(h_{l},h_{u})$. This result is used to define a canonical strict privacy protection criterion and to give practical guidance on the choice of $(h_{l},h_{u})$. Then, we characterize all privacy-satisfying RR procedures, compare their effects on data utility using sufficiency of experiments, and identify the class of all admissible procedures. Finally, we establish an optimality property of a commonly used RR method.

Highlights

  • In recent years, businesses, organizations and government agencies have been gathering increasingly vast amounts of data from surveys, commercial transactions, on-line searches and postings, medical records and other sources, and heavily using data analytics in making business and policy decisions

  • We obtain a complete characterization of all randomized response (RR) procedures that satisfy any specified privacy criterion

  • We develop a canonical form of the general criterion and characterize all RR procedures that provide required privacy


Summary

Introduction

Businesses, organizations, and government agencies have been gathering increasingly vast amounts of data from surveys, commercial transactions, online searches and postings, medical records, and other sources, and they rely heavily on data analytics in making business and policy decisions. Nayak et al. (2015) proposed a similar criterion, called β-factor privacy. (a) An RR procedure is said to permit an upward ρ1-to-ρ2 privacy breach with respect to Q ⊆ SX and a prior distribution α if, for some 1 ≤ i ≤ m, the prior probability of Q is at most ρ1 while the posterior probability of Q given the response zi is at least ρ2, where 0 < ρ1 < ρ2 < 1 (Nayak et al., 2015). For a given β > 1, an RR procedure admits a β-factor privacy breach with respect to Q ⊆ SX and a prior α if, for some response, the ratio of the posterior probability of Q to its prior probability exceeds β or falls below 1/β. We should mention that Boreale and Paolini’s (2015) concept of “worst-case breach” is essentially the same as a β-factor breach; they proved a version of Theorem 1.2. We develop a canonical form of the general criterion and characterize all RR procedures that provide the required privacy.
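The β-factor criterion above can be illustrated numerically. The sketch below uses Warner's classical RR design (report the true answer with a fixed probability) as a concrete mechanism; the function names and the specific numeric values are illustrative assumptions, not part of the paper's formal development.

```python
# Sketch: checking the beta-factor privacy criterion under Warner's
# randomized response design, where a respondent reports the true
# answer with probability p_truth and the opposite answer otherwise.
# All names and numbers here are illustrative assumptions.

def warner_posterior(prior, p_truth, response_yes=True):
    """Intruder's posterior probability that the respondent has the
    sensitive trait, given a 'yes' or 'no' response (Bayes' rule)."""
    if response_yes:
        num = prior * p_truth
        den = prior * p_truth + (1 - prior) * (1 - p_truth)
    else:
        num = prior * (1 - p_truth)
        den = prior * (1 - p_truth) + (1 - prior) * p_truth
    return num / den

def beta_factor_breach(prior, posterior, beta):
    """A beta-factor breach occurs when the posterior-to-prior ratio
    leaves the interval [1/beta, beta], for a given beta > 1."""
    ratio = posterior / prior
    return ratio > beta or ratio < 1 / beta

prior = 0.30      # intruder's prior probability of the property Q
p_truth = 0.75    # Warner design parameter (truthful-report probability)
post = warner_posterior(prior, p_truth, response_yes=True)
print(round(post, 4))                           # → 0.5625
print(beta_factor_breach(prior, post, beta=2.0))  # → False (1.875 <= 2)
print(beta_factor_breach(prior, post, beta=1.5))  # → True  (1.875 > 1.5)
```

Note how the same response is a breach under β = 1.5 but not under β = 2.0: the criterion's stringency is controlled entirely by the choice of β, mirroring the paper's discussion of how the choice of $(h_l, h_u)$ shapes the privacy demand.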

A general criterion
Characterization of strict information privacy
Comparison of data utility
Admissibility
Optimality results
Discussion

