Abstract

Randomized response (RR) methods have long been suggested for protecting respondents’ privacy in statistical surveys. However, how to set and achieve privacy protection goals has received little attention. We give a full development and analysis of the view that a privacy mechanism should ensure that no intruder would gain much new information about any respondent from his response. Formally, we say that a privacy breach occurs when an intruder’s prior and posterior probabilities about a property of a respondent, denoted $p$ and $p_{*}$, respectively, satisfy $p_{*} < h_{l}(p)$ or $p_{*} > h_{u}(p)$, where $h_{l}$ and $h_{u}$ are two given functions. An RR procedure protects privacy if it does not permit any privacy breach. We explore the effects of $(h_{l},h_{u})$ on the resultant privacy demand, and prove that it is precisely attainable only for certain $(h_{l},h_{u})$. This result is used to define a canonical strict privacy protection criterion and to give practical guidance on the choice of $(h_{l},h_{u})$. Then, we characterize all privacy-satisfying RR procedures, compare their effects on data utility using sufficiency of experiments, and identify the class of all admissible procedures. Finally, we establish an optimality property of a commonly used RR method.

Highlights

  • In recent years, businesses, organizations and government agencies have been gathering increasingly vast amounts of data from surveys, commercial transactions, on-line searches and postings, medical records and other sources, and heavily using data analytics in making business and policy decisions

  • We obtain a complete characterization of all randomized response (RR) procedures that satisfy any specified privacy criterion

  • We develop a canonical form of the general criterion and characterize all RR procedures that provide required privacy


Summary

Introduction

Businesses, organizations, and government agencies have been gathering increasingly vast amounts of data from surveys, commercial transactions, online searches and postings, medical records, and other sources, and they rely heavily on data analytics in making business and policy decisions. Nayak et al. (2015) proposed a similar criterion, called β-factor privacy. (a) An RR procedure is said to permit an upward ρ1-to-ρ2 privacy breach with respect to Q ⊆ SX and a prior distribution α if, for some 1 ≤ i ≤ m, the prior probability of Q is at most ρ1 while the posterior probability of Q given the response zi is at least ρ2, where 0 < ρ1 < ρ2 < 1 (Nayak et al., 2015). For a given β > 1, an RR procedure admits a β-factor privacy breach with respect to Q ⊆ SX and a prior α if, for some response, the ratio of the posterior probability of Q to its prior probability exceeds β or falls below 1/β. We should mention that Boreale and Paolini’s (2015) concept of “worst-case breach” is essentially the same as a β-factor breach; they proved a version of Theorem 1.2. We develop a canonical form of the general criterion and characterize all RR procedures that provide the required privacy.
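The β-factor criterion above can be illustrated numerically. The sketch below uses Warner's classical RR design (report the true answer with a fixed probability) as a concrete mechanism; the function names and the specific numeric values are illustrative assumptions, not part of the paper's formal development.

```python
# Sketch: checking the beta-factor privacy criterion under Warner's
# randomized response design, where a respondent reports the true
# answer with probability p_truth and the opposite answer otherwise.
# All names and numbers here are illustrative assumptions.

def warner_posterior(prior, p_truth, response_yes=True):
    """Intruder's posterior probability that the respondent has the
    sensitive trait, given a 'yes' or 'no' response (Bayes' rule)."""
    if response_yes:
        num = prior * p_truth
        den = prior * p_truth + (1 - prior) * (1 - p_truth)
    else:
        num = prior * (1 - p_truth)
        den = prior * (1 - p_truth) + (1 - prior) * p_truth
    return num / den

def beta_factor_breach(prior, posterior, beta):
    """A beta-factor breach occurs when the posterior-to-prior ratio
    leaves the interval [1/beta, beta], for a given beta > 1."""
    ratio = posterior / prior
    return ratio > beta or ratio < 1 / beta

prior = 0.30      # intruder's prior probability of the property Q
p_truth = 0.75    # Warner design parameter (truthful-report probability)
post = warner_posterior(prior, p_truth, response_yes=True)
print(round(post, 4))                           # → 0.5625
print(beta_factor_breach(prior, post, beta=2.0))  # → False (1.875 <= 2)
print(beta_factor_breach(prior, post, beta=1.5))  # → True  (1.875 > 1.5)
```

Note how the same response is a breach under β = 1.5 but not under β = 2.0: the criterion's stringency is controlled entirely by the choice of β, mirroring the paper's discussion of how the choice of $(h_l, h_u)$ shapes the privacy demand.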

A general criterion
Characterization of strict information privacy
Comparison of data utility
Admissibility
Optimality results
Discussion

