Privacy and security issues under the classic randomized response (RR) model proposed by Warner and its extended $K$-RR model are studied. First, to provide a reference for the accuracy of private distribution estimation under the RR mechanism, lower bounds on the differential privacy parameter and on the number of participants are derived for an accuracy objective of $(\alpha, \delta)$-accuracy in the statistical sense. Second, when the prior distribution is nonuniform and the prior is taken into account, the data utility exhibits a ceiling effect in the high-privacy region. In this case, the average distortion, defined as the expected Hamming distance between the input and output data, is no longer a suitable measure of data utility. Motivated by this, the error probability is proposed as a measure of data utility for unifying different privacy metrics, where the error probability is defined as the expected Hamming distance between the input data and the data reconstructed by maximum a posteriori estimation. Third, under a unified privacy-preserving framework that uses the RR mechanism with the error probability criterion, the relationship among differential privacy, identifiability privacy, and mutual information privacy is established. Given a maximum allowable error probability $P_{E}^{\mathrm{max}}$, the optimal privacy parameters of these three privacy notions are derived with full consideration of the prior distribution. Then, a Bayes-based utility function, which corresponds to the converse of the Bayes risk, is constructed to measure the degree of privacy leakage. Given a maximum allowable correct probability $P_{C}^{\mathrm{max}}$, the accuracy objective of the statistical estimate is used to derive the range of the local differential privacy parameter from the perspective of security. Fourth, all of the above results are extended to the $K$-RR mechanism.
Finally, the correctness and effectiveness of the results are verified by simulation experiments. The results show that the error probability applies to any prior distribution, whereas the average distortion criterion is only the special case of a uniform prior. The error probability proposed in the paper is therefore a more reasonable common criterion for measuring data utility in the RR model and for unifying different privacy metrics.
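The difference between the two utility measures can be illustrated with a minimal sketch. It assumes the standard $K$-RR channel, in which the true value is reported with probability $e^{\epsilon}/(e^{\epsilon}+K-1)$ and each other value with probability $1/(e^{\epsilon}+K-1)$; the function names `krr_channel`, `average_distortion`, and `map_error_probability` are hypothetical and are not taken from the paper.

```python
import math


def krr_channel(k, epsilon):
    """Return P[x][y] = Pr(output = y | input = x) for the assumed K-RR channel."""
    denom = math.exp(epsilon) + k - 1
    keep = math.exp(epsilon) / denom  # probability of reporting the true value
    flip = 1.0 / denom                # probability of reporting one specific other value
    return [[keep if x == y else flip for y in range(k)] for x in range(k)]


def average_distortion(prior, channel):
    """Expected Hamming distance between the input and the perturbed output."""
    k = len(prior)
    return sum(
        prior[x] * channel[x][y]
        for x in range(k)
        for y in range(k)
        if x != y
    )


def map_error_probability(prior, channel):
    """Expected Hamming distance between the input and its MAP reconstruction.

    For each output y the MAP estimate is the x maximizing prior[x] * P[x][y],
    so the probability of a correct guess is the sum over y of that maximum.
    """
    k = len(prior)
    correct = sum(
        max(prior[x] * channel[x][y] for x in range(k)) for y in range(k)
    )
    return 1.0 - correct


if __name__ == "__main__":
    # Binary RR in a fairly high-privacy setting (illustrative value of epsilon).
    channel = krr_channel(2, math.log(1.5))
    for prior in ([0.5, 0.5], [0.9, 0.1]):
        print(prior, average_distortion(prior, channel),
              map_error_probability(prior, channel))
```

With $\epsilon = \ln 1.5$ the true value is kept with probability 0.6. Under a uniform prior the two measures coincide at 0.4. Under the prior $(0.9, 0.1)$ the average distortion is still 0.4, but the MAP error probability is capped at 0.1, because always guessing the more likely input is already that accurate. This is the ceiling effect described above.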