Abstract

Organizations releasing data for use in statistical studies have an ethical obligation to protect the confidentiality of individual respondents. A seemingly attractive way of ensuring confidentiality is to add zero-mean random numbers to the attributes of the released records. If this method is to be effective, however, it is necessary to measure how well it ensures confidentiality. This paper proposes measures of confidentiality and data integrity for multivariate data to which noise has been added. These measures are evaluated for the case in which the vectors of added noise have independent components. It is demonstrated that the amount of protection provided can be relatively low for data of high dimension. The optimal covariance matrix for the added noise vectors is derived, where optimality refers to maximizing confidentiality while preserving data integrity, or vice versa. It is shown that the covariance structure of the added noise vectors should be the same as the covariance structure of the original population. This type of added noise turns out to have been proposed by Kim (1986) for the purpose of eliminating bias in parameter estimates.
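
As a rough illustration of noise whose covariance structure matches that of the population, the following minimal sketch adds zero-mean noise with covariance proportional to the sample covariance of the data. The function name `add_correlated_noise`, the `scale` parameter, the Gaussian noise distribution, and the synthetic two-attribute population are all assumptions made for this example; none of them is taken from the paper.

```python
import numpy as np


def add_correlated_noise(data, scale, rng):
    """Return data plus zero-mean noise with covariance scale * cov(data).

    Hypothetical helper: Gaussian noise is an assumption of this sketch,
    and the sample covariance stands in for the population covariance.
    """
    cov = np.cov(data, rowvar=False)  # attributes are columns
    noise = rng.multivariate_normal(
        mean=np.zeros(data.shape[1]),
        cov=scale * cov,
        size=data.shape[0],
    )
    return data + noise


# Synthetic stand-in for a confidential data set with two correlated attributes.
rng = np.random.default_rng(0)
population_cov = np.array([[1.0, 0.8], [0.8, 2.0]])
original = rng.multivariate_normal([0.0, 0.0], population_cov, size=5000)

released = add_correlated_noise(original, scale=0.25, rng=rng)

# The released covariance is roughly (1 + scale) times the original one,
# so correlations between attributes are approximately preserved.
print(np.cov(original, rowvar=False))
print(np.cov(released, rowvar=False))
```

Under these assumptions every variance and covariance is inflated by the same factor, so an analyst who knows `scale` could rescale the released covariance to estimate the original one.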

