Abstract

Statistical disclosure control (SDC), known two decades ago as inference control, is an integral part of data security that deals with the protection of statistical databases. The basic problem in SDC is to release data in a way that does not disclose individual information (high security) while preserving as much of the informational content as possible (low information loss). SDC is dual to data mining in that progress in data mining techniques forces official statistics to continually improve its SDC techniques: the more powerful the inferences that can be made on a released data set, the more protection is needed so that no inference jeopardizes the privacy of individual respondents’ numerical data. This paper deals with the computational complexity of optimal microaggregation, where optimal means yielding minimal information loss for a fixed security level. More specifically, we show that the problem of optimal microaggregation cannot be exactly solved in polynomial time. This result is relevant because it provides theoretical justification for the lack of exact optimal algorithms and for the current use of heuristic approaches.
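
To make the notions of "information loss" and "security level" concrete, the following is a minimal sketch of a heuristic microaggregation on a single numerical attribute. It rests on assumptions not stated in the abstract: the security level is taken to be a minimum group size k, information loss is measured as the within-group sum of squared errors (SSE), and the function name microaggregate_fixed_size is hypothetical. It is a simple sort-and-cut heuristic of the kind the abstract alludes to, not an optimal algorithm and not the authors' method.

```python
from typing import List, Sequence, Tuple


def microaggregate_fixed_size(values: Sequence[float], k: int) -> Tuple[List[float], float]:
    """Heuristic microaggregation of one numerical attribute (illustrative only).

    Assumptions (not taken from the paper): groups must hold at least k records,
    and information loss is the within-group sum of squared errors (SSE).
    Returns the masked values in the original record order and the SSE.
    """
    n = len(values)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k records")

    # Sort record indices by value so that similar records end up adjacent.
    order = sorted(range(n), key=lambda i: values[i])

    # Cut the sorted sequence into consecutive groups of k records.
    groups = [order[start:start + k] for start in range(0, n, k)]

    # A short trailing group would violate the minimum size, so merge it
    # into the preceding group.
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)

    masked = [0.0] * n
    sse = 0.0
    for group in groups:
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            masked[i] = mean                 # release the group mean instead of the value
            sse += (values[i] - mean) ** 2   # accumulate information loss
    return masked, sse


if __name__ == "__main__":
    masked, sse = microaggregate_fixed_size([1, 2, 3, 10, 11, 12, 13], k=3)
    print(masked, sse)
```

Under these assumptions, an optimal method would search over all admissible partitions for the one with the smallest SSE; the sketch instead fixes the partition by sorting, which is fast but gives no optimality guarantee.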
