Importance partitioning in micro-aggregation

G Kokolakis,D Fouskakis

doi:10.1016/j.csda.2008.09.028

Abstract

One of the techniques of data holders for the protection of confidentiality of continuous data is that of micro-aggregation. Rather than releasing raw data (individual records), micro-aggregation releases the averages of small groups and thus reduces the risk of identity disclosure. At the same time the method implies loss of information and often distorts the data. Thus, the choice of groups is very crucial to minimize the information loss and the data distortion. No exact polynomial algorithms exist up to date for optimal micro-aggregation, and so the usage of heuristic methods is necessary. A heuristic algorithm, based on the notion of importance partitioning, is proposed and it is shown that compared with other micro-aggregation heuristics achieves improved performance.

Full Text