Abstract

Microaggregation is a commonly used technique for statistical disclosure control of microdata. It divides the microdata into groups such that each group contains no fewer than k records, where k is a user-specified parameter; it then replaces each record with the centroid of its group. The problem underlying microaggregation is called the k-Partitions problem. It is a constrained optimization problem: the objective is to minimize the information loss incurred by replacing the raw records with their group centroids, subject to the constraint that every group contains at least k records. In the literature, many clustering algorithms have been modified to solve the k-Partitions problem. For example, the k-Ward algorithm is derived from Ward's hierarchical clustering algorithm. In this paper, we propose a general form of the k-Ward algorithm and compare its performance with that of the original k-Ward algorithm.
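
To make the objective and constraint concrete, the following minimal Python sketch applies a given partition to a small data set: it checks that every group has at least k records, replaces each record with its group centroid, and reports the resulting information loss. It is an illustration only, not the k-Ward algorithm or its generalization. The function names (centroid, microaggregate) are hypothetical, and measuring information loss as the within-group sum of squared errors (SSE) is an assumption, since the abstract does not name a specific loss measure.

```python
from typing import List, Sequence, Tuple


def centroid(group: Sequence[Sequence[float]]) -> List[float]:
    """Return the attribute-wise mean of the records in one group."""
    n = len(group)
    dim = len(group[0])
    return [sum(record[j] for record in group) / n for j in range(dim)]


def microaggregate(
    records: Sequence[Sequence[float]],
    groups: Sequence[Sequence[int]],
    k: int,
) -> Tuple[List[List[float]], float]:
    """Replace each record with its group centroid (hypothetical helper).

    `groups` is a list of index lists that must partition the records.
    Returns the masked records and the information loss, measured here
    (by assumption) as the within-group sum of squared errors.
    """
    # The groups must cover every record index exactly once.
    covered = sorted(i for group in groups for i in group)
    if covered != list(range(len(records))):
        raise ValueError("groups must partition the record indices")

    masked: List[List[float]] = [[] for _ in records]
    sse = 0.0
    for group in groups:
        # Constraint of the k-Partitions problem: at least k records per group.
        if len(group) < k:
            raise ValueError(f"group {list(group)} has fewer than k={k} records")
        c = centroid([records[i] for i in group])
        for i in group:
            masked[i] = list(c)
            sse += sum((x - cj) ** 2 for x, cj in zip(records[i], c))
    return masked, sse


# Usage: four one-dimensional records split into two groups with k = 2.
data = [[1.0], [3.0], [10.0], [14.0]]
masked, loss = microaggregate(data, groups=[[0, 1], [2, 3]], k=2)
print(masked)  # [[2.0], [2.0], [12.0], [12.0]]
print(loss)    # 10.0
```

A solver for the k-Partitions problem would search over candidate partitions for one that keeps this loss low; the sketch only evaluates a partition that is supplied to it.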
