Abstract
Microaggregation is a masking mechanism for protecting confidential data in a public release. The technique produces a k-anonymous dataset by partitioning the data records into groups of at least k members. For each group, a representative centroid is computed by aggregating the group members and is published in place of the original records. In a conventional microaggregation algorithm, each centroid is computed as the simple arithmetic mean of the group members. This naive formulation does not account for the proximity of the published values to the original ones, so an intruder may be able to guess the original values. This paper proposes a disclosure-aware aggregation model in which published values are computed at a given distance from the original ones, yielding a published dataset that is both better protected and more useful. Empirical results show that the proposed method achieves a better trade-off between disclosure risk and information loss than comparable anonymization techniques.
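To illustrate the conventional baseline the abstract criticizes (not the paper's disclosure-aware model), the following is a minimal sketch of mean-centroid microaggregation on one-dimensional data. The function name, the sort-then-partition grouping heuristic, and the sample data are illustrative assumptions, not the paper's algorithm.

```python
# Minimal sketch of conventional (mean-centroid) microaggregation on
# 1-D numeric records. This is NOT the paper's disclosure-aware model;
# the grouping heuristic (sort, then split into consecutive runs of k)
# is a simplifying assumption for illustration.

def microaggregate(values, k):
    """Partition records into groups of at least k similar members and
    publish each group's arithmetic mean instead of the originals."""
    n = len(values)
    if k < 1 or n < k:
        raise ValueError("need at least k records")
    # Group similar records by sorting; real methods use multivariate
    # clustering (e.g. MDAV), but the k-anonymity guarantee is the same.
    order = sorted(range(n), key=lambda i: values[i])
    published = [0.0] * n
    start = 0
    while start < n:
        # The final group absorbs any leftover records so that every
        # group retains at least k members.
        end = start + k if n - (start + k) >= k else n
        group = order[start:end]
        centroid = sum(values[i] for i in group) / len(group)
        for i in group:
            published[i] = centroid
        start = end
    return published


# Usage: six records, k = 3 -> two groups, each replaced by its mean.
print(microaggregate([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], 3))
```

Because every published value is shared by at least k records, any single record is indistinguishable from at least k-1 others; the paper's point is that the mean centroid can still sit close enough to an original value to leak it.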