Abstract

Anonymizing microdata with multiple sensitive attributes for publication is a growing research field. Several anonymization models exist in this field, such as k-anonymity and l-diversity, and generalization and suppression have become common anonymization techniques. However, the real problem with multiple sensitive attributes is the distribution of sensitive values. If sensitive values are not distributed evenly across the quasi identifier groups, the sensitive values of individual record holders can potentially be revealed. This research investigated how to distribute high-sensitive values evenly across the groups. We proposed a novel algorithm that distributes high-sensitive values while the groups are being formed. The method distributes high-sensitive values evenly and varies the high-sensitive values within each group. We call our method extended systematic clustering because it extends the systematic clustering method. A diversity metric was used to evaluate the method. The experimental results showed that our method outperformed systematic clustering, with an average diversity value of 0.9719 compared with 0.3316 for systematic clustering.

Highlights

  • Privacy is an important issue in publishing microdata tables, because microdata contain information about individuals and their identities

  • The contributions of this research are: (1) we proposed a novel algorithm for distributing high-sensitive values to each quasi identifier group, (2) we successfully implemented our method on multiple sensitive attributes, and (3) we categorized sensitive values and organized them into a sensitive attribute categorization

  • The three attributes chosen as sensitive attributes are education, workclass, and occupation


Summary

Introduction

Privacy is an important issue in publishing microdata tables, because microdata contain information about individuals and their identities. An individual's data comprise three types of attributes: explicit identifiers (EI), quasi identifiers (QI), and sensitive attributes (SA) [1, 2]. An EI is an attribute that contains an identifier such as a name, employee number, or student identifier. In privacy preserving data publishing (PPDP), QI attributes are generalized or suppressed to obtain an anonymized table. Records whose QI attributes cannot be distinguished from one another form a quasi identifier group. A table in which every such group contains at least k records is called a k-anonymity table [3,4,5,6,7].
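The k-anonymity condition described above can be checked with a few lines of Python. This is only a minimal sketch of the definition, not the method proposed in the paper: the function names `qi_group_sizes` and `is_k_anonymous`, the attribute names `age` and `zip`, and the sample records are all hypothetical and chosen for illustration.

```python
from collections import Counter


def qi_group_sizes(records, qi_attributes):
    """Count how many records share each combination of QI values."""
    return Counter(tuple(r[a] for a in qi_attributes) for r in records)


def is_k_anonymous(records, qi_attributes, k):
    """Return True if every quasi identifier group has at least k records."""
    sizes = qi_group_sizes(records, qi_attributes)
    return all(size >= k for size in sizes.values())


# Hypothetical, already generalized records (illustrative values only).
records = [
    {"age": "30-39", "zip": "476**", "occupation": "Sales"},
    {"age": "30-39", "zip": "476**", "occupation": "Tech-support"},
    {"age": "40-49", "zip": "479**", "occupation": "Sales"},
    {"age": "40-49", "zip": "479**", "occupation": "Exec-managerial"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_k_anonymous(records, ["age", "zip"], k=3))
```

With these sample records, the two QI groups each contain two records, so the table satisfies the condition for k = 2 but not for k = 3. The check says nothing about how sensitive values are spread within the groups, which is the issue the paper addresses.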
