Enhanced ℓ-Diversity Algorithm for Privacy Preserving Data Mining

R Praveena Priyadarsini, P Amudha, S Sivakumari

doi:10.1007/978-981-10-3274-5_2

Abstract

With the increasing use of e-technologies, large amounts of digital data are available online. These data are used by both internal and external parties for analysis and research. They contain sensitive and personal information about the entities on which they are collected. Because of this sensitivity, a privacy preservation procedure must be applied before the data are released to third parties. Privacy preservation should be applied in such a way that the utility of the data for data mining is not reduced. ℓ-Diversity is an anonymization algorithm that can be applied to a dataset with one sensitive attribute. Real-life data, however, contain numerous sensitive attributes that must be privacy preserved before the data are published for research. This paper proposes an Enhanced ℓ-diversity algorithm that can diversify multiple sensitive attributes without partitioning the dataset. Two datasets, the benchmark Adult dataset and a real-life Medical dataset, are used for experimentation. The utility of the datasets privacy preserved with the proposed algorithm is compared with that of the dataset ℓ-diversified on a single sensitive attribute and with that of the original dataset. The results show that datasets privacy preserved with the proposed algorithm retain good utility for the classification algorithms selected for study.
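
To make the property discussed above concrete, the following is a minimal sketch of a check for distinct ℓ-diversity over several sensitive attributes at once. It is not the Enhanced ℓ-diversity algorithm proposed in the paper, which is not specified in the abstract. It only tests the standard condition that every equivalence class (records sharing the same quasi-identifier values) holds at least ℓ distinct values of each sensitive attribute. The function name `is_l_diverse`, the column names, and the toy records are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def is_l_diverse(records, quasi_identifiers, sensitive_attributes, l):
    """Check distinct l-diversity for every listed sensitive attribute.

    records: list of dicts, one per row (hypothetical toy format).
    quasi_identifiers: column names that define an equivalence class.
    sensitive_attributes: column names that must each be l-diverse.
    """
    # Group rows into equivalence classes by their quasi-identifier values.
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        groups[key].append(rec)

    # Every class needs at least l distinct values of every sensitive attribute.
    for group in groups.values():
        for attr in sensitive_attributes:
            if len({rec[attr] for rec in group}) < l:
                return False
    return True


# Hypothetical, already generalized toy records (not taken from the paper's datasets).
records = [
    {"age": "30-39", "zip": "641**", "disease": "flu", "income": "low"},
    {"age": "30-39", "zip": "641**", "disease": "asthma", "income": "high"},
    {"age": "40-49", "zip": "642**", "disease": "flu", "income": "low"},
    {"age": "40-49", "zip": "642**", "disease": "diabetes", "income": "low"},
]

single = is_l_diverse(records, ["age", "zip"], ["disease"], 2)
multiple = is_l_diverse(records, ["age", "zip"], ["disease", "income"], 2)
```

In this toy table the check passes when only `disease` is treated as sensitive, but fails once `income` is added, because the second equivalence class contains a single income value. This illustrates why a table diversified for one sensitive attribute is not automatically diverse for several.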

Full Text