Abstract

Existing diversity models for sensitive attributes do not capture the semantic similarity between sensitive values, so they cannot resist semantic similarity attacks. To address this problem, we present a method for measuring the semantic similarity of the values of a categorical sensitive attribute, based on the attribute's semantic hierarchy tree. On the basis of this measure, the paper proposes an (l, e)-diversity model that imposes two constraints on each equivalence class: (1) it contains at least l well-represented values; (2) no two sensitive values are e-similar. Furthermore, the paper designs a linear-complexity maximum-bucketization greedy algorithm to implement the model. Experimental results show that anonymized data satisfying (l, e)-diversity has a higher degree of diversity than anonymized data satisfying l-diversity, so (l, e)-diversity protects privacy more effectively than l-diversity.
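
The two constraints can be illustrated with a minimal sketch. The abstract does not give the similarity formula or the exact definition of "well-represented", so everything below is an assumption made for illustration: the hierarchy `PARENT` is a made-up example, `similarity` uses the depth of the lowest common ancestor divided by the depth of the deeper value, two values are treated as e-similar when that score is at least e, and "well-represented" is simplified to "distinct". This is a checker for one equivalence class, not the paper's bucketization algorithm.

```python
from itertools import combinations

# Hypothetical semantic hierarchy tree: child -> parent (the root maps to None).
PARENT = {
    "any disease": None,
    "respiratory": "any disease",
    "digestive": "any disease",
    "flu": "respiratory",
    "pneumonia": "respiratory",
    "gastritis": "digestive",
    "ulcer": "digestive",
}


def path_to_root(value):
    """Return the nodes from `value` up to the root, value first."""
    path = []
    while value is not None:
        path.append(value)
        value = PARENT[value]
    return path


def similarity(a, b):
    """Assumed measure: depth of the lowest common ancestor of a and b,
    divided by the depth of the deeper of the two values (root depth is 0)."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    # First node on a's path that is also an ancestor of b (or b itself).
    lca = next(node for node in path_a if node in ancestors_b)
    lca_depth = len(path_to_root(lca)) - 1
    max_depth = max(len(path_a), len(path_b)) - 1
    return lca_depth / max_depth if max_depth else 1.0


def satisfies_le_diversity(sensitive_values, l, e):
    """Check one equivalence class against both constraints, under the
    assumptions above: at least l distinct values, and no pair of distinct
    values whose similarity reaches the threshold e."""
    distinct = set(sensitive_values)
    if len(distinct) < l:
        return False
    return all(similarity(a, b) < e for a, b in combinations(distinct, 2))
```

Under these assumptions, a class holding "flu" and "pneumonia" has two distinct values, so it passes a distinct-values check with l = 2, but it fails the similarity constraint for e = 0.5 because both values share the parent "respiratory". A class holding "flu" and "gastritis" passes both constraints.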
