Abstract

This paper proposes a generic anonymization approach for person-specific data that retains more information for data mining and analytical purposes while providing considerable privacy. The approach takes into account the usefulness and uncertainty of attributes while anonymizing the data in order to significantly enhance data utility. We devised a method for determining the usefulness weight of each attribute in a dataset, rather than manually deciding (or assuming based on domain knowledge) that a certain attribute might be more useful than another. We employed an information theory concept to measure the uncertainty about the sensitive attribute (SA) value in each equivalence class, which prevents unnecessary generalization of the data. A flexible generalization scheme that simultaneously considers both attribute usefulness and uncertainty is then used to anonymize the person-specific data. The proposed methodology involves six steps: primitive analysis of the dataset (such as analyzing attribute availability in the data and arranging the attributes into relevant categories); sophisticated pre-processing; computing usefulness weights of attributes; ranking users based on similarities; computing uncertainty in the SAs; and flexible data generalization. Our methodology offers the advantage of retaining higher truthfulness in the data without losing privacy guarantees. Experimental analysis on two real-life benchmark datasets of varying scales, together with comparisons against prior state-of-the-art methods, demonstrates the effectiveness of our anonymization approach. Specifically, our approach yielded better performance on three metrics: accuracy, information loss, and disclosure risk. Accuracy and information loss were improved by restraining heavy anonymization of the data, and disclosure risk was improved by preserving higher uncertainty in the SA column. Lastly, our approach is generic and can be applied to any real-world person-specific tabular dataset encompassing both demographics and SAs of individuals.
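As an illustration of the uncertainty measurement mentioned above, the following is a minimal sketch that assumes the information theory concept is Shannon entropy; the abstract does not name the exact measure. The function names, the threshold, and the toy data are hypothetical and are not taken from the paper. The idea shown is that an equivalence class whose SA values are already diverse (high entropy) may need less generalization than one whose SA values are nearly identical.

```python
from collections import Counter
from math import log2


def sa_entropy(values):
    """Shannon entropy (in bits) of the SA values in one equivalence class."""
    counts = Counter(values)
    total = len(values)
    # Sum p * log2(1/p) over the distinct SA values in the class.
    return sum((c / total) * log2(total / c) for c in counts.values())


def low_uncertainty_classes(classes, threshold):
    """IDs of classes whose SA entropy falls below the (assumed) threshold."""
    return [cid for cid, values in classes.items() if sa_entropy(values) < threshold]


# Hypothetical toy data: class ID -> SA values of the records in that class.
classes = {
    "ec1": ["flu", "flu", "flu", "flu"],          # no uncertainty about the SA
    "ec2": ["flu", "cancer", "hiv", "asthma"],    # maximal uncertainty for 4 records
}

# Classes below the threshold are candidates for stronger generalization.
flagged = low_uncertainty_classes(classes, threshold=1.0)
```

In this sketch only the flagged classes would be generalized further, which mirrors the stated goal of avoiding unnecessary generalization; how the paper combines this measure with usefulness weights is not specified in the abstract.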
