Abstract
Aim: Data anonymization aims to enable data publishing without compromising individuals' privacy. The re-identification and sensitive-information inference risks of a dataset are key factors when choosing the techniques and parameters of the anonymization process. Correctly measuring these risks helps optimize the balance between protection and utility: overly aggressive anonymization can render the data useless, while publishing data with a high risk of de-anonymization endangers the individuals it describes.

Methods: We propose a new information-theoretic privacy metric (ITPR) that assesses both the re-identification risk and the sensitive-information inference risk of a dataset. We compare ITPR against existing information-theoretic metrics with respect to their ability to assess risk across datasets with varying characteristics.

Results: We show that ITPR is the only metric that effectively quantifies both re-identification and sensitive-information inference risks, and we provide several experiments illustrating its effectiveness.

Conclusion: To the best of our knowledge, ITPR is the first information-theoretic privacy metric that correctly assesses both re-identification and sensitive-information inference risks.
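The abstract does not give the definition of ITPR, so the following is only a minimal Python sketch of the two kinds of information-theoretic quantities the paper contrasts: a prosecutor-style re-identification risk based on quasi-identifier equivalence-class sizes, and an inference risk based on how much the quasi-identifiers reduce uncertainty about the sensitive attribute (normalized mutual information). The toy dataset, the function names, and the exact normalizations are illustrative assumptions, not the paper's metric.

```python
import math
from collections import Counter, defaultdict

# Toy dataset: (quasi-identifier tuple, sensitive value).
# Purely hypothetical records for illustration.
records = [
    (("30-40", "F"), "flu"),
    (("30-40", "F"), "flu"),
    (("30-40", "M"), "cancer"),
    (("20-30", "M"), "flu"),
    (("20-30", "M"), "cold"),
]

def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

# Re-identification risk proxy: average 1/|equivalence class| over records,
# i.e. the expected probability of singling out a record within its
# quasi-identifier group (a prosecutor-risk style measure).
classes = Counter(qid for qid, _ in records)
reid_risk = sum(1 / classes[qid] for qid, _ in records) / len(records)

# Inference risk proxy: how much the quasi-identifiers Q reduce uncertainty
# about the sensitive attribute S, normalized to [0, 1]:
#   I(S; Q) / H(S) = (H(S) - H(S | Q)) / H(S).
h_s = entropy(list(Counter(s for _, s in records).values()))
by_class = defaultdict(Counter)
for qid, s in records:
    by_class[qid][s] += 1
h_s_given_q = sum(
    sum(cnt.values()) / len(records) * entropy(list(cnt.values()))
    for cnt in by_class.values()
)
infer_risk = (h_s - h_s_given_q) / h_s if h_s > 0 else 0.0

print(f"re-identification risk ~ {reid_risk:.3f}")  # 0.600 on this toy data
print(f"inference risk         ~ {infer_risk:.3f}")  # 0.708 on this toy data
```

On this toy data the two proxies disagree in an instructive way: the singleton class (("30-40", "M")) drives up re-identification risk, while the homogeneous class (("30-40", "F"), all "flu") drives up inference risk even though no record in it can be singled out. Capturing both failure modes in one metric is precisely the gap the abstract claims ITPR fills.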