Abstract

Anonymization is a practical solution for preserving users’ privacy in data publishing. Data owners such as hospitals, banks, social network (SN) service providers, and insurance companies anonymize their users’ data before publishing it, so that user privacy is protected while the anonymous data remains useful for legitimate information consumers. Many anonymization models, algorithms, frameworks, and prototypes have been proposed or developed for privacy-preserving data publishing (PPDP). These models and algorithms anonymize users’ data, which is mainly in the form of tables or graphs depending on the data owner. It is therefore important to provide a clear perspective on the whole information privacy area, covering both tabular and SN data as well as recent anonymization research. In this paper, we present a comprehensive survey of SN (i.e., graph) and relational (i.e., tabular) data anonymization techniques used in PPDP. We systematically categorize the existing anonymization techniques into relational and structural anonymization, and present an up-to-date, thorough review of existing anonymization techniques and the metrics used for their evaluation. Our aim is to provide deeper insights into the PPDP problem involving both graphs and tabular data, the possible attacks that can be launched on sanitized published data, the different actors involved in the anonymization scenario, and the major differences in the amount of private information contained in graphs and relational data, respectively. We present various representative anonymization methods that have been proposed to solve privacy problems in application-specific scenarios of SNs. Furthermore, we highlight the re-identification methods used by malevolent adversaries to uniquely re-identify people from privacy-preserved published data. Additionally, we discuss the challenges of anonymizing both graphs and tabular data, and elaborate on promising research directions.
To the best of our knowledge, this is the first work to systematically cover recent PPDP techniques involving both SN and relational data, and it provides a solid foundation for future studies in the PPDP field.

Highlights

  • Most organizations, such as hospitals, banks, insurance companies, and supermarkets, collect relevant customer/subscriber data to improve service quality (SQ)

  • We present an overview of the data collected from individuals, the different actors involved in the anonymization scenario, the anonymization techniques applied to the respective data, the anonymous data published for analytics/mining purposes, and the privacy breaches that can occur during analysis of the published data

  • In this article, we have covered most of the concepts related to the anonymization approaches used for data owned by both physical organizations and virtual platforms (e.g., Facebook, Twitter, and LinkedIn)


Summary

INTRODUCTION

Most organizations, such as hospitals, banks, insurance companies, and supermarkets, collect relevant customer/subscriber data to improve service quality (SQ). The contributions of this review article to the field of PPDP are summarized as follows: (i) it presents state-of-the-art anonymization techniques used for both SN (i.e., social graph) and relational (i.e., tabular) data, along with the fundamental concepts and ideas related to table and graph data anonymization; (ii) it systematically categorizes the existing anonymization techniques into relational and structural anonymization, and presents an up-to-date, thorough review of existing anonymization techniques and the metrics used for their evaluation; (iii) it describes the anonymization techniques that have been proposed to solve privacy problems in application-specific scenarios of SNs (e.g., collaborative filtering, topic and context modeling, and community clustering); (iv) it presents various methods and items that are exploited by malevolent adversaries to re-identify users across SNs; (v) it explains various challenges faced by researchers when devising new anonymization methods for tabular and SN data; (vi) it provides new insights into the privacy problems of future computing paradigms that will be helpful in devising more secure anonymization methodologies; and (vii) it discusses promising future research directions in the field of PPDP that need further development and research from both academia and industry. Through this comprehensive overview, we hope to provide a solid foundation for future studies in the PPDP area.

BACKGROUND
PHASE 1
PHASE 2
PHASE 3
PHASE 5
RELATIONAL ANONYMIZATION TECHNIQUES USED FOR TABULAR DATA ANONYMIZATION
DIFFERENTIAL PRIVACY MODEL
SUMMARY AND DISCUSSION ABOUT THE PRIVACY ISSUES IN FUTURE COMPUTING PARADIGM
Findings
CONCLUSION