Abstract
The growth of several popular social networks and the publication of social network data have led to the risk of leaking individuals' sensitive and confidential information. This necessitates preserving privacy before publishing a user's data drawn from his or her Online Social Network (OSN) presence. Numerous algorithms, such as K-anonymity and L-diversity, have been proposed for preserving the privacy of social network users' information. Previous work has shown good results by adding edges and noise nodes to achieve K-anonymity and L-diversity. K-anonymization techniques can prevent identity disclosure of users but are not sufficient to prevent the disclosure of users' sensitive information. In this direction, a number of techniques for preserving the sensitive information of social network users have been proposed. Although these techniques achieve reasonably good anonymity, they also lead to a substantial change in the original structure of the OSNs. In this article, the problems of preventing sensitive attribute disclosure and reducing the number of noisy nodes are addressed by perturbing the sensitive attributes. Existing research uses L-diversity to prevent sensitive attribute disclosure, which leaves the published data vulnerable to skewness and similarity attacks. We address skewness attacks by removing duplicate noisy nodes from the final dataset that OSN service providers publish for stakeholders. All information about the duplicate nodes is stored in a table named the Reference Attribute Table (RAT), which is accessible only to the service providers for the purpose of de-anonymizing users' data. The proposed technique has been extensively evaluated using five metrics, viz. APL, ACSPL, RRTI, number of noisy nodes, and information loss, on four real-world OSN datasets, namely CORA, ARNET, DBLP, and Twitter. Results for APL and RRTI show that the structure of the datasets changes little after anonymization. Results for ACSPL show that the proposed technique preserves the sensitive attributes in the datasets. Across all four datasets, the maximum proportion of noisy nodes is 5.4% and the maximum information loss is 2.2%. The evaluation results make it evident that the proposed technique ensures privacy preservation with little loss of information, thus preserving the utility of the published data.
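To make the RAT idea concrete, the following is a minimal sketch, not the paper's implementation: duplicate noisy nodes are dropped from the graph that will be published, while their attributes and neighborhoods are retained in a provider-only Reference Attribute Table so the service provider can later de-anonymize the data. The function name, attribute keys, and the duplicate criterion (same sensitive value and same degree) are illustrative assumptions.

```python
# Sketch of removing duplicate noisy nodes and recording them in a RAT.
# Assumes a networkx graph whose nodes carry a "sensitive" attribute.
import networkx as nx


def split_out_duplicates(graph: nx.Graph):
    """Return (published_graph, rat): duplicate noisy nodes are removed from
    the published graph and recorded in the provider-only RAT."""
    published = graph.copy()
    rat = {}                   # Reference Attribute Table, kept by the provider
    seen_signatures = set()    # signatures of nodes already kept

    for node, attrs in graph.nodes(data=True):
        # Illustrative duplicate test: same sensitive value and same degree.
        signature = (attrs.get("sensitive"), graph.degree(node))
        if signature in seen_signatures:
            # Record everything needed to restore the node later ...
            rat[node] = {"attrs": dict(attrs),
                         "neighbors": list(graph.neighbors(node))}
            # ... and drop it from the dataset to be published.
            published.remove_node(node)
        else:
            seen_signatures.add(signature)

    return published, rat


if __name__ == "__main__":
    g = nx.Graph()
    g.add_node("u1", sensitive="HIV")
    g.add_node("u2", sensitive="HIV")   # duplicate noisy node (same signature)
    g.add_node("u3", sensitive="Flu")
    g.add_edges_from([("u1", "u3"), ("u2", "u3")])

    pub, rat = split_out_duplicates(g)
    print(sorted(pub.nodes()))  # ['u1', 'u3'] -- published without the duplicate
    print(rat)                  # RAT retained privately for de-anonymization
```

Under these assumptions, only the published graph leaves the provider; the RAT stays internal, which is what lets the provider reverse the perturbation without exposing the duplicate nodes to stakeholders.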