Semantics-based sensitive topic diffusion detection framework towards privacy aware online social networks

Chinnaiah Valliyammai, Anbalagan Bhuvaneswari

doi:10.1007/s10586-018-2142-y

Abstract

The sharing of sensitive information via Online Social Networks (OSN) has put users at risk to the extent that the privacy of millions of OSN users could well be compromised, with their data openly available in the public domain. Evidently, users lack data privacy and access control mechanisms that would let them avoid the risk of disclosure. A framework is therefore required that automatically preserves user privacy by detecting sensitive topics and minimizing the risk of sensitive information disclosure beyond the current privacy settings offered by OSN service providers. In this paper, we present a three-fold sanitization framework that detects sensitive topics semantically using a statistical topic model, which incorporates standard knowledge bases for tagging the sensitive topics discovered. The interaction documents from a location of interest are subjected to SSAR–LDA using Gibbs sampling to identify sensitive topic clusters with high location entropy. The experimental results show that (i) the sensitive topic clusters are identified with very high accuracy; (ii) unlike the redaction approach, which eliminates the sensitive term, our proposed scheme enhances the privacy-preserving policy by replacing sensitive terms with suitable hierarchical generalizations fetched from knowledge bases; (iii) the Kullback–Leibler (KL) divergence between sensitive and generalized sanitization terms on Twitter is acceptable, with negligible information disclosure risk; and (iv) the sanitization carried out for 10 sensitive topics, from 4500 user posts of 790 Twitter users, demonstrated high precision and recall, which can be correlated with advanced privacy settings for OSN users in the near future.
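
As a rough illustration of three quantities named in the abstract (location entropy, generalization-based sanitization, and KL divergence), the following minimal Python sketch may help. It is not the authors' SSAR–LDA implementation: the function names, the toy `hierarchy` dictionary standing in for a knowledge base, and the example distributions are all assumptions made for illustration only.

```python
import math
from collections import Counter


def location_entropy(locations):
    """Shannon entropy (in bits) of the empirical distribution of locations."""
    counts = Counter(locations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def generalize(tokens, hierarchy):
    """Replace each sensitive token with its broader term, if one is known.

    `hierarchy` is a hypothetical stand-in for a knowledge-base lookup
    (sensitive term -> more general term); other tokens pass through unchanged.
    """
    return [hierarchy.get(tok, tok) for tok in tokens]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in bits for two discrete distributions given as dicts.

    Terms missing from q are floored at `eps` (an assumption of this sketch)
    to avoid division by zero.
    """
    return sum(
        pv * math.log2(pv / max(q.get(term, 0.0), eps))
        for term, pv in p.items()
        if pv > 0
    )


if __name__ == "__main__":
    # Toy data, invented for this example.
    post = ["i", "have", "diabetes"]
    hierarchy = {"diabetes": "disease"}
    print(generalize(post, hierarchy))

    print(location_entropy(["chennai", "chennai", "delhi", "delhi"]))

    original = {"diabetes": 0.5, "clinic": 0.5}
    sanitized = {"diabetes": 0.25, "clinic": 0.75}
    print(kl_divergence(original, sanitized))
```

Unlike redaction, which would drop "diabetes" entirely, the generalization step keeps the post readable while replacing the sensitive term with a broader one. A lower KL divergence between the original and sanitized term distributions would then indicate that the sanitized text stays close to the original.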

Full Text