Online Social Networks (OSNs) face the major challenge of protecting participants' privacy, owing to the high dimensionality and volume of their data. In real-time social networks, where hundreds of personal details are shared every day, the threat to privacy remains significant. Privacy preservation is especially challenging for community detection because of its high computational complexity and memory requirements, particularly in large real-world OSN graphs. Although weighted nodes give better results, since they capture the frequencies of the values, they make it harder to preserve the privacy of sensitive attributes such as specific profiles. This can expose the profiles of specific individuals to queries and subsequent inference, which may raise personal, political, or other concerns. This research addresses a gap in current privacy-preserving community detection methods, which struggle with computational complexity and memory usage in large-scale OSNs. The proposed framework seeks to strengthen privacy preservation through a three-step process. The data filtering step uses a blended data filter system to remove outliers and irrelevant data, improving the quality of the input data. The density-oriented clustering phase employs a density-oriented clustering model to identify communities, with each cluster representing a distinct community. The privacy preservation component introduces a new privacy preservation technique for sensitive OSN attributes, which surpasses existing k-anonymization methods. The density-based community detection model and its novel privacy-preserving scheme are evaluated on the Yelp, Football, Zachary and Dolphin datasets from the SNAP dataset collection.
The experiments on these datasets provide a comprehensive evaluation based on the order of each node and on graph-based networks in which each node carries weights as proximity values, indicating the semantic proximity between communities and individuals. The framework is assessed using normalized mutual information (NMI), modularity (Q), the Rand index and runtime, and it shows greater accuracy, cluster compatibility, and computational tractability than prominent existing traditional models for OSNs.
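For readers unfamiliar with the three quality metrics named above, the following is a minimal sketch of their standard textbook definitions in plain Python. It is not the authors' evaluation code. The toy graph (two triangles joined by one edge), the label lists, and the function names `rand_index`, `nmi`, and `modularity` are hypothetical illustrations, and the arithmetic-mean normalization used for NMI is an assumption, since the abstract does not say which variant was used.

```python
from collections import Counter
from itertools import combinations
from math import log


def rand_index(labels_a, labels_b):
    """Fraction of node pairs on which two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def nmi(labels_a, labels_b):
    """Normalized mutual information (arithmetic-mean normalization, assumed)."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    h_a = -sum((c / n) * log(c / n) for c in count_a.values())
    h_b = -sum((c / n) * log(c / n) for c in count_b.values())
    if h_a + h_b == 0:
        return 1.0  # both partitions consist of a single cluster
    return 2 * mutual / (h_a + h_b)


def modularity(edges, labels):
    """Newman modularity Q for an undirected, unweighted graph.

    `labels[node]` is the community of `node`; nodes are 0-based integers.
    """
    m = len(edges)
    intra = Counter()      # number of edges inside each community
    total_deg = Counter()  # sum of node degrees per community
    for u, v in edges:
        total_deg[labels[u]] += 1
        total_deg[labels[v]] += 1
        if labels[u] == labels[v]:
            intra[labels[u]] += 1
    return sum(intra[c] / m - (total_deg[c] / (2 * m)) ** 2 for c in total_deg)


# Hypothetical toy graph: two triangles connected by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
truth = [0, 0, 0, 1, 1, 1]      # ground-truth communities
predicted = [0, 0, 1, 1, 1, 1]  # a detector that misplaces node 2

print("Rand index:", rand_index(truth, predicted))
print("NMI:", nmi(truth, predicted))
print("Q (truth):", modularity(edges, truth))
print("Q (predicted):", modularity(edges, predicted))
```

NMI and the Rand index compare a detected partition against ground-truth labels, whereas modularity needs only the graph and the detected partition, so it can also be reported for datasets without known communities.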