Community Deception in Attributed Networks

Valeria Fionda, Giuseppe Pirrò

doi:10.1109/tcss.2022.3213722

Abstract

Community detection algorithms that analyze networks to identify communities of nodes are an essential part of the network analysis toolkit used daily by different analysts (e.g., data scientists and law enforcement). However, there is not enough awareness that members of a community $\mathcal{H}$ (either revealed or not) inside a network $G$ could act strategically to evade such tools, either for legitimate (e.g., activist groups in authoritarian regimes) or malicious (e.g., terrorists) purposes. Community deception offers this possibility. By identifying a certain number of connections of $\mathcal{H}$'s members to be rewired, community deception algorithms can successfully hide a community that wants to stay below the radar of detection techniques. However, state-of-the-art deception approaches have focused on networks without attributes, although real-world networks (e.g., Facebook) include attributes (e.g., age and sex) that play a central role in detecting more accurate communities. This article addresses three novel challenges that arise when designing deception techniques for networks with attributes. The first concerns how to model and encode attributes as flexibly as possible. The second is how to frame attribute-aware community deception as an optimization problem. The third is how to solve that optimization problem by leveraging both network topology and attributes.
To address these challenges, we model network attributes as edge weights, introduce a novel optimization function called community diffusion, and use a greedy algorithm to optimize it. We evaluated our approach against several community detection algorithms and compared it with state-of-the-art deception approaches on various real-world networks. From the evaluation, we draw two main observations. First, adopting attribute-oblivious deception techniques leads to unsatisfactory results. Second, community diffusion, as an optimization function specific to attributed networks, is preferable to community safeness, the state-of-the-art deception optimization function, even when the latter is recast as an attribute-aware function.
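As a rough illustration of what encoding attributes as edge weights can look like, the sketch below assigns each edge a weight equal to the Jaccard similarity of its endpoints' attribute sets. The abstract does not say which weighting scheme the article adopts, so the choice of Jaccard similarity, the function names `jaccard` and `attribute_weights`, and the toy data are all assumptions made for illustration; this is not the authors' method.

```python
def jaccard(a, b):
    """Jaccard similarity between two attribute sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def attribute_weights(edges, attrs):
    """Map each edge (u, v) to a weight derived from the node attributes.

    Assumption: the weight is the Jaccard similarity of the two attribute
    sets. Any other similarity measure could be substituted here.
    """
    return {(u, v): jaccard(attrs[u], attrs[v]) for u, v in edges}


# Hypothetical toy network: three nodes, each with a set of attribute values.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
attrs = {
    "a": {"age:30-39", "city:Rome"},
    "b": {"age:30-39", "city:Rome"},
    "c": {"age:20-29", "city:Rome"},
}

weights = attribute_weights(edges, attrs)
for edge, w in weights.items():
    print(edge, round(w, 3))
```

Once attributes are folded into weights this way, any weight-aware community detection or deception procedure can use them without handling attributes separately, which is one plausible reading of why an edge-weight encoding is described as simple.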

Full Text