Node2vec with weak supervision on community structures

Swarup Chattopadhyay, Debasis Ganguly

doi:10.1016/j.patrec.2021.06.024

Abstract

• A weakly supervised node embedding algorithm.
• Communities detected by combinatorial approaches are used as weak labels for learning the embedding objective.
• Reduces distances between the vectors of node pairs that are likely to belong to the same community relative to those of pairs that are not.
• Demonstrated to be effective on both artificial and real-world networks.

Detecting communities, or the modular structure, of real-life networks (e.g. a social network or a product purchase network) is an important task because the way a network functions is often determined by its communities. Traditional approaches to community detection involve modularity-based algorithms, which, generally speaking, construct partitions based on heuristics that seek to maximize the ratio of the edges within the partitions to those between them. On the other hand, node embedding approaches represent each node in a graph as a real-valued vector and are thereby able to transform the problem of community detection in a graph into that of clustering a set of vectors. Existing node embedding approaches are primarily based on first initiating random walks from each node to construct a context for that node, and then making the vector representation of the node close to its context. However, standard node embedding approaches do not directly take into account the community structure of a network while constructing the context around each node. To alleviate this, we propose a community-structure-aware node embedding approach, in which we incorporate partition information from an initial combinatorial approach into the objective function of node embedding. We demonstrate that our proposed combination of the combinatorial and the embedding approaches for community detection outperforms a number of combinatorial baselines on a wide range of real-life and synthetic networks of different sizes and densities.
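The idea of using a combinatorial partition as weak labels can be illustrated with a small sketch. The abstract does not state the exact objective, so the quantity below is only an assumed stand-in: it compares the mean squared distance between embedding vectors of node pairs that share a weak community label with that of pairs that do not. The function names `squared_distance` and `community_gap`, the toy vectors, and the labels are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def community_gap(embeddings, weak_labels):
    """Mean squared distance of same-label pairs minus that of different-label pairs.

    Illustrative quantity only (an assumption, not the paper's objective):
    a more negative value means nodes sharing a weak community label sit
    closer together than nodes with different labels.
    """
    same, diff = [], []
    for a, b in combinations(sorted(embeddings), 2):
        d = squared_distance(embeddings[a], embeddings[b])
        if weak_labels[a] == weak_labels[b]:
            same.append(d)
        else:
            diff.append(d)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(same) - mean(diff)


# Toy 2-d embeddings for four nodes (made-up values).
embeddings = {0: (0, 0), 1: (0, 1), 2: (5, 5), 3: (5, 6)}
# Weak labels, e.g. as produced by some combinatorial community detector.
weak_labels = {0: 0, 1: 0, 2: 1, 3: 1}

gap = community_gap(embeddings, weak_labels)
```

A term of this kind could, under these assumptions, be added to a random-walk embedding loss so that training is nudged toward vectors that agree with the initial partition; how the paper actually weights or formulates such a term is not specified in the abstract.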

Full Text