Community is a fundamental and highly desired pattern in a Large-scale Undirected Network (LUN), and community detection is a vital issue in LUN representation learning. Owing to its good scalability and interpretability, a Symmetric and Non-negative Matrix Factorization model is frequently used to tackle this issue. It adopts a single Latent Factor (LF) matrix to represent an LUN’s symmetry precisely, which unfortunately leads to a reduced LF space that weakens its representation learning ability for a target LUN. Motivated by this discovery, this study proposes a Symmetry and Graph Bi-regularized Non-negative Matrix Factorization (B-NMF) method that: a) leverages multiple LF matrices when representing an LUN, thereby boosting the representation learning ability; b) constructs a symmetry regularization term that encodes the equality constraint among its multiple LF matrices, thereby capturing the LUN’s intrinsic symmetry; and c) incorporates graph regularization into its learning objective, thereby capturing the LUN’s local geometry. A theoretical proof of B-NMF’s convergence is given. The regularization hyperparameters are selected by evaluating the modularity of the resulting model, which ensures B-NMF’s practicability in real applications. Extensive experimental results on ten LUNs from real applications demonstrate that the proposed B-NMF-based community detector significantly outperforms several baseline and state-of-the-art models in achieving highly accurate community detection results.

Note to Practitioners: LUNs are common in real applications such as social network systems. Communities in LUNs are vital for various knowledge discovery-related applications. To detect them accurately, a detector must have a high representation learning ability for a target LUN.
To do so, this paper presents a B-NMF model that performs precise representation learning on LUNs, thereby achieving accurate community detection results. Compared with conventional Symmetric and Non-negative Matrix Factorization-based community detectors, a B-NMF-based community detector has an enlarged latent feature space, which gives it a higher representation ability for a target LUN. It depends on two regularization hyperparameters, which can be selected by grid search on the target LUN using modularity as the evaluation metric. This paper gives empirical values of B-NMF’s regularization hyperparameters based on parameter sensitivity tests on the experimental datasets involved. The proposed B-NMF model is shown to be highly suitable for community detection and clustering tasks on LUNs from real applications.
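
The abstract does not state the B-NMF objective or its update rules, so the following is only a minimal sketch of the general idea under loudly stated assumptions, not the authors' algorithm. It assumes two LF matrices `X` and `Y`, an objective of the form `||A - X Y^T||_F^2 + alpha * ||X - Y||_F^2 + lam * (tr(X^T L X) + tr(Y^T L Y))` with graph Laplacian `L = D - A`, and standard heuristic multiplicative updates that keep the factors non-negative. The names `bnmf_sketch`, `modularity`, `select_hyperparameters`, `alpha`, and `lam` are hypothetical and chosen for illustration. The last function shows one way the two regularization hyperparameters could be chosen by grid search on modularity.

```python
import numpy as np


def bnmf_sketch(A, k, alpha=1.0, lam=0.1, iters=200, seed=0, eps=1e-10):
    """Factorize a symmetric non-negative adjacency matrix A as X @ Y.T.

    Assumed (hypothetical) objective, not taken from the paper:
      ||A - X Y^T||_F^2 + alpha ||X - Y||_F^2
      + lam (tr(X^T L X) + tr(Y^T L Y)),  with L = D - A.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    X = rng.random((n, k))
    Y = rng.random((n, k))
    d = A.sum(axis=1, keepdims=True)  # degrees as a column, so D @ X == d * X
    for _ in range(iters):
        # Multiplicative updates: negative gradient part over positive part.
        X *= (A @ Y + alpha * Y + lam * (A @ X)) / (
            X @ (Y.T @ Y) + alpha * X + lam * d * X + eps
        )
        Y *= (A @ X + alpha * X + lam * (A @ Y)) / (
            Y @ (X.T @ X) + alpha * Y + lam * d * Y + eps
        )
    return X, Y


def modularity(A, labels):
    """Newman modularity of a hard partition of an undirected graph."""
    labels = np.asarray(labels)
    deg = A.sum(axis=1)
    two_m = deg.sum()
    same = labels[:, None] == labels[None, :]
    return ((A - np.outer(deg, deg) / two_m) * same).sum() / two_m


def select_hyperparameters(A, k, alphas, lams):
    """Grid search over (alpha, lam), keeping the highest-modularity result."""
    best = None
    for alpha in alphas:
        for lam in lams:
            X, Y = bnmf_sketch(A, k, alpha=alpha, lam=lam)
            # Assumption: each node joins the community with the largest
            # combined latent factor.
            labels = np.argmax(X + Y, axis=1)
            q = modularity(A, labels)
            if best is None or q > best[0]:
                best = (q, alpha, lam, labels)
    return best
```

Calling `select_hyperparameters(A, k, alphas, lams)` on a dense symmetric adjacency matrix returns a tuple `(modularity, alpha, lam, labels)`. Because modularity needs no ground-truth labels, this kind of selection can run directly on the target network; for a genuinely large network, sparse matrices would be needed in place of the dense arrays used here.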