A unified framework of semi-supervised community detection integrating network topology and node content

Jinxin Cao, Weizhong Xu, Di Jin, Xiaofeng Zhang, Lu Liu, Anthony Miller, Zhenquan Shi, Weiping Ding

doi:10.1016/j.ins.2024.121349

https://doi.org/10.1016/j.ins.2024.121349

Journal: Information Sciences

Publication Date: Aug 15, 2024

Abstract

Detecting community structure is important because it aids the analysis of complex networks. However, the diverse and complex structures present in a network’s topology can limit how well topological information alone represents communities. For instance, the accuracy of many traditional topology-only methods decreases sharply when links are missing or noisy. To address these issues simultaneously, we propose a unified framework for semi-supervised community detection that integrates content information. Rather than a single model tailored to one specific problem, the framework yields three models that share the same interpretation of the target problem. The inputs are the adjacency matrix and the node content matrix. The proposed models describe the network topology using either a generative model based on non-negative matrix factorization (NMF) or modularity maximization. Additionally, drawing on the equivalence between pLSI and NMF, we model the content information with NMF, using community membership as the link between communities and node contents. In this way, the models integrate content information and network topology in a uniform manner. Further, a constraint matrix that encodes must-links is built by calculating the similarity of neighbors between any two nodes; must-link pairs represent prior information about the network topology. The models’ community memberships are adjusted by a strong constraint based on the connected components of this matrix, which introduces prior information into both the topological and the content components of the models. We compared the proposed models with state-of-the-art community detection models on artificial networks and five real networks. The experimental results show that the proposed models outperformed both the models that combine topology and content and the semi-supervised models. Fusing the topology with the content and the prior information was also effective for community detection. Thus, the models under the framework are more advantageous than other semi-supervised models when the same amount of prior information is incorporated. The semi-supervised framework also shows a certain level of competitiveness in detecting communities.
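
The must-link step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure, threshold, or adjustment rule the authors use, so everything specific here is an assumption: Jaccard similarity over closed neighborhoods, a hypothetical `threshold` parameter, and averaging membership rows within each connected component as one possible form of the strong constraint. The function names are likewise illustrative only.

```python
import numpy as np

def must_link_matrix(adj, threshold=0.5):
    """Mark node pairs with similar neighbor sets as must-links (assumed rule)."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Closed neighborhoods: each node counts as its own neighbor.
    closed = ((adj + np.eye(n)) > 0).astype(float)
    shared = closed @ closed.T                        # size of N(i) intersect N(j)
    sizes = closed.sum(axis=1)
    union = sizes[:, None] + sizes[None, :] - shared  # size of N(i) union N(j)
    jaccard = shared / union                          # union >= 1, so no division by zero
    must = jaccard >= threshold
    np.fill_diagonal(must, False)
    return must

def connected_components(must):
    """Label the connected components of the must-link graph by depth-first search."""
    n = must.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nbr in np.flatnonzero(must[node]):
                if labels[nbr] == -1:
                    labels[nbr] = current
                    stack.append(nbr)
        current += 1
    return labels

def tie_memberships(membership, labels):
    """Give every node in a component the component's mean membership row (assumed rule)."""
    tied = np.array(membership, dtype=float)
    for c in np.unique(labels):
        idx = labels == c
        tied[idx] = tied[idx].mean(axis=0)
    return tied

# Toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1

must = must_link_matrix(adj, threshold=0.5)
labels = connected_components(must)

# Arbitrary soft memberships over two communities, then tied per component.
membership = np.array([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7],
                       [0.4, 0.6], [0.2, 0.8], [0.0, 1.0]])
tied = tie_memberships(membership, labels)
print(labels)
print(tied)
```

On this toy network the bridge edge 2-3 does not become a must-link, because nodes 2 and 3 share few neighbors, so the two triangles fall into separate components. A higher `threshold` yields fewer must-links and therefore less prior information.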

Full Text