Community detection based on joint matrix factorization in networks with node attributes

Chang Zhen-Chao, Chen Hong-Chang, Liu Yang, Yu Hong-Tao, Huang Rui-Yang

doi:10.7498/aps.64.218901

Abstract

Community detection is an important problem in social network analysis. The goal is to partition the network into densely connected regions of the graph. Such regions typically correspond to entities that are closely related to each other and can hence be said to belong to a community. Detecting communities is of great importance in computational biology and in the study of social networks, and many detection methods have been proposed. When detecting communities in social media networks, two sources of information are available: the link structure of the network, and the features and attributes of its nodes. Nodes in social media networks carry rich attribute information, which presents unprecedented opportunities and flexibility for the community detection process. Some community detection algorithms use only the links between nodes to find the dense regions of the graph; such methods rely purely on the linkage structure of the underlying social media network. Other algorithms use the node attributes to cluster the nodes, i.e., nodes with the same attributes are put into the same cluster. Because traditional methods either use only one of the two sources or simply combine, linearly, the results of community detection obtained from each source, they cannot effectively detect communities in networks with node attributes. In recent years, matrix factorization (MF) has received considerable interest from the data mining and information retrieval fields, and it has been applied successfully in document clustering, image representation, and other domains. In this paper, we use node attributes as supervision for the community detection process and propose a community detection algorithm based on joint matrix factorization (CDJMF).
Our method is based on the assumption that the two information sources, linkage and node attributes, yield an identical node affiliation matrix. This assumption is reasonable and captures the inner relationship between the two sources, which allows the performance of community detection to be greatly improved. We conduct experiments on three real social networks; theoretical analysis and numerical simulation results show that our approach outperforms several classical algorithms, making it an effective way to explore the community structure of social networks.
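
The shared-affiliation assumption can be illustrated with a minimal sketch. The abstract does not give the CDJMF objective or its update rules, so everything below is an assumption made for illustration: the adjacency matrix A and the attribute matrix X are both factorized with a common nonnegative affiliation matrix U, by minimizing ||A - U H^T||^2 + lam * ||X - U W^T||^2 with standard multiplicative nonnegative-factorization updates. The names joint_nmf, H, W, and lam are hypothetical, and the toy network is invented for the example.

```python
import numpy as np


def joint_nmf(A, X, k, lam=1.0, n_iter=200, eps=1e-9, seed=0):
    """Illustrative joint nonnegative factorization (not the paper's CDJMF).

    Approximates A (n x n links) by U @ H.T and X (n x m attributes) by
    U @ W.T, where the n x k affiliation matrix U is shared by both terms.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    U = rng.random((n, k)) + 0.1
    H = rng.random((n, k)) + 0.1
    W = rng.random((m, k)) + 0.1

    def loss():
        link_term = np.linalg.norm(A - U @ H.T) ** 2
        attr_term = np.linalg.norm(X - U @ W.T) ** 2
        return link_term + lam * attr_term

    history = [loss()]
    for _ in range(n_iter):
        # Multiplicative updates preserve nonnegativity of all factors.
        U *= (A @ H + lam * (X @ W)) / (U @ (H.T @ H + lam * (W.T @ W)) + eps)
        H *= (A.T @ U) / (H @ (U.T @ U) + eps)
        W *= (X.T @ U) / (W @ (U.T @ U) + eps)
        history.append(loss())
    return U, H, W, history


# Toy network (invented): two 4-node cliques joined by a single edge.
n = 8
A = np.zeros((n, n))
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

# Toy attributes (invented): one binary feature per group.
X = np.zeros((n, 2))
X[:4, 0] = 1.0
X[4:, 1] = 1.0

U, H, W, history = joint_nmf(A, X, k=2)
# Assign each node to the community with the largest affiliation weight.
labels = U.argmax(axis=1)
print(labels)
```

In this sketch lam controls how strongly the attribute term constrains the shared matrix U; with lam = 0 the procedure reduces to a factorization of the link structure alone.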
