Community Detection Algorithm Based on Nonnegative Matrix Factorization and Improved Density Peak Clustering

Hong Lu, Xiaoshuang Sang, Qinghua Zhao, Jianfeng Lu

doi:10.1109/access.2019.2963694

Abstract

Community detection is a critical issue in the field of complex networks. Recently, nonnegative matrix factorization (NMF) has been used successfully to uncover community structure in complex networks. However, the approach has a significant drawback: most NMF-based community detection methods either require the number of communities to be assigned in advance or determine it by searching for the best community structure among all candidates. To address this problem, we use density peak clustering (DPC) to obtain the number of centers, which then serves as the predefined parameter for NMF. Because complex networks are sparse and high-dimensional, DPC cannot be applied to community detection directly. To overcome this issue, we employ the degree of a node and the hop count between nodes as the density and distance indexes, respectively, and we use NMF and symmetric NMF to handle linearly separable and non-linearly separable data, respectively. Experimental results show that the proposed methods perform well on artificial and real-world networks and outperform state-of-the-art methods commonly used for community detection in complex networks.

Highlights

Community structure is ubiquitous in networks such as social networks [1], biological networks [2], and citation networks [3], which makes community detection crucial for understanding how networks are organized and for extracting useful information.
The DPCNMF and DPCSNMF algorithms are implemented in MATLAB R2014a.
The distribution of the data is hard to know in advance, which is why we propose DPCNMF and DPCSNMF to deal with different networks.

Summary

Introduction

Community structure is ubiquitous in networks such as social networks [1], biological networks [2], and citation networks [3], which makes community detection crucial for understanding how networks are organized and for extracting useful information. Since the seminal work by Girvan and Newman [1], many algorithms for community detection in complex networks have been proposed, including modularity-based algorithms [5], clustering-based algorithms [6], [39], random-walk-based algorithms [7], and matrix-decomposition-based algorithms [8]–[11]. Despite the many methods proposed, determining the number of communities remains an important and thorny issue in practice. Most methods, such as SNMF-SS [18] proposed by Ma et al. and SBMF [17] proposed by Zhang et al., adopt maximum modularity (Q) as the criterion for determining the number of communities, which is very time-consuming. Specifying the number of communities in advance is often infeasible in concrete applications, and deciding the number of communities (K) by searching all possible candidates is very inefficient.
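To make the modularity criterion concrete, the following Python sketch computes Newman's modularity Q for a given partition of an undirected, unweighted graph. The function name, the adjacency-dictionary representation, and the toy graph are assumptions made for illustration and are not taken from the paper.

```python
def modularity(adj, labels):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    adj maps each node to the set of its neighbours; labels maps each node
    to a community id. Both structures are illustrative assumptions.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())  # every edge is counted twice
    intra = {}       # community -> number of internal edge endpoints
    degree_sum = {}  # community -> total degree of its nodes
    for u, nbrs in adj.items():
        c = labels[u]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
        intra[c] = intra.get(c, 0) + sum(1 for v in nbrs if labels[v] == c)
    # Q = sum over communities of (internal endpoints / 2m) - (degree sum / 2m)^2
    return sum(intra[c] / two_m - (degree_sum[c] / two_m) ** 2 for c in degree_sum)


# Toy graph (hypothetical): two stars centred on nodes 0 and 4, joined by edge 3-5.
ADJ = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 5},
       4: {5, 6, 7}, 5: {3, 4}, 6: {4}, 7: {4}}
two_groups = {n: (0 if n <= 3 else 1) for n in ADJ}
one_group = {n: 0 for n in ADJ}
print(modularity(ADJ, two_groups), modularity(ADJ, one_group))
```

A search over candidate values of K would repeat the factorization and this evaluation once per candidate, which is where the cost described above comes from.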

Methods
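The abstract describes using node degree as the density index and hop count as the distance index for density peak clustering. A minimal Python sketch of that idea might look as follows. The tie-breaking by node id, the `min_delta` threshold, the function names, and the toy graph are all assumptions made for illustration; the paper's own center-selection rule is not reproduced here, and the authors' implementation is in MATLAB.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def density_peak_scores(adj):
    """Return {node: (rho, delta)}: rho is the degree, delta is the hop
    distance to the nearest node that ranks higher in density."""
    # Rank nodes by degree; ties are broken by node id (assumption of this sketch).
    order = sorted(adj, key=lambda n: (-len(adj[n]), n))
    rank = {n: i for i, n in enumerate(order)}
    scores = {}
    for n in adj:
        dist = hop_distances(adj, n)
        higher = [d for m, d in dist.items() if rank[m] < rank[n]]
        # A node with no reachable higher-ranked node gets its largest hop distance.
        delta = min(higher) if higher else max(dist.values())
        scores[n] = (len(adj[n]), delta)
    return scores


def pick_centers(adj, min_delta=2):
    """Hypothetical selection rule: keep nodes whose delta is at least min_delta."""
    scores = density_peak_scores(adj)
    return sorted(n for n, (_, delta) in scores.items() if delta >= min_delta)


# Toy graph (hypothetical): two stars centred on nodes 0 and 4, joined by edge 3-5.
GRAPH = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 5},
         4: {5, 6, 7}, 5: {3, 4}, 6: {4}, 7: {4}}
centers = pick_centers(GRAPH)
print(centers, len(centers))  # prints [0, 4] 2
```

In this sketch the number of selected centers plays the role of the community count K that would then be passed to NMF or symmetric NMF.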

Results

Conclusion