A Bayesian Inference Method Using Monte Carlo Sampling for Estimating the Number of Communities in Bipartite Networks

Hu-Chen Liu, Guo-Zheng Wang, Li Xiong

doi:10.1155/2019/9471201

Open Access

https://doi.org/10.1155/2019/9471201

Abstract

Community detection is an important analysis task for complex networks, including bipartite networks, which consist of nodes of two types and edges connecting only nodes of different types. Many community detection methods take the number of communities in the network as a fixed, known quantity; however, such information is generally not available in advance for real-world networks. In this paper, we propose a projection-free Bayesian inference method to determine the number of pure-type communities in bipartite networks. This paper makes the following contributions: (1) we present a first-principles derivation of a practical method for estimating the number of pure-type communities of bipartite networks, using the degree-corrected bipartite stochastic block model, which is able to deal with networks with broad degree distributions; (2) we propose a prior probability distribution over the partition of a bipartite network; (3) we design a Monte Carlo algorithm that incorporates the proposed method and prior probability distribution. We demonstrate the algorithm on synthetic bipartite networks, including an easy case with a homogeneous degree distribution and a difficult case with a heterogeneous degree distribution. The results show that the algorithm gives the correct number of communities of synthetic networks in most cases and outperforms the projection method, especially in networks with heterogeneous degree distributions.

Highlights

A bipartite network is a network with nodes of two types and edges connecting only nodes of different types. The decomposition of bipartite networks into communities, i.e., community detection, plays an important role in revealing the structure of large networked systems, providing new insights into how the network is organized [1,2,3,4]. Many methods [5,6,7,8,9] have been developed for community detection in bipartite networks in recent years.
We demonstrate our method on synthetic bipartite networks, including an easy case with a homogeneous degree distribution and a difficult case with a heterogeneous degree distribution. The results show that the proposed algorithm can determine the correct number of communities and performs better than the projection method in every case.
We present a first-principles derivation of a practical method for estimating the number of pure-type communities of bipartite networks, using the degree-corrected bipartite stochastic block model, which is able to deal with networks with broad degree distributions.

Summary

Introduction

A bipartite network is a network with nodes of two types and edges connecting only nodes of different types. The decomposition of bipartite networks into communities (clusters, modules, or groups), i.e., community detection, plays an important role in revealing the structure of large networked systems, providing new insights into how the network is organized [1,2,3,4]. Many methods [5,6,7,8,9] have been developed for community detection in bipartite networks in recent years. A fundamental shortcoming of most community detection methods is that they partition networks into a fixed number of groups. This number is usually unknown in real-world networks, and we need to mine such information from the network data. There are three main problems in these methods. One is that they performed estimation through maximizing the modularity proposed in [10, 14], a task that is proved to be NP-hard [15, 16]; the second is that they gave the number of mixed-type communities, which is nearly always substantially less efficient [6]; the third is that the projection method [15] performed poorly due to information loss. The heuristic methods proposed in [17, 18] for community detection in bipartite networks do not need the number of communities to be given a priori.
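
The information loss attributed to projection can be illustrated with a small sketch. The example below is not taken from the paper: the function name `project`, the two toy networks, and the choice of a shared-neighbour-count projection are all assumptions made for illustration, and the projection used in [15] may differ in detail. It shows two different bipartite networks whose one-mode projections onto the first node type are identical, so the original structure cannot be recovered from the projection alone.

```python
def project(biadjacency):
    """One-mode projection onto the row-type nodes (illustrative assumption).

    biadjacency[i][k] is 1 if row-type node i is linked to column-type node k.
    Entry [i][j] of the result counts the column-type neighbours shared by
    row-type nodes i and j; the diagonal is set to zero.
    """
    n = len(biadjacency)
    return [
        [
            0 if i == j
            else sum(a * b for a, b in zip(biadjacency[i], biadjacency[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical toy network 1: a single column-type node linked to all three
# row-type nodes.
star = [
    [1],
    [1],
    [1],
]

# Hypothetical toy network 2: three column-type nodes, each linked to a
# different pair of row-type nodes.
pairs = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]

star_projection = project(star)
pairs_projection = project(pairs)

# The bipartite networks differ, yet their projections coincide.
print(star != pairs)
print(star_projection == pairs_projection)
```

Both projections are the same weighted triangle on the three row-type nodes, even though one network has a single column-type node and the other has three. A projection-free method works on the bipartite edges directly and therefore keeps this distinction.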

Objectives

Methods

Conclusion