Abstract

We propose a greedy separation algorithm that selects the best-fitting stochastic block model for a network by combining three known approaches. The first tests whether the network has a single community or multiple communities, based on the distribution of the largest eigenvalue of the adjacency matrix. The second applies a label-inference algorithm to estimate the class label of each node. The third determines the optimal number of clusters using an information criterion derived from the Bayesian information criterion. Combined, these approaches let the algorithm find a suitable candidate among successively generated stochastic block models. In the second approach, however, the estimated labels depend heavily on the initial labels. Because a hub node and its neighbors are expected to form one class with a shared label, we identify hub nodes with the PageRank method and use them to improve the initial labels. We also conduct experiments on real data, evaluating the accuracy of the proposed method against Markov chain Monte Carlo methods. The greedy separation algorithm with the PageRank method is preferable to the Monte Carlo-based methods.
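
The abstract does not spell out how hub nodes become initial labels, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes an undirected graph stored as a dictionary of neighbor sets, computes PageRank by plain power iteration, and then gives each of the `k` highest-ranked, not-yet-labeled nodes (the "hubs") and its neighbors a common label. The function names, the damping factor 0.85, the fallback for leftover nodes, and the toy graph are all illustrative assumptions.

```python
def pagerank(adj, damping=0.85, max_iter=200, tol=1e-12):
    """PageRank by power iteration on an undirected graph.

    adj maps each node to the set of its neighbors.
    """
    nodes = sorted(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by isolated nodes is spread uniformly over all nodes.
        dangling = sum(rank[v] for v in nodes if not adj[v])
        new_rank = {}
        for v in nodes:
            incoming = sum(rank[u] / len(adj[u]) for u in adj[v])
            new_rank[v] = (1.0 - damping) / n + damping * (incoming + dangling / n)
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


def hub_initial_labels(adj, k):
    """Seed k classes from high-PageRank hubs (hypothetical helper).

    Returns (labels, hubs): labels maps node -> class index in [0, k).
    """
    rank = pagerank(adj)
    labels, hubs = {}, []
    # Visit nodes from highest to lowest rank; ties broken by node id.
    for v in sorted(adj, key=lambda v: (-rank[v], v)):
        if len(hubs) == k:
            break
        if v in labels:
            continue  # already claimed by an earlier hub's neighborhood
        c = len(hubs)
        hubs.append(v)
        labels[v] = c
        for u in adj[v]:
            labels.setdefault(u, c)  # keep the label of the first hub seen
    # Assumption: nodes not adjacent to any hub fall back to class 0.
    for v in adj:
        labels.setdefault(v, 0)
    return labels, hubs


# Toy graph: a star around node 0 (leaves 1-4) and a star around node 5
# (leaves 6-8), joined by the single edge 4-6.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (5, 6), (5, 7), (5, 8), (4, 6)]
adj = {v: set() for v in range(9)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

rank = pagerank(adj)
labels, hubs = hub_initial_labels(adj, k=2)
```

On this toy graph the two star centers are picked as hubs and each star receives its own label. In a full pipeline these labels would only serve as the starting point for the label-inference step, which may then move nodes between classes.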
