Abstract

Mixture models and the stochastic block model (SBM) for structure discovery use a broad and flexible definition of vertex classes, which lets them explore a wide variety of structures. Existing SBM-based algorithms have time complexity O(mc^2), where m is the number of edges and c is the number of clusters; mixture-model algorithms run in O(mc) time and therefore handle networks with a large number of communities more efficiently. However, mixture-model algorithms based on the expectation maximization (EM) technique are still too slow for real networks with millions of nodes, because they compute the hidden variables over the entire network in every iteration. In this paper, we design an online variational EM algorithm to improve the efficiency of these EM algorithms. In each iteration, the online algorithm samples a node and estimates its cluster memberships using only its adjacent links; the model parameters are then estimated from the memberships of the sampled node and the parameters obtained in the previous iteration. Because the algorithm updates the model parameters incrementally from the links of each newly sampled node, it can explore the general structure of massive and growing networks with millions of nodes and hundreds of clusters within hours. Compared with related algorithms on synthetic and real networks, the proposed online algorithm has a lower computational cost with little or no degradation of accuracy. The results show that it offers a good trade-off between precision and efficiency.
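
The online update described above can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a Newman-Leicht-style mixture model (mixing weights `pi` and per-cluster link preferences `theta`), a point-estimate E-step rather than a full variational treatment, and a hypothetical step-size schedule `rho = (t + 10) ** -0.7`. The function names `online_em_mixture` and `node_memberships` are illustrative, and dense arrays are used for readability, so the sketch does not reproduce the per-step cost of an optimized implementation.

```python
import numpy as np


def node_memberships(nbrs, pi, theta, eps=1e-10):
    """E-step for one node: cluster memberships from its own links only."""
    # log q[r] = log pi[r] + sum over neighbours j of log theta[r, j]
    log_q = np.log(pi + eps) + np.log(theta[:, nbrs] + eps).sum(axis=1)
    log_q -= log_q.max()  # stabilise before exponentiating
    q = np.exp(log_q)
    return q / q.sum()


def online_em_mixture(adj, c, n_steps, seed=0):
    """Online EM sketch for an assumed Newman-Leicht-style mixture model.

    adj     : list of neighbour lists (simple graph, no repeated neighbours)
    c       : number of clusters
    n_steps : number of sampled nodes (online iterations)
    """
    rng = np.random.default_rng(seed)
    n = len(adj)

    # Random initial parameters; each row of theta is a distribution over nodes.
    pi = np.full(c, 1.0 / c)
    theta = rng.random((c, n)) + 0.5
    theta /= theta.sum(axis=1, keepdims=True)

    # Running averages of the sufficient statistics.
    s_pi = pi.copy()      # average membership per cluster
    s_link = theta.copy()  # average of q[r] * A[i, j]

    for t in range(1, n_steps + 1):
        i = rng.integers(n)  # sample one node
        nbrs = adj[i]
        if len(nbrs) == 0:
            continue  # an isolated node carries no link information

        # E-step on the sampled node only.
        q = node_memberships(nbrs, pi, theta)

        # Decaying step size (assumed schedule, not taken from the paper).
        rho = (t + 10.0) ** -0.7

        # Blend the old statistics with the new node's contribution.
        s_pi = (1.0 - rho) * s_pi + rho * q
        s_link *= (1.0 - rho)
        s_link[:, nbrs] += rho * q[:, None]

        # M-step: re-normalise the running statistics into parameters.
        pi = s_pi / s_pi.sum()
        theta = s_link / s_link.sum(axis=1, keepdims=True)

    return pi, theta
```

Each step touches only the sampled node's neighbour list in the E-step, which is the property the abstract relies on; a practical implementation would presumably also avoid the dense rescaling of `s_link` used here for clarity.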
