Abstract

We propose a series of Bayesian nonparametric statistical models for community detection in graphs. We model the probability of the presence or absence of each edge in the graph. These models naturally incorporate uncertainty and variability and take advantage of nonparametric techniques such as the Chinese restaurant process and the Dirichlet process. Our contributions include: (a) the community structure is modeled directly, without specifying the number of communities a priori; (b) the probabilities of edges within or between communities may vary by community or by pair of communities; (c) some nodes can be classified as not belonging to any community; and (d) Bayesian model diagnostics are used to compare models and guide model selection. We start by fitting an initial model to a well-known network dataset and then develop a series of increasingly complex models. We propose Markov chain Monte Carlo algorithms to carry out the estimation, as well as an approach to community detection that uses the posterior distributions within a decision-theoretic framework. Bayesian nonparametric techniques allow us to estimate the number and structure of communities from the data. To evaluate the proposed models on the example dataset, we discuss model comparison using the deviance information criterion and model checking using posterior predictive distributions. Supplementary materials are available online.
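
As a rough illustration of how a Chinese restaurant process prior leaves the number of communities open, the following minimal sketch draws a random partition of nodes and then draws edges with one probability inside communities and another between them. It is a generative toy under stated assumptions, not the models or the Markov chain Monte Carlo algorithms of the paper: the function names, the concentration parameter `alpha`, the node count, and the values of `p_within` and `p_between` are all hypothetical choices made for this example.

```python
import random


def sample_crp_partition(n_nodes, alpha, rng):
    """Draw community labels for n_nodes from a Chinese restaurant process.

    alpha (> 0) is the concentration parameter: larger values tend to
    produce more communities. The number of communities is not fixed
    in advance; it emerges from the sampling.
    """
    labels = []
    sizes = []  # sizes[k] = number of nodes already placed in community k
    for _ in range(n_nodes):
        # An existing community k is chosen with weight sizes[k];
        # a brand-new community is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)
        else:
            sizes[k] += 1
        labels.append(k)
    return labels


def sample_edges(labels, p_within, p_between, rng):
    """Draw an undirected graph given community labels.

    Each pair of nodes is linked independently, with probability
    p_within if both nodes share a community and p_between otherwise.
    """
    n = len(labels)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            p = p_within if labels[i] == labels[j] else p_between
            if rng.random() < p:
                edges.add((i, j))
    return edges


# Illustrative values only (assumptions, not taken from the paper).
rng = random.Random(0)
labels = sample_crp_partition(30, alpha=1.0, rng=rng)
edges = sample_edges(labels, p_within=0.6, p_between=0.05, rng=rng)
n_communities = len(set(labels))
```

Using a single `p_within` and a single `p_between` corresponds only to the simplest case described in (b); letting those probabilities vary by community or by pair of communities would replace the two scalars with per-community or per-pair values.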
