Abstract

Network data often arises via a series of structured interactions among a population of constituent elements. E-mail exchanges, for example, have a single sender followed by potentially multiple receivers. Scientific articles, on the other hand, may have multiple subject areas and multiple authors. We introduce a statistical model, termed the Pitman-Yor hierarchical vertex components model (PY-HVCM), that is well suited for structured interaction data. The proposed PY-HVCM models complex relational data by partially pooling local information via a latent, shared population-level distribution. The PY-HVCM is a canonical example of hierarchical vertex components models, a subfamily of models for exchangeable structured interaction-labeled networks, that is, networks whose distribution is invariant to relabeling of the interactions. Theoretical analysis and supporting simulations provide clear model interpretation and establish global sparsity and a power-law degree distribution. A computationally tractable Gibbs sampling algorithm is derived for inferring the sparsity and power-law properties of complex networks. We demonstrate the model on both the Enron e-mail dataset and an ArXiv dataset, showing goodness of fit of the model via posterior predictive validation.
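
As a rough illustration of where the power-law behavior mentioned above can come from, the sketch below simulates the standard two-parameter (Pitman-Yor) Chinese restaurant process. It is not the PY-HVCM itself, which adds a hierarchy and interaction structure that the abstract does not spell out. The function name `sample_pitman_yor_crp` and the parameter names `discount` and `strength` are illustrative assumptions, not notation taken from the paper. Reading "tables" as vertices and "customers" as vertex appearances in interactions is likewise only an analogy.

```python
import random


def sample_pitman_yor_crp(n_customers, discount, strength, seed=0):
    """Simulate a two-parameter Chinese restaurant process.

    Returns the list of table sizes after seating n_customers.
    Hypothetical helper for illustration only; not the authors' sampler.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if strength <= -discount:
        raise ValueError("strength must exceed -discount")

    rng = random.Random(seed)
    table_sizes = []
    for n_seated in range(n_customers):
        if not table_sizes:
            # The first customer always opens the first table.
            table_sizes.append(1)
            continue
        # Unnormalized weights: (size - discount) for each existing table,
        # (strength + discount * K) for a new one; they sum to n_seated + strength.
        u = rng.random() * (n_seated + strength)
        acc = 0.0
        for k, size in enumerate(table_sizes):
            acc += size - discount
            if u < acc:
                table_sizes[k] += 1
                break
        else:
            # u fell in the remaining mass, so open a new table.
            table_sizes.append(1)
    return table_sizes


if __name__ == "__main__":
    heavy = sample_pitman_yor_crp(2000, discount=0.8, strength=1.0)
    light = sample_pitman_yor_crp(2000, discount=0.0, strength=1.0)
    print("tables with discount 0.8:", len(heavy))
    print("tables with discount 0.0:", len(light))
```

With a positive discount the number of occupied tables grows polynomially in the number of customers and the table sizes are heavy-tailed, whereas a zero discount recovers the Dirichlet-process case with logarithmic growth. This contrast is the generic mechanism behind sparsity and power-law claims for Pitman-Yor based models.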
