Sampling of Attributed Networks from Hierarchical Generative Models

Pablo Robles, Jennifer Neville, Sebastian Moreno

doi:10.1145/2939672.2939808

Abstract

Network sampling is a widely used procedure in social network analysis in which a random network is sampled from a generative network model (GNM). Recently proposed GNMs generate networks with more realistic structural characteristics than earlier ones, which facilitates tasks such as hypothesis testing and sensitivity analysis. However, sampling networks with correlated vertex attributes remains a challenging problem. While the recent work of \cite{Pfeiffer:14} provides a promising approach for attributed-network sampling, it was developed for relatively simple GNMs and does not work well with more complex hierarchical GNMs, which can more accurately model the range of characteristics and variation observed in real-world networks. In contrast to simple GNMs, where the probability mass is spread more evenly throughout the space of edges, hierarchical GNMs concentrate the mass in smaller regions of the space to reflect dependencies among edges in the network. This produces more realistic network characteristics, but also makes it more difficult to identify candidate networks in the sampling space. In this paper, we propose a novel sampling method, CSAG, to sample from hierarchical GNMs and generate networks with correlated attributes. CSAG constrains every step of the sampling process to consider the structure of the GNM, biasing the search toward regions of the space with higher likelihood. We implemented CSAG using mixed Kronecker Product Graph Models and evaluated our approach on three real-world datasets. The results show that CSAG jointly models the correlation and structure of the networks better than the state of the art. Specifically, CSAG maintains the variability of the underlying GNM while reducing attribute correlation error by a factor of at least 5.

Full Text