Abstract

Graph sampling is the process of drawing a small subset of nodes from a possibly huge graph in order to estimate properties of the whole graph by examining only the sample. Whereas topological properties can already be estimated accurately by sampling, current approaches do not take possibly hidden dependencies between node topology and node attributes into account. Especially in the context of online social networks, node attributes are important because they correspond to properties of the social network’s users. Existing sampling algorithms can be extended to sample attributes, but such extensions still fail to capture structural properties. Analyzing topological properties (e.g., node degree and clustering coefficient) and attribute properties (e.g., age and location) jointly can provide valuable insights into the social network and allows for a better understanding of social processes. As its major contribution, this work proposes a novel sampling algorithm that provides unbiased and reliable estimates of joint topological and attribute-based graph properties in a resource-efficient fashion. Furthermore, the obtained samples allow for the generation of synthetic graphs that show high similarity to the original graph with respect to both topology and attributes. The proposed sampling and generation algorithms are evaluated on real-world social network graphs, on which they prove effective.

