Abstract

A fundamental problem in network data science is distinguishing observed features that can be expected at random from those that go beyond such expectations. Configuration models play a crucial role here, allowing us to compare observations against degree-corrected null models. However, existing formulations have seen limited use in large-scale data analysis, either because they require expensive Monte Carlo simulations or because they lack the flexibility needed to model real-world systems. With the generalized hypergeometric ensemble, we address both problems. To achieve this, we map the configuration model to an urn problem in which edges are represented as balls in an appropriately constructed urn. In doing so, we obtain the generalized hypergeometric ensemble of random graphs: a random graph model that reproduces and extends the properties of standard configuration models, with the critical advantage of a closed-form probability distribution.

Highlights

  • A fundamental issue of network data science is the ability to discern observed features that can be expected at random from those beyond such expectations

  • Because we only deal with multi-graphs, in the rest of the article we will refer to multi-graphs as graphs, and to multi-edges as edges

  • We introduce the definition of the hypergeometric configuration model


Summary

Introduction

A fundamental issue of network data science is the ability to discern observed features that can be expected at random from those beyond such expectations. We propose an analytically tractable model for random graphs with given expected degree sequences and a fixed number of edges.

Definition 2 (Directed hypergeometric configuration model). Let k^in, k^out ∈ ℕ^n be in- and out-degree sequences and V a set of n vertices.
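The urn picture behind this kind of model can be illustrated with a short sketch. It is not taken from the text above, which does not spell out how the urn is built. The sketch assumes that each ordered vertex pair (i, j) contributes k^out_i · k^in_j balls to the urn, and that m balls are drawn without replacement, where m is the sum of the out-degrees. Under those assumptions, the edge counts follow a multivariate hypergeometric distribution, and degrees are preserved in expectation rather than exactly. The function name and parameters are hypothetical.

```python
import numpy as np


def sample_urn_multigraph(k_out, k_in, seed=None):
    """Draw one directed multigraph from an urn of candidate edges.

    Assumption (not stated in the source): the urn holds
    k_out[i] * k_in[j] balls for each ordered pair (i, j), and
    m = sum(k_out) balls are drawn without replacement.
    """
    k_out = np.asarray(k_out, dtype=np.int64)
    k_in = np.asarray(k_in, dtype=np.int64)
    if k_out.shape != k_in.shape:
        raise ValueError("degree sequences must have the same length")
    if k_out.sum() != k_in.sum():
        raise ValueError("in- and out-degrees must sum to the same value")

    m = int(k_out.sum())            # number of edges to draw
    xi = np.outer(k_out, k_in)      # balls available per ordered pair

    rng = np.random.default_rng(seed)
    # One draw without replacement of m balls from the urn;
    # each "color" corresponds to one ordered vertex pair.
    counts = rng.multivariate_hypergeometric(xi.ravel(), m)
    return counts.reshape(xi.shape)  # adjacency matrix of edge counts


if __name__ == "__main__":
    adjacency = sample_urn_multigraph([2, 1, 1], [1, 2, 1], seed=0)
    print(adjacency)
```

Because the draw is a single call to a multivariate hypergeometric sampler, no edge-rewiring Monte Carlo procedure is needed in this sketch, which mirrors the closed-form advantage described above.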

