Abstract

Exponential random graph models (ERGMs) are widely used for modeling social networks observed at one point in time. However, the computational difficulty of ERGM parameter estimation has limited the practical application of this class of models to relatively small networks: at most a few thousand nodes, and usually only a few hundred or fewer. In the case of undirected networks, snowball sampling can be used to estimate ERGM parameters of larger networks from network samples, and recently published improvements in ERGM network distribution sampling and ERGM estimation algorithms have allowed ERGM parameters to be estimated for undirected networks with over one hundred thousand nodes. However, the implementations of these algorithms to date have been limited in their scalability, and also restricted to undirected networks. Here we describe an implementation of the recently published Equilibrium Expectation (EE) algorithm for ERGM parameter estimation of large directed networks. We test it on some simulated networks, and demonstrate its application to an online social network with over 1.6 million nodes.

Highlights

  • Exponential random graph models (ERGMs) are a class of statistical model often used for modeling social networks [1, 2]

  • We describe an implementation of the EE algorithm, including the improved fixed density ERGM sampler [11], for application to directed networks

  • The implementation we describe allows ERGM parameter estimation for a model of a directed network with over one million nodes, while existing methods are only practical on networks of a few thousand nodes at most



Introduction

Exponential random graph models (ERGMs) are a class of statistical model often used for modeling social networks [1, 2]. Parameter estimation in these models is a computationally difficult problem, and algorithms based on Markov chain Monte Carlo (MCMC) are generally used [2,3,4,5,6,7,8,9,10]. It is worth noting that the state space for a directed network is far larger than for an undirected network with the same number of nodes [13], and so this problem is even more difficult in the case of directed networks. One solution to this problem is to take snowball samples [14,15,16,17,18] from the original network, and estimate ERGM parameters from these [19, 20].
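
To make the snowball sampling idea concrete, the following is a minimal sketch in Python, not the procedure used in the cited works. It assumes an undirected network stored as a dictionary mapping each node to a set of neighbours; the function names (snowball_sample, induced_subgraph), the choice of seed nodes, and the number of waves are all hypothetical choices made for illustration. Starting from the seeds, each wave adds every not-yet-sampled neighbour of the nodes added in the previous wave, and the sample is the subgraph induced by the nodes collected.

```python
def snowball_sample(adj, seeds, waves):
    """Return the set of nodes within `waves` waves of the seed nodes.

    adj   : dict mapping node -> set of neighbouring nodes (undirected)
    seeds : iterable of starting nodes (wave 0)
    waves : number of waves to follow outward from the seeds
    These names and this representation are illustrative assumptions.
    """
    sampled = set(seeds)
    frontier = set(seeds)
    for _ in range(waves):
        # Collect all neighbours of the current wave that are not yet sampled.
        next_frontier = set()
        for node in frontier:
            for neighbour in adj.get(node, ()):
                if neighbour not in sampled:
                    next_frontier.add(neighbour)
        sampled |= next_frontier
        frontier = next_frontier
    return sampled


def induced_subgraph(adj, nodes):
    """Keep only the sampled nodes and the ties among them."""
    return {u: {v for v in adj.get(u, ()) if v in nodes} for u in nodes}


if __name__ == "__main__":
    # Toy example: a path graph 0 - 1 - 2 - 3 - 4 - 5.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
    nodes = snowball_sample(path, seeds=[0], waves=2)
    print(sorted(nodes))                  # [0, 1, 2]
    print(induced_subgraph(path, nodes))  # ties among nodes 0, 1, 2 only
```

In this sketch the ERGM would then be fitted to the induced subgraph rather than to the full network, which is what makes the approach attractive when the full network is too large for direct estimation.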

