Abstract

The growing availability of network data and of scientific interest in distributed systems has led to the rapid development of statistical models of network structure. Typically, however, these are models for the entire network, while the data consists only of a sampled sub-network. Parameters for the whole network, which are what is of interest, are estimated by applying the model to the sub-network. This assumes that the model is consistent under sampling, or, in terms of the theory of stochastic processes, that it defines a projective family. Focusing on the popular class of exponential random graph models (ERGMs), we show that this apparently trivial condition is in fact violated by many popular and scientifically appealing models, and that satisfying it drastically limits the expressive power of ERGMs. These results are actually special cases of more general results about exponential families of dependent random variables, which we also prove. Using such results, we offer easily checked conditions for the consistency of maximum likelihood estimation in ERGMs, and discuss some possible constructive responses.

Highlights

  • In recent years, the rapid increase in both the availability of data on networks and the demand, from many scientific areas, for analyzing such data has resulted in a surge of generative and descriptive models for network data [20, 47].

  • In particular, we show that exponential families are projective if, and only if, their sufficient statistics decompose into separate additive contributions from disjoint observations, a property the paper formalizes as having separable increments.

  • In models where $T$ has two elements, the number of edges and the number of triangles or of 2-stars, the log partition function is known to scale like $n(n-1)$ as the number of nodes $n \to \infty$, at least in the parameter regimes where the models behave essentially like either very full or very empty Erdős–Rényi networks [9, 13, 14, 49, 50, 51]. (We suspect, from [14, 50, 66], that similar results apply to many other exponential random graph models (ERGMs).) By equation (18), if we fix a large number $n$ of nodes and generate a graph $X$ from $P_{\theta,n}$, the probability that the MLE $\hat{\theta}(X)$ will be more than $\varepsilon$ away from $\theta$ will be exponentially small in $n(n-1)$ and $\varepsilon^2$.
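The $n(n-1)$ scaling can be seen exactly in the simplest special case, which is not the two-statistic model discussed above but the edge-only ERGM (the Erdős–Rényi model in exponential-family form). There the log partition function is $\binom{n}{2}\log(1 + e^{\theta})$, which is proportional to $n(n-1)$, and the MLE is the logit of the observed edge density. The following is a minimal sketch under that simplifying assumption; the function names are illustrative and do not come from the paper.

```python
from itertools import product
from math import comb, exp, log


def log_partition_brute_force(n, theta):
    """Log partition function of the edge-only ERGM on n labelled nodes,
    computed by summing exp(theta * #edges) over all 2^C(n,2) graphs."""
    n_dyads = comb(n, 2)
    total = sum(exp(theta * sum(bits))
                for bits in product((0, 1), repeat=n_dyads))
    return log(total)


def log_partition_closed_form(n, theta):
    """Closed form for the same quantity: C(n,2) * log(1 + e^theta),
    which grows in proportion to n(n-1)."""
    return comb(n, 2) * log(1.0 + exp(theta))


def edge_mle(n, n_edges):
    """MLE of theta in the edge-only model: the logit of the edge density.
    Undefined for the empty or complete graph (density 0 or 1)."""
    density = n_edges / comb(n, 2)
    return log(density / (1.0 - density))


if __name__ == "__main__":
    theta = -0.7
    for n in (3, 4, 5):
        print(n,
              log_partition_brute_force(n, theta),
              log_partition_closed_form(n, theta))
```

Brute-force enumeration is only feasible for a handful of nodes; it is used here purely to check the closed form, not as a way to fit models.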

Summary

Introduction

The rapid increase in both the availability of data on networks (of all kinds, but especially social ones) and the demand, from many scientific areas, for analyzing such data has resulted in a surge of generative and descriptive models for network data [20, 47].

It is worth noting that the property of having separable increments is an intrinsic property of the family $\{P_{A,\Theta}\}_{A \in \mathcal{A}}$ that depends only on the functional forms of the sufficient statistics $\{t_A\}_{A \in \mathcal{A}}$ and not on the model parameters $\theta \in \Theta$. This follows from the fact that, for any $A$, the probability distributions $\{P_{A,\theta}\}_{\theta \in \Theta}$ have identical support $\mathcal{X}_A$. If an exponential family has independent increments, $T_{B \setminus A} \perp\!\!\!\perp T_A$, its joint volume factor separates, $v_{A, B \setminus A}(t, \delta) = v_A(t)\, v_{B \setminus A}(\delta)$, and the distribution of $T$ is projective. An exponential family of stochastic processes on such a space has projective parameters if, and only if, its sufficient statistics have separable increments, and so only if they have independent increments.

Assume the conditions of Theorem 1 hold, so that the parameters are projective and the sufficient statistics have (by Lemma 2) independent increments. Asymptotic scaling of the log partition function then implies that the MLE $\hat{\theta}$ is consistent.
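Projectivity can be checked numerically on very small graphs. The sketch below is an illustration rather than the paper's method: it enumerates every graph on 4 labelled nodes under an edge-triangle ERGM, marginalizes to the subgraph induced on 3 of the nodes, and compares the result with the 3-node ERGM at the same parameter values. The function names and the choice of 3 and 4 nodes are assumptions made for the example.

```python
from itertools import combinations, product
from math import exp


def ergm_distribution(n, theta_edge, theta_tri):
    """Exact edge-triangle ERGM on n labelled nodes, by brute-force
    enumeration. Returns a dict mapping each edge set to its probability."""
    dyads = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    weights = {}
    for bits in product((0, 1), repeat=len(dyads)):
        edges = frozenset(d for d, b in zip(dyads, bits) if b)
        # A triple is a triangle when all three of its dyads are edges.
        n_tri = sum(all(p in edges for p in combinations(t, 2))
                    for t in triples)
        weights[edges] = exp(theta_edge * len(edges) + theta_tri * n_tri)
    z = sum(weights.values())
    return {g: w / z for g, w in weights.items()}


def marginalize(dist, m):
    """Distribution of the subgraph induced on nodes 0..m-1."""
    out = {}
    for edges, p in dist.items():
        # Dyads are stored as (i, j) with i < j, so j < m keeps both ends.
        sub = frozenset(e for e in edges if e[1] < m)
        out[sub] = out.get(sub, 0.0) + p
    return out


def projectivity_gap(theta_edge, theta_tri, n_small=3, n_big=4):
    """Largest absolute difference between the small-graph ERGM and the
    distribution the big-graph ERGM induces on the same nodes."""
    small = ergm_distribution(n_small, theta_edge, theta_tri)
    induced = marginalize(
        ergm_distribution(n_big, theta_edge, theta_tri), n_small)
    return max(abs(small[g] - induced[g]) for g in small)


if __name__ == "__main__":
    print("edges only:      ", projectivity_gap(-0.5, 0.0))
    print("edges + triangles:", projectivity_gap(-0.5, 1.0))
```

With the triangle parameter set to zero the dyads are independent, the edge count has independent increments, and the gap vanishes up to floating-point error. With a non-zero triangle parameter the two distributions disagree, so the same parameter value does not describe both the sub-network and the whole network.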

