Abstract
Exponential-family random graph models are probabilistic network models that are parametrized by sufficient statistics based on structural (i.e., graph-theoretic) properties. The ergm package for the R statistical computing environment is a collection of tools for the analysis of network data within an exponential-family random graph model framework. Many different network properties can be employed as sufficient statistics for exponential-family random graph models by using the model terms defined in the ergm package; this functionality can be expanded by the creation of packages that code for additional network statistics. Here, our focus is on the addition of statistics based on graphlets. Graphlets are classes of small, connected, induced subgraphs that can be used to describe the topological structure of a network. We introduce an R package called ergm.graphlets that enables the use of graphlet properties of a network within the ergm package of R. The ergm.graphlets package provides a complete list of model terms that allow statistics of any 2-, 3-, 4-, and 5-node graphlet to be incorporated into exponential-family random graph models. The new model terms of the ergm.graphlets package enable both exponential-family random graph modeling of global structural properties and investigation of relationships between node attributes (i.e., covariates) and local topologies around nodes.
Highlights
A graph is a representation of a set of objects and the relations among them
The degree of a node can likewise be generalized into a 73-dimensional vector in which each coordinate represents the number of graphlets that the node touches at a particular orbit; this vector is called the graphlet degree vector (GDV), or the graphlet signature of the node (Milenkovic and Przulj 2008)
While inference for complex, highly dependent systems is difficult under the best of conditions, the generative nature of the exponential-family random graph model (ERGM) framework allows us to assess the adequacy of our models by comparison to features of the original data; given that we have identified a model that is both sensible and that successfully regenerates the important properties of the observed network, we have a stronger basis for subsequent investigation than would be obtained from simple rejection of a null hypothesis
Summary
A graph (or network) is a representation of a set of objects and the relations among them. Graphlets are small, connected, non-isomorphic induced subgraphs of a graph that have n ≥ 2 nodes (Przulj et al. 2004). One graphlet-based technique uses the number of occurrences of each graphlet in the graph: the 30-dimensional vector obtained by counting the occurrences of each graphlet of order ≤ 5 in the network is used to describe the topological structure of the graph (Przulj et al. 2004). The graphlet degree distribution is used in a similar manner to the degree distribution to evaluate the topological characteristics of the whole network. The degree of a node can likewise be generalized into a 73-dimensional vector (again for graphlets of order ≤ 5) in which each coordinate represents the number of graphlets that the node touches at a particular orbit; this vector is called the graphlet degree vector (GDV), or the graphlet signature of the node (Milenkovic and Przulj 2008). Various network models describe different rules for the formation of edges, e.g., Erdos-Renyi random graph models (also known as Bernoulli graphs)
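To make the two graphlet summaries above concrete, the following is a minimal sketch restricted to graphlets with 2 and 3 nodes: the edge (G0), the induced 3-node path (G1), and the triangle (G2), together with their four orbits (0: endpoint of an edge, 1: end of a 3-node path, 2: middle of a 3-node path, 3: member of a triangle). It is written in plain Python for illustration only and is not the ergm.graphlets implementation; the function name `graphlet_stats` and the integer node labelling are assumptions made for this example. Extending it to 4- and 5-node graphlets would give the full 30-dimensional count vector and 73-dimensional GDV, which is not attempted here.

```python
from itertools import combinations


def graphlet_stats(n_nodes, edges):
    """Count 2- and 3-node graphlets and per-node orbit degrees.

    Hypothetical helper for illustration. Nodes are assumed to be
    labelled 0..n_nodes-1 and the graph is treated as undirected
    and simple (self-loops and duplicate edges are ignored).
    """
    adj = {v: set() for v in range(n_nodes)}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    counts = {"G0": 0, "G1": 0, "G2": 0}
    # gdv[v][k] = number of graphlets that node v touches at orbit k
    gdv = {v: [0, 0, 0, 0] for v in range(n_nodes)}

    # G0 (edge): orbit 0 is the ordinary node degree.
    for u in adj:
        for v in adj[u]:
            if u < v:
                counts["G0"] += 1
                gdv[u][0] += 1
                gdv[v][0] += 1

    # 3-node graphlets: inspect every pair of neighbours of a centre node.
    for centre in adj:
        for a, b in combinations(sorted(adj[centre]), 2):
            if b in adj[a]:
                # Triangle: seen once from each of its three nodes,
                # so record it only when the centre has the smallest label.
                if centre < a:
                    counts["G2"] += 1
                    for w in (centre, a, b):
                        gdv[w][3] += 1
            else:
                # Induced path a - centre - b (no edge between a and b).
                counts["G1"] += 1
                gdv[centre][2] += 1
                gdv[a][1] += 1
                gdv[b][1] += 1

    return counts, gdv


# Example: a triangle on nodes 0, 1, 2 with a pendant node 3 attached to node 2.
counts, gdv = graphlet_stats(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
print(counts)
print(gdv)
```

In the example graph, node 2 is the middle of both induced paths (0-2-3 and 1-2-3), while node 3 is an end of both, so the two nodes have different signatures even though they are adjacent. This is the kind of local distinction that plain degree does not capture.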