Abstract
The stochastic block model (SBM) is a probabilistic model for community structure in networks. Typically, only the adjacency matrix is used to perform SBM parameter inference. In this paper, we consider circumstances in which nodes have an associated vector of continuous attributes that are also used to learn the node-to-community assignments and corresponding SBM parameters. Our model assumes that the attributes associated with the nodes in a given community of the network can be described by a common multivariate Gaussian model. In this augmented, attributed SBM, the objective is to simultaneously learn the SBM connectivity probabilities and the multivariate Gaussian parameters describing each community. While there are recent examples in the literature that combine connectivity and attribute information to inform community detection, our model is the first augmented stochastic block model to handle multiple continuous attributes. This provides the flexibility in biological data to, for example, augment connectivity information with continuous measurements from multiple experimental modalities. Because the lack of labeled network data often makes community detection results difficult to validate, we highlight the usefulness of our model for two network prediction tasks: link prediction and collaborative filtering. After fitting this attributed stochastic block model, one can predict the attribute vector or connectivity patterns for a new node given the complementary source of information (connectivity or attributes, respectively). We also highlight two biological examples where the attributed stochastic block model provides satisfactory performance in the link prediction and collaborative filtering tasks.
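To make the generative assumptions in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of drawing one network from an attributed SBM with Gaussian attributes. The function name `sample_attributed_sbm` and all parameter names are hypothetical. The mixing proportions `pi` over communities are an assumption added here so that labels can be sampled, and the graph is assumed undirected with no self-loops.

```python
import numpy as np

def sample_attributed_sbm(n_nodes, pi, theta, means, covs, seed=0):
    """Draw labels, edges and attributes from a Gaussian-attributed SBM.

    Illustrative sketch only: `pi` (community proportions) is an assumption,
    `theta` is the K x K connection-probability matrix, and `means[k]`,
    `covs[k]` parameterize the multivariate Gaussian of community k.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    means = np.asarray(means, dtype=float)
    n_comm = len(pi)

    # Community label for every node.
    z = rng.choice(n_comm, size=n_nodes, p=pi)

    # Pairwise edge probabilities: entry (i, j) is theta[z_i, z_j].
    probs = theta[z][:, z]

    # Sample the strict upper triangle, then mirror it (undirected, no loops).
    upper = np.triu(rng.random((n_nodes, n_nodes)) < probs, k=1)
    adj = (upper | upper.T).astype(int)

    # Attributes: nodes in community k share one multivariate Gaussian.
    attrs = np.empty((n_nodes, means.shape[1]))
    for k in range(n_comm):
        members = np.flatnonzero(z == k)
        if members.size:
            attrs[members] = rng.multivariate_normal(
                means[k], covs[k], size=members.size
            )
    return z, adj, attrs
```

For example, `sample_attributed_sbm(50, [0.5, 0.5], [[0.8, 0.05], [0.05, 0.8]], [[0, 0], [3, 3]], [np.eye(2), np.eye(2)])` returns a label vector, a symmetric 50 × 50 adjacency matrix, and a 50 × 2 attribute matrix. Inference runs in the opposite direction: only the adjacency matrix and attributes are observed, and the labels and parameters are learned.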
Highlights
Uncovering patterns in network data is a common pursuit across a range of fields, such as in biology (Larremore et al 2013), medicine (Aghaeepour et al 2017; Guinney et al 2015) and computational social science (Greene and Cunningham 2013)
Our objectives are three-fold: first, we develop a probabilistic approach to jointly model connectivity and attributes; second, we wish to ensure that our model can handle multiple, continuous attributes; third, we demonstrate the utility of the fitted model for link prediction and collaborative filtering applications
Our model extends previous work on attributed stochastic block models in its ability to handle multiple continuous attributes
Summary
Uncovering patterns in network data is a common pursuit across a range of fields, such as in biology (Larremore et al 2013), medicine (Aghaeepour et al 2017; Guinney et al 2015) and computational social science (Greene and Cunningham 2013). Hric et al (2016) developed an attributed SBM from a multilayer network perspective, with one layer modeling relational information between attributes and the other modeling connectivity, assigning nodes to the communities that maximize the likelihood of the observed data in each layer. Since community detection methods are often difficult to validate due to the lack of ground truth information on the nodes, we describe the tasks of link prediction and collaborative filtering to quantify how well the attributed SBM represents the data. We consider these tasks on two biological network examples. To fit an SBM to network data, the objective is to partition the nodes into communities such that these assignments maximize the likelihood of the model according to the observed edges. In this inference problem for a network with N nodes and K communities, one learns a K × K probability matrix, θ, which describes the probability of connections within and between communities, and an N-length vector of node-to-community assignments, z. Effective inference techniques for standard stochastic block model parameters are well explored (Zhang et al 2012; Peixoto 2014; Daudin et al 2008), including EM, belief propagation, and MCMC accept-reject sampling algorithms.
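The quantity that these inference techniques maximize can be sketched as follows. This is a minimal illustration of the Bernoulli SBM log-likelihood for a given assignment vector z and probability matrix θ, assuming an undirected network without self-loops. The function name `sbm_log_likelihood` and the clipping constant `eps` are hypothetical choices for this sketch, and the code covers only the connectivity term of a standard SBM, not the attribute term of the attributed model.

```python
import numpy as np

def sbm_log_likelihood(adj, z, theta, eps=1e-12):
    """Bernoulli SBM log-likelihood of an undirected, loop-free graph.

    adj   : (N, N) symmetric 0/1 adjacency matrix
    z     : length-N vector of community labels in {0, ..., K-1}
    theta : (K, K) matrix of connection probabilities
    eps   : clipping constant (an assumption) to avoid log(0)
    """
    adj = np.asarray(adj)
    z = np.asarray(z)
    theta = np.asarray(theta, dtype=float)

    # Entry (i, j) is theta[z_i, z_j], clipped away from 0 and 1.
    probs = np.clip(theta[z][:, z], eps, 1 - eps)

    # Count each unordered node pair exactly once.
    iu = np.triu_indices(adj.shape[0], k=1)
    a, p = adj[iu], probs[iu]
    return float(np.sum(a * np.log(p) + (1 - a) * np.log1p(-p)))
```

An assignment that groups densely connected nodes together scores higher under an assortative θ than one that splits them, which is the signal EM, belief propagation, or MCMC exploit when searching over z and θ.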