Abstract

The creation of social ties is largely determined by the entangled effects of people’s similarities in terms of individual characteristics and friends. However, the feature and structural characteristics of people are usually correlated, making it difficult to determine which plays the greater role in the formation of the emergent network structure. We propose AN2VEC, a node embedding method which ultimately aims at disentangling the information shared by the structure of a network and the features of its nodes. Building on recent developments in Graph Convolutional Networks (GCN), we develop a multitask GCN Variational Autoencoder in which different dimensions of the generated embeddings can be dedicated to encoding feature information, network structure, and shared feature-network information. We explore the interaction between these disentangled characteristics by comparing the embedding reconstruction performance to a baseline case where no shared information is extracted. We use synthetic datasets with different levels of interdependency between feature and network characteristics and show (i) that shallow embeddings relying on shared information perform better than the corresponding reference with unshared information, (ii) that this performance gap increases with the correlation between network and feature structure, and (iii) that our embedding is able to capture joint information of structure and features. Our method can be relevant for the analysis and prediction of any featured network structure, ranging from online social systems to network medicine.

Highlights

  • The creation of social ties is largely determined by the entangled effects of people’s similarities in terms of individual characteristics and friends

  • To help disentangle these effects, we propose a joint feature-network embedding built on multitask Graph Convolutional Networks (GCN) (Kipf and Welling 2016a; Bruna et al 2013; Hamilton et al 2017; Ying et al 2018) and Variational Autoencoders (VAE) (Kingma and Welling 2013; Rezende et al 2014; Kipf and Welling 2016b; Wang et al 2016; Zhu et al 2018), which we call the Attributed Node to Vector method (AN2VEC)

  • This procedure keeps the overall count of each colour constant, and lets us control the correlation between the graph structure and node features by moving α from 0 to 1
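
The last highlight does not spell out the procedure itself, so the following is only a minimal sketch of one way such a control could work, not the authors' implementation. It assumes that each node starts with a colour equal to its community label and that a random fraction (1 - α) of nodes then have their colours permuted among themselves. The function name shuffle_colours, the direction in which α acts, and the toy community sizes are all assumptions made for illustration.

```python
import numpy as np


def shuffle_colours(colours, alpha, seed=0):
    """Permute the colours of a random fraction (1 - alpha) of nodes.

    Assumption (not from the source): alpha = 1 leaves every colour tied to
    its community, alpha = 0 permutes the colours of all nodes.
    """
    rng = np.random.default_rng(seed)
    colours = np.asarray(colours).copy()
    n = colours.shape[0]
    n_shuffled = int(round((1.0 - alpha) * n))
    # Pick the nodes whose colours will be exchanged among themselves.
    picked = rng.choice(n, size=n_shuffled, replace=False)
    # A permutation only moves colours around, so per-colour counts are kept.
    colours[picked] = rng.permutation(colours[picked])
    return colours


# Hypothetical toy setup: 4 communities of 25 nodes, one colour per community.
communities = np.repeat(np.arange(4), 25)
for alpha in (0.0, 0.5, 1.0):
    coloured = shuffle_colours(communities, alpha)
    agreement = float(np.mean(coloured == communities))
    print(alpha, np.bincount(coloured, minlength=4), round(agreement, 2))
```

Because colours are only exchanged between nodes, the count of each colour is unchanged for every value of α, while the agreement between colour and community varies with α under the stated assumption.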


Summary

Introduction

The creation of social ties is largely determined by the entangled effects of people’s similarities in terms of individual characteristics and friends. Such features generate homophilic tie creation preferences (McPherson et al 2001; Kossinets and Watts 2009), which induce links with higher probability between similar individuals, who in turn form feature communities of shared interest, age, gender, socio-economic status, and so on (Leo et al 2016; Shrum et al 1988). Though these mechanisms are not independent and lead to correlations between feature and network communities, it is difficult to define the causal relationship between the two: first, because simultaneously characterising similarities between multiple features and a complex network structure is not an easy task; second, because it is difficult to determine which of the two types of information, features or structure, is driving network formation to a greater extent. We can identify an optimal reduced embedding, which indicates whether combined information coming from the structure and features is important, or whether their non-interacting combination is sufficient for reconstructing the featured network.

