Abstract

Networks are real systems modelled through mathematical objects made up of nodes and links arranged into peculiar and deliberate (or partially deliberate) topologies. Studying these real-world topologies reveals several properties of interest. In real networks, nodes are also identified by a certain number of non-structural features, or metadata. Given that massive quantities of such metadata can now be collected, it becomes crucial to identify automatically which are the most relevant for the observed structure. We propose a new method that, independently of the network size, not only reports the relevance of binary node metadata but also ranks them. The method can be applied to networks from any domain, and we apply it in two heterogeneous cases: a temporal network of technology transfer and a protein-protein interaction network. Together with the relevance of node metadata, we investigate their redundancy, displaying the results on a Redundancy-Relevance diagram, which highlights the differences among vectors of metadata from both a structural and a non-structural point of view. The results provide practical insights into the importance of the observed node metadata for the actual network structure.

Highlights

  • Networks are used to model interactions across a number of different fields, including social sciences, biology, information technology and engineering

  • When we have several node metadata referring to the nodes of a single network, we should take into account two aspects: i) the comparison of a certain configuration with the related degeneracy area and boundary of the phase diagram may be unfeasible due to computational issues

  • When we aim to evaluate the relevance of a certain set of metadata, we should take into account these two aspects together with the following consideration: the H–D space is asymmetrical with a unique pivotal point represented by H = D = 1 and each of its four internal regions has a different size and meaning, as explained in the previous Section


Summary

Introduction

Networks are used to model interactions across a number of different fields, including social sciences, biology, information technology and engineering. The relationship between certain binary node metadata and the network topology was initially investigated by examining the correlation of the considered binary features across the network edges via the assortativity coefficient[3]. This coefficient does not take into account the microscopic nature of the interactions and is preferred in the case of multiple discrete node characteristics or scalar characteristics (such as the node degrees). In the case of binary node metadata, a more detailed approach can be pursued, especially considering that the different link types (called dyads) can be represented in a two-dimensional space. Such an approach has already been taken for undirected networks in terms of the dyadic effect[4]. Through the observation of the dyadic effect, two measures, called dyadicity D and heterophilicity H, separately denote homogeneous and heterogeneous assortment with respect to a certain binary metadata and measure the degree to which such node metadata correlate with the structure of the network.
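
To make the two measures concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the usual formulation of the dyadic effect for an undirected simple graph, which this excerpt does not spell out: D is the observed number of links between two nodes that both carry the feature, divided by the number expected if the feature were placed at random, and H is the same ratio for links joining a node with the feature to a node without it. The function name `dyadic_effect` and its arguments are hypothetical and chosen for illustration only.

```python
def dyadic_effect(edges, labels):
    """Return (D, H) for one binary node feature on an undirected simple graph.

    edges  : iterable of (u, v) node pairs (hypothetical input format)
    labels : dict mapping every node to 0 or 1 (the binary metadata)
    """
    # Deduplicate undirected edges and drop self-loops.
    edge_set = {frozenset((u, v)) for u, v in edges if u != v}

    n = len(labels)                  # total number of nodes
    n1 = sum(labels.values())        # nodes carrying the feature (label 1)
    n0 = n - n1                      # nodes without the feature (label 0)
    m = len(edge_set)                # total number of links

    # Observed dyad counts: 1-1 links and 1-0 links.
    m11 = sum(1 for e in edge_set if all(labels[x] == 1 for x in e))
    m10 = sum(1 for e in edge_set if sum(labels[x] for x in e) == 1)

    # Link density, i.e. the probability that a random node pair is linked.
    density = 2 * m / (n * (n - 1))

    # Expected dyad counts under random placement of the feature.
    expected_m11 = density * n1 * (n1 - 1) / 2
    expected_m10 = density * n1 * n0

    # The ratios are undefined when the expected count is zero.
    d = m11 / expected_m11 if expected_m11 > 0 else float("nan")
    h = m10 / expected_m10 if expected_m10 > 0 else float("nan")
    return d, h


# Toy example: two labelled nodes linked to each other, two unlabelled
# nodes linked to each other, and no links across the two groups.
toy_labels = {"a": 1, "b": 1, "c": 0, "d": 0}
toy_edges = [("a", "b"), ("c", "d")]
D, H = dyadic_effect(toy_edges, toy_labels)
```

In the toy example, every link joins nodes with the same label, so D exceeds 1 and H is 0. Under this formulation, values close to D = H = 1 indicate that the feature is distributed across the links roughly as random placement would predict.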

