Abstract

Graph distance and graph embedding are two fundamental tasks in graph mining. For graph distance, determining the structural dissimilarity between networks is an ill-defined problem, as there is no canonical way to compare two networks. Indeed, many of the existing approaches for network comparison differ in their heuristics, efficiency, interpretability, and theoretical soundness. Thus, a notion of distance that is built on theoretically robust first principles and that is interpretable with respect to features ubiquitous in complex networks would allow for a meaningful comparison between different networks. For graph embedding, many of the popular methods are stochastic and depend on black-box models such as deep networks. Despite their high performance, this makes their results difficult to analyze, which hinders their usefulness in the development of a coherent theory of complex networks. Here we rely on the theory of the length spectrum function from algebraic topology, and its relationship to the non-backtracking cycles of a graph, to introduce two new techniques: Non-Backtracking Spectral Distance (NBD) for measuring the distance between undirected, unweighted graphs, and Non-Backtracking Embedding Dimensions (NBED) for finding a graph embedding in low-dimensional space. Both techniques are interpretable in terms of features of complex networks such as the presence of hubs, triangles, and communities. We showcase the ability of NBD to discriminate between networks in both real and synthetic data sets, as well as the potential of NBED to perform anomaly detection. By taking a topological interpretation of non-backtracking cycles, this work presents a novel application of topological data analysis to the study of complex networks.

Highlights

  • As the network science literature continues to expand and scientists compile more examples of real-life networked data sets coming from an ever-growing range of domains (Clauset et al.; Kunegis 2013), there is a need to develop methods to compare complex networks, both within and across domains.

  • We have focused on the problem of deriving a notion of graph distance for complex networks based on the length spectrum function.

  • We add to the repertoire of distance methods the Non-Backtracking Spectral Distance (NBD): a principled, interpretable, computationally efficient, and effective technique that takes advantage of the fact that one can interpret the non-backtracking cycles of a graph as its free homotopy classes.


Summary

Introduction

As the network science literature continues to expand and scientists compile more examples of real-life networked data sets coming from an ever-growing range of domains (Clauset et al.; Kunegis 2013), there is a need to develop methods to compare complex networks, both within and across domains. We present an efficient algorithm to compute the non-backtracking matrix, and discuss the data visualization capabilities of its complex eigenvalues (see Fig. 1) and eigenvectors (Fig. 12). The “Operationalizing the length spectrum” section explains the connection between these objects and discusses the properties of the non-backtracking matrix that make it relevant for the study of complex networks. The “NBD: Non-backtracking distance” section presents our distance method NBD and provides experimental evidence of its performance by comparing it to other distance techniques. In the “NBED: Non-backtracking embedding dimensions” section we discuss our embedding method NBED, which is based on the eigenvectors of the non-backtracking matrix, provide extensive visual analysis of the resulting edge embeddings, and note its shortcomings and necessary future lines of study. We conclude in the “Discussion and conclusions” section with a discussion of limitations and future work.
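
To make the non-backtracking matrix mentioned above concrete, the following minimal sketch builds it directly from its standard definition for a small undirected, unweighted graph and computes its complex eigenvalues. This is an illustrative, quadratic-time construction, not the efficient algorithm the paper refers to; the function name `non_backtracking_matrix` and the triangle example are assumptions made for illustration.

```python
import numpy as np


def non_backtracking_matrix(edges):
    """Build the non-backtracking matrix of a simple undirected graph.

    Hypothetical helper for illustration only. Rows and columns are indexed
    by directed edges: each undirected edge {u, v} yields (u, v) and (v, u).
    Entry [(u, v), (x, y)] is 1 when v == x and y != u, i.e. the walk
    continues from v without immediately returning to u.
    """
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))

    size = len(directed)
    B = np.zeros((size, size), dtype=int)
    # Naive O(m^2) double loop over directed edges; fine for small graphs.
    for i, (u, v) in enumerate(directed):
        for j, (x, y) in enumerate(directed):
            if v == x and y != u:
                B[i, j] = 1
    return B, directed


# Example graph: a triangle on nodes 0, 1, 2.
triangle = [(0, 1), (1, 2), (2, 0)]
B, directed = non_backtracking_matrix(triangle)

# B is not symmetric in general, so its eigenvalues can be complex.
eigenvalues = np.linalg.eigvals(B)

# The trace of B^k counts closed non-backtracking walks of length k.
closed_walks_len3 = int(np.trace(np.linalg.matrix_power(B, 3)))
```

For the triangle, every directed edge has exactly one non-backtracking continuation, so `B` is a permutation matrix and all of its eigenvalues lie on the unit circle. The trace computation illustrates why the spectrum of this matrix carries information about non-backtracking cycles.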

