An empirical comparison of connectivity-based distances on a graph and their computational scalability

Pierre Miasnikof,Leonidas Pitsoulis,Alexander Ponomarenko,Alexander Y Shestopaloff

doi:10.1093/comnet/cnac003

Abstract

Abstract In this study, we compare distance measures with respect to their ability to capture vertex community structure and the scalability of their computation. Our goal is to find a distance measure which can be used in an aggregate pairwise minimization clustering scheme. The minimization should lead to subsets of vertices with high induced subgraph density. Our definition of distance is rooted in the notion that vertices sharing more connections are closer to each other than vertices which share fewer connections. This definition differs from that of the geodesic distance typically used in graphs. It is based on neighbourhood overlap, not shortest path. We compare four distance measures from the literature and evaluate their accuracy in reflecting intra-cluster density, when aggregated (averaged) at the cluster level. Our tests are conducted on synthetic graphs, where clusters and intra-cluster densities are known in advance. We find that amplified commute, Otsuka–Ochiai and Jaccard distances display a consistent inverse relation to intra-cluster density. We also conclude that the computation of amplified commute distance does not scale as well to large graphs as that of the other two distances.

Full Text