Abstract

We present KADABRA, a new algorithm to approximate betweenness centrality in directed and undirected graphs, which significantly outperforms all previous approaches on real-world complex networks. The efficiency of the new algorithm relies on two new theoretical contributions, of independent interest. The first contribution focuses on sampling shortest paths, a subroutine used by most algorithms that approximate betweenness centrality. We show that, on realistic random graph models, we can perform this task in time | E | 1/2+ o (1) with high probability, obtaining a significant speedup with respect to the Θ(| E |) worst-case performance. We experimentally show that this new technique achieves similar speedups on real-world complex networks, as well. The second contribution is a new rigorous application of the adaptive sampling technique. This approach decreases the total number of shortest paths that need to be sampled to compute all betweenness centralities with a given absolute error, and it also handles more general problems, such as computing the k most central nodes. Furthermore, our analysis is general, and it might be extended to other settings.

Highlights

  • In this work we focus on estimating the betweenness centrality, which is one of the most famous measures of centrality for nodes and edges of real-world complex networks [19, 30]

  • We propose a new and faster algorithm to approximate betweenness centrality in directed and undirected graphs, named KADABRA

  • To speed up the estimation of the betweenness centrality, previous work make use of the technique of adaptive sampling, which consists in testing during the execution of the algorithm whether some condition on the sample obtained so far has been met, and terminating the execution of the algorithm as soon as this happens

Read more

Summary

Introduction

In this work we focus on estimating the betweenness centrality, which is one of the most famous measures of centrality for nodes and edges of real-world complex networks [19, 30]. To speed up the estimation of the betweenness centrality, previous work make use of the technique of adaptive sampling, which consists in testing during the execution of the algorithm whether some condition on the sample obtained so far has been met, and terminating the execution of the algorithm as soon as this happens This technique introduces a subtle stochastic dependence between the time in which the algorithm terminates and the correctness of the given output, which previous papers claiming a formal analysis of the technique did not realize (see Section 3 for details). With an argument based on martingale theory, we provide a general analysis of such useful technique Through this result, we do improve previous estimators, but we make it possible to define more general stopping conditions, that can be decided “on the fly”: this way, with little modifications, we can adapt our algorithm to perform more general tasks than previous ones. Our algorithm can be adapted to compute the centrality of edges, to handle weighted graphs and, since its core part consists merely in sampling paths, we conjecture that it may be coupled with the existing techniques in [8] to handle dynamic graphs

Related Work
Algorithm Overview
Proof of Theorem 1
Adaptive Sampling
Balanced Bidirectional BFS
Analysis on Random Graph
Computing the k Most Central Nodes
Experimental Results

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.