Abstract
Breadth-First Search (BFS) is an important kernel used by many graph-processing applications. In many emerging applications of BFS, such as analyzing social networks, the input graphs are low-diameter and scale-free. We propose a hybrid approach, advantageous for low-diameter graphs, that combines a conventional top-down algorithm with a novel bottom-up algorithm. The bottom-up algorithm can dramatically reduce the number of edges examined, which in turn accelerates the search as a whole. On a multi-socket server, our hybrid approach demonstrates speedups of 3.3–7.8 on a range of standard synthetic graphs and speedups of 2.4–4.6 on graphs from real social networks when compared to a strong baseline. We also typically double the performance of prior leading shared-memory (multicore and GPU) implementations.
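The hybrid idea can be illustrated with a minimal serial sketch. It is not the implementation evaluated here. It assumes an undirected graph stored as adjacency lists, and the names `hybrid_bfs`, `top_down_step`, `bottom_up_step`, and the `threshold` parameter are hypothetical. In particular, the rule for switching directions (frontier larger than a fixed fraction of the vertices) is a placeholder assumption chosen for brevity, not the heuristic used in the paper.

```python
from typing import List


def top_down_step(adj: List[List[int]], parent: List[int], frontier: List[int]) -> List[int]:
    """Conventional step: each frontier vertex scans all of its neighbors."""
    next_frontier = []
    for u in frontier:
        for v in adj[u]:
            if parent[v] == -1:  # v has not been visited yet
                parent[v] = u
                next_frontier.append(v)
    return next_frontier


def bottom_up_step(adj: List[List[int]], parent: List[int], frontier: List[int]) -> List[int]:
    """Bottom-up step: each unvisited vertex looks for any parent in the frontier."""
    in_frontier = [False] * len(adj)
    for u in frontier:
        in_frontier[u] = True
    next_frontier = []
    for v in range(len(adj)):
        if parent[v] != -1:
            continue
        for u in adj[v]:
            if in_frontier[u]:
                parent[v] = u
                next_frontier.append(v)
                break  # stop at the first parent found; remaining edges are skipped
    return next_frontier


def hybrid_bfs(adj: List[List[int]], source: int, threshold: float = 0.05) -> List[int]:
    """Return a BFS parent array; unreached vertices keep parent -1.

    `threshold` is an assumed, illustrative switching rule: use the
    bottom-up step whenever the frontier holds more than this fraction
    of all vertices.
    """
    n = len(adj)
    parent = [-1] * n
    parent[source] = source
    frontier = [source]
    while frontier:
        if len(frontier) > threshold * n:
            frontier = bottom_up_step(adj, parent, frontier)
        else:
            frontier = top_down_step(adj, parent, frontier)
    return parent
```

The early `break` in the bottom-up step is where edge examinations are saved: once an unvisited vertex finds one neighbor in the frontier, its remaining edges are never inspected. This pays off when the frontier is large, which is typical of the middle levels of a search on a low-diameter graph.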
Highlights
Graph algorithms are becoming increasingly important, with applications covering a wide range of scales
Breadth-First Search (BFS), an important building block in many other graph algorithms, has low computational intensity, which exacerbates the lack of locality and results in low overall performance
Breadth-First Search (BFS) is an important building block of many graph algorithms, and it is commonly used to test for connectivity or compute the single-source shortest paths of unweighted graphs
Summary
Graph algorithms are becoming increasingly important, with applications covering a wide range of scales. Breadth-First Search (BFS), an important building block in many other graph algorithms, has low computational intensity, which exacerbates the lack of locality and results in low overall performance. To accelerate BFS, there has been significant prior work to change the algorithm and data structures, in some cases by adding computational work, to increase locality and boost overall performance [1, 9, 16, 27]. None of these previous schemes attempts to reduce the number of edges examined. An early version of this algorithm [5] running on a stock quad-socket Intel server was ranked 17th in the Graph500 November 2011 rankings [15], achieving the fastest single-node implementation and the highest per-core processing rate, and outperforming specialized architectures and clusters with more than 150 sockets.