D-SAG: A Distributed Sort-Based Algorithm for Graph Clustering

Yaman A Saeed, Sherenaz W Al-Haj Baddar, Raed A Ismail

doi:10.1007/s13369-021-05664-x

Abstract

Graph clustering has become a mainstream branch of computing because it is needed to solve a wide range of problems. Thus, harnessing the capabilities of parallel and distributed computing has become instrumental. In this work, we introduce SAG, a quasilinear sort-based algorithm for graph clustering that maps naturally to distributed and/or parallel architectures. The main idea behind SAG is that nodes within a cluster naturally have similar adjacent nodes. Experiments on graphs of varying sizes compared SAG to its distributed counterpart, D-SAG, in terms of execution time, space, speedup, efficiency, and cost. Results showed that D-SAG ran faster than SAG on graphs with more than 0.2 × 10⁶ nodes. Moreover, the best speedup D-SAG achieved was 3.7-fold for synthetic graphs and 3.96-fold for real-world graphs, both using 6 computers.
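
To put the reported speedups in context, the sketch below computes speedup, efficiency, and cost using the usual textbook definitions (speedup = serial time / parallel time, efficiency = speedup / number of machines, cost = number of machines × parallel time). These definitions and the function names are assumptions made for illustration; the abstract does not state which formulas the authors used.

```python
# Textbook parallel-performance metrics.
# Assumption: these are the standard definitions, not formulas quoted from the paper.

def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of single-machine run time to distributed run time."""
    return t_serial / t_parallel

def efficiency(s: float, p: int) -> float:
    """Speedup per machine; 1.0 would mean perfectly linear scaling."""
    return s / p

def cost(t_parallel: float, p: int) -> float:
    """Total machine time consumed: machines multiplied by parallel run time."""
    return p * t_parallel

# Best speedups reported in the abstract, both obtained on 6 computers.
for label, s in [("synthetic", 3.7), ("real-world", 3.96)]:
    print(f"{label}: efficiency = {efficiency(s, 6):.2f}")
```

Under these assumed definitions, the reported speedups correspond to efficiencies of roughly 0.62 (synthetic) and 0.66 (real-world) on 6 computers.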

Full Text