Maximal cliques are elementary substructures in a graph and instrumental in graph analysis such as the structural analysis of many complex networks, graph clustering and community detection, network hierarchy detection, emerging pattern mining, vertex importance measures, etc. However, the number of maximal cliques is also notoriously large even for many small real world graphs. This size problem gives rise to challenges in both computing and managing the set of maximal cliques. Many algorithms for computing maximal cliques have been proposed in the literature; however, most of them are sequential algorithms that cannot scale due to the high complexity of the problem, while existing parallel algorithms for computing maximal cliques are mostly immature and especially suffer from skewed workload. As for managing the set of maximal cliques, which is essential due to its large size, there is barely any efficient method for querying or updating the set of maximal cliques. In this paper, we first propose a distributed algorithm built on a share-nothing architecture for computing the set of maximal cliques. We effectively address the problem of skewed workload distribution due to high-degree vertices, which also leads to drastically reduced worst-case time complexity for computing maximal cliques in common real-world graphs. Then, we propose a set of fundamental query operations and efficient algorithms to process the queries, to aid more efficient and effective analysis of the set of maximal cliques. Finally, we also devise algorithms to support efficient update maintenance of the set of maximal cliques when the underlying graph is updated. We verify the efficiency of our algorithms for computing, querying, and updating the set of maximal cliques with a range of real-world graphs from different application domains.
Read full abstract