Efficient computation of the transitive closure size

Xian Tang, Xiang Liu, Ziyang Chen, Kai Li

doi:10.1007/s10586-018-2278-9

Abstract

Given a directed graph and a node u, the transitive closure (TC) of u, denoted TC(u), is the set of nodes that u can reach in the graph, and the size of u’s transitive closure is the number of nodes in TC(u). Because computing the size of the TC (TC-size computation or detection) is important in many applications, and existing approaches to TC-size computation cannot scale to the large graphs that appear in recent applications, we propose a path-decomposition-based algorithm, namely buTC, for TC-size computation with linear space complexity. buTC first obtains a set of paths by path decomposition, then processes the nodes in each path in a bottom-up manner. The benefit of buTC is that it can exploit the reachability relationship between nodes in a path to avoid repeatedly visiting many nodes, which guarantees that, after all nodes in a single path have been processed, every involved edge has been visited only once. Because buTC processes each path independently, we further propose an optimized algorithm, namely buTC+, to avoid the redundant operations of buTC. buTC+ does not need path decomposition; it uses a stack to buffer processed nodes that have unprocessed in-neighbors, so that it can exploit the reachability relationship between nodes in different paths of buTC to avoid redundant computation. We conduct extensive experiments on 26 real datasets and 5 synthetic datasets. The experimental results confirm that both buTC and buTC+ have better space and time scalability and can scale to large graphs.
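To make the TC-size definition above concrete, the following is a minimal Python sketch of the naive baseline: one breadth-first search per node. It is not buTC or buTC+, whose details the abstract does not give; it only illustrates the quantity being computed and why per-node traversal revisits the same edges many times. The function name `tc_size`, the adjacency-dictionary representation, the example graph, and the choice to exclude u itself from TC(u) are all assumptions made for illustration.

```python
from collections import deque


def tc_size(graph, u):
    """Return the number of nodes reachable from u, found by BFS.

    Assumption: u itself is not counted as a member of TC(u).
    `graph` maps each node to a list of its out-neighbors.
    """
    seen = {u}
    queue = deque([u])
    while queue:
        v = queue.popleft()
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) - 1  # drop u itself


# Hypothetical example graph: e -> a -> {b, c} -> d
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}

# One independent traversal per node: edges such as b -> d are
# re-visited when processing a and again when processing e.
sizes = {u: tc_size(graph, u) for u in graph}
print(sizes)
```

Each call traverses the reachable subgraph from scratch, so the total cost grows with the number of nodes times the number of edges. This repeated visiting is the kind of redundancy that, according to the abstract, buTC and buTC+ aim to avoid.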

Full Text