Triangle counting is an important problem in graph mining. Two frequently used metrics in complex network analysis that require the count of triangles are the clustering coefficients and the transitivity ratio of the graph. Triangles have been used successfully in several real-world applications, such as detection of spamming activity, uncovering the hidden thematic structure of the web and link recommendation in online social networks. Furthermore, the count of triangles is a frequently used network statistic in exponential random graph models. However, counting the number of triangles in a graph is computationally expensive. In this paper, we propose the EigenTriangle and EigenTriangleLocal algorithms to estimate the number of triangles in a graph. The efficiency of our algorithms is based on the special spectral properties of real-world networks, which allow us to approximate accurately the number of triangles. We verify the efficacy of our method experimentally in almost 160 experiments using several Web Graphs, social, co-authorship, information, and Internet networks where we obtain significant speedups with respect to a straightforward triangle counting algorithm. Furthermore, we propose an algorithm based on Fast SVD which allows us to apply the core idea of the EigenTriangle algorithm on graphs which do not fit in the main memory. The main idea is a simple node-sampling process according to which node i is selected with probability $${\frac{d_i}{2m}}$$ where d i is the degree of node i and m is the total number of edges in the graph. Our theoretical contributions also include a theorem that gives a closed formula for the number of triangles in Kronecker graphs, a model of networks which mimics several properties of real-world networks.
Read full abstract