Motif Counting Beyond Five Nodes

Marco Bressan, Flavio Chierichetti, Ravi Kumar, Stefano Leucci, Alessandro Panconesi

doi:10.1145/3186586

Abstract

Counting graphlets is a well-studied problem in graph mining and social network analysis. Recently, several papers explored very simple and natural algorithms based on Monte Carlo sampling of Markov chains (MC), and reported encouraging results. We show, perhaps surprisingly, that such algorithms are outperformed by color coding (CC) [2], a sophisticated algorithmic technique that we extend to the case of graphlet sampling and for which we prove strong statistical guarantees. Our computational experiments on graphs with millions of nodes show CC to be more accurate than MC; furthermore, we formally show that the mixing time of the MC approach is too high in general, even when the input graph has high conductance. All this comes at a price, however. While MC is very efficient in terms of space, CC's memory requirements become demanding when the size of the input graph and that of the graphlets grow. And yet, our experiments show that CC can push the limits of the state of the art, both in terms of the size of the input graph and of that of the graphlets.
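To give a concrete feel for color coding, here is a minimal sketch of the classic technique applied to counting simple paths on k nodes. It is not the authors' graphlet-sampling extension. Each node receives one of k random colors, a dynamic program counts the "colorful" paths (all k colors distinct), and the count is rescaled by the probability k!/k^k that a fixed path is colorful. The function names, the adjacency-dict input format, and the restriction to paths with k >= 2 are assumptions made for illustration.

```python
import random
from math import factorial

def colorful_path_count(adj, colors, k):
    """Count undirected simple paths on k >= 2 nodes whose colors are all distinct.

    adj:    dict mapping node -> set of neighbors (undirected graph)
    colors: dict mapping node -> color in range(k)
    """
    # table[v] maps a color bitmask to the number of colorful paths
    # that end at v and use exactly the colors in the mask.
    table = {v: {1 << colors[v]: 1} for v in adj}
    for _ in range(k - 1):
        new = {v: {} for v in adj}
        for v in adj:
            bit = 1 << colors[v]
            for u in adj[v]:
                for mask, cnt in table[u].items():
                    if not mask & bit:  # extend only if v's color is unused
                        new[v][mask | bit] = new[v].get(mask | bit, 0) + cnt
        table = new
    full = (1 << k) - 1
    total = sum(t.get(full, 0) for t in table.values())
    return total // 2  # each undirected path is counted once per endpoint

def estimate_path_count(adj, k, trials=100, seed=0):
    """Average the rescaled colorful counts over independent random colorings."""
    rng = random.Random(seed)
    p_colorful = factorial(k) / k**k  # chance that a fixed k-node path is colorful
    acc = 0
    for _ in range(trials):
        colors = {v: rng.randrange(k) for v in adj}
        acc += colorful_path_count(adj, colors, k)
    return acc / (trials * p_colorful)
```

The table stores one entry per (node, color subset) pair, so its size grows with both the number of nodes and the number of color subsets, which is in line with the abstract's remark that CC's memory requirements become demanding as the graph and the graphlets grow.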

Highlights

Counting graphlets is a well-studied problem in graph mining and social-networks analysis [1, 3, 7, 8, 11, 14, 18, 20, 27, 28, 29, 32].
We show that even a single run of color coding (CC), whose output can be seen as a large sample of the population of graphlets, gives reasonably good statistical guarantees.
We note that Pairwise Subgraph Random Walk (PSRW) has been developed with the primary goal of minimizing the number of nodes of G visited by the walk; in the present article, we investigate it in terms of samples taken, running time, and accuracy.

Summary

Introduction

Counting graphlets is a well-studied problem in graph mining and social-networks analysis [1, 3, 7, 8, 11, 14, 18, 20, 27, 28, 29, 32]. The problem asks to count the frequencies of all induced connected subgraphs (called graphlets), up to isomorphism, of a certain size. Understanding the distribution of graphlets allows us to make key inferences about the structural properties of the underlying graph and the interaction of the nodes in the graph (e.g., [22]). It sheds light on the type of local structures that are present in the graph, which can be used for a myriad of analyses [3, 8, 16, 27, 28, 29]. How the graphlets form in the first place and how they temporally evolve are semantically more actionable than the interpretation yielded by the mere evolution of nodes and edges.
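The problem statement above can be made concrete with a brute-force sketch: enumerate every k-node subset, keep those whose induced subgraph is connected, and group them by isomorphism class. This is only an illustration for tiny graphs, not one of the algorithms studied in the article; the function names and the canonical form (the smallest relabeled edge list over all node permutations) are assumptions chosen for simplicity.

```python
from collections import Counter
from itertools import combinations, permutations

def canonical_form(nodes, adj):
    """Smallest sorted edge tuple over all relabelings of the induced subgraph."""
    k = len(nodes)
    best = None
    for perm in permutations(range(k)):
        label = {v: perm[i] for i, v in enumerate(nodes)}
        edges = tuple(sorted(
            (min(label[u], label[v]), max(label[u], label[v]))
            for u, v in combinations(nodes, 2) if v in adj[u]
        ))
        if best is None or edges < best:
            best = edges
    return best

def is_connected(nodes, adj):
    """Check whether the subgraph induced by `nodes` is connected."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes

def count_graphlets(edges, k):
    """Map each k-node graphlet (as a canonical edge tuple) to its frequency."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    counts = Counter()
    for nodes in combinations(sorted(adj), k):
        if is_connected(nodes, adj):
            counts[canonical_form(nodes, adj)] += 1
    return counts

# A triangle 0-1-2 with a pendant edge 2-3.
counts = count_graphlets([(0, 1), (1, 2), (0, 2), (2, 3)], 3)
```

On this example graph the 3-node graphlets are one triangle and two induced paths. The enumeration visits every k-subset of nodes and every permutation of each subset, so it is feasible only for very small graphs and small k, which is why sampling-based approaches such as MC and CC are of interest.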

Journal: ACM Transactions on Knowledge Discovery from Data
Publication Date: Apr 16, 2018
Citations: 41
License type: CC BY

Similar Papers

Counting Graphlets
Marco Bressan ... Stefano Leucci
02 Feb 2017

Efficient $k$-clique Listing with Set Intersection Speedup
Zhirong Yuan ... Li Han
01 May 2022

Graph Clustering Based on Attribute-Aware Graph Embedding
Esra Akbas ... Peixiang Zhao
01 Jan 2019

Knowledge representation analysis of graph mining
Matthias Van Der Hallen ... Marc Denecker
Annals of Mathematics and Artificial Intelligence, vol. 86
28 Mar 2019