Abstract

Frequent pattern mining (FPM) plays an important role in many graph domains, such as bioinformatics and social networks. In this paper, we focus on geo-social graphs, a kind of social network augmented with geographical information. Beyond the exponential time complexity of FPM itself, mining such graphs poses the challenge of efficient subgraph retrieval, because we are interested in patterns within a specific region of the network. We therefore formulate the top-k FPM problem in large geo-social networks and devise a novel framework for subgraph retrieval and pattern mining with a series of optimizations. First, we propose a neighboring-aware R-tree (NaR-Tree) index structure to ease the retrieval of subgraphs from a large graph. The NaR-Tree is a variant of the R-tree in which each nonleaf node additionally maintains edge statistics for its associated rectangle. Second, we define the concept of minimum image-based support of edges (MNIE). Building on the NaR-Tree and an MNIE-based pattern extension approach, we propose a mining algorithm that addresses the problem of exponentially many candidate patterns. We also present a lazy retrieval strategy to reduce the frequency of subgraph retrieval. Finally, we adopt an edge sampling approach to further accelerate the mining process. Extensive experiments on real-world and synthetic datasets demonstrate the effectiveness and efficiency of our solution.
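
For intuition about the support measure, the following is a minimal sketch. It assumes that MNIE resembles the standard minimum image-based (MNI) support, counted over pattern edges instead of pattern nodes; the abstract does not state the exact definition, so `edge_mni_support` is a hypothetical illustration, not the paper's formulation. Embeddings are assumed to be given as dictionaries mapping pattern nodes to data-graph nodes.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

# An embedding maps each pattern node to a distinct data-graph node.
Embedding = Dict[Hashable, Hashable]


def mni_support(embeddings: Iterable[Embedding]) -> int:
    """Standard node-based MNI support: for each pattern node, count the
    distinct data-graph nodes it is mapped to, then take the minimum."""
    images: Dict[Hashable, set] = {}
    for emb in embeddings:
        for p_node, g_node in emb.items():
            images.setdefault(p_node, set()).add(g_node)
    return min((len(s) for s in images.values()), default=0)


def edge_mni_support(
    pattern_edges: List[Tuple[Hashable, Hashable]],
    embeddings: Iterable[Embedding],
) -> int:
    """Hypothetical edge-based variant (an assumption, not the paper's
    definition): for each pattern edge, count the distinct undirected
    data-graph edges it is mapped to, then take the minimum."""
    embeddings = list(embeddings)
    if not embeddings:
        return 0
    counts = []
    for u, v in pattern_edges:
        # frozenset treats (x, y) and (y, x) as the same undirected edge.
        edge_images = {frozenset((emb[u], emb[v])) for emb in embeddings}
        counts.append(len(edge_images))
    return min(counts, default=0)


# Toy pattern: a path a - b - c, with two embeddings into a data graph.
path_edges = [("a", "b"), ("b", "c")]
path_embeddings = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 4, "b": 2, "c": 3},
]
node_support = mni_support(path_embeddings)
edge_support = edge_mni_support(path_edges, path_embeddings)
```

Taking the minimum over pattern elements is what makes MNI-style measures anti-monotone: extending a pattern can never increase its support, so infrequent candidates can be pruned before they are extended.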
