Abstract
Frequent pattern mining (FPM) has played an important role in many graph domains, such as bioinformatics and social networks. In this paper, we focus on geo-social graphs, a kind of social network augmented with geographical information. Mining such graphs is challenging: in addition to the exponential time complexity of the problem, subgraphs must be retrieved efficiently, since we are interested in patterns within a specific region of the network. We therefore formulate the top-k FPM problem in large geo-social networks and devise a novel framework for subgraph retrieval and frequent pattern mining with a series of optimizations. First, we propose a neighboring-aware R-tree (NaR-Tree) index structure to alleviate the challenge of retrieving subgraphs from a large graph. NaR-Tree is a variant of the R-tree in which each nonleaf tree node additionally maintains edge statistics for its associated rectangle. Second, we define the concept of minimum image-based support of edges (MNIE). Building on the NaR-Tree and an MNIE-based pattern extension approach, we propose a mining algorithm that addresses the problem of exponentially many candidate patterns. We also present a lazy retrieval strategy to reduce the frequency of subgraph retrieval. Finally, we adopt an edge sampling approach to further accelerate the mining process. Extensive experiments on real-world and synthetic datasets demonstrate the effectiveness and efficiency of our solution.
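As background for the support measure mentioned above, the following is a minimal sketch of the standard node-based minimum image-based (MNI) support commonly used in single-graph pattern mining. It is not the edge-based MNIE defined in the paper, whose exact definition the abstract does not give. The function name `mni_support` and the representation of embeddings as dictionaries from pattern nodes to graph vertices are assumptions made for illustration only.

```python
from collections import defaultdict


def mni_support(embeddings):
    """Node-based MNI support of a pattern (illustrative, not the paper's MNIE).

    `embeddings` is a list of dicts, each mapping a pattern-node id to the
    data-graph vertex it is matched to in one occurrence of the pattern.
    The support is the smallest number of distinct graph vertices that any
    single pattern node is mapped to across all occurrences.
    """
    images = defaultdict(set)  # pattern node -> set of distinct graph vertices
    for embedding in embeddings:
        for pattern_node, graph_vertex in embedding.items():
            images[pattern_node].add(graph_vertex)
    if not images:
        return 0  # no occurrences means zero support
    return min(len(vertices) for vertices in images.values())


# Hypothetical occurrences of a two-node pattern (nodes "a" and "b").
embeddings = [
    {"a": 1, "b": 2},
    {"a": 1, "b": 3},
    {"a": 4, "b": 2},
]
support = mni_support(embeddings)
```

A measure of this kind is anti-monotone: extending a pattern cannot increase its support, which is the property that allows a mining algorithm to prune candidate extensions.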