Graph Pattern Matching (GPM) has been used in lots of areas, like biology, medical science, and physics. With the advent of Online Social Networks (OSNs), recently, GPM has been playing a significant role in social network analysis, which has been widely used in, for example, finding experts, social community mining, and social position detection. Given a query which contains a pattern graph $G_Q$ and a data graph $G_D$ , a GPM algorithm finds those subgraphs, $G_M$ , that match $G_Q$ in $G_D$ . However, the existing GPM methods do not consider the multiple end-to-end constraints of the social contexts, like social relationships, social trust, and social positions on edges in $G_Q$ , which are commonly found in various applications, such as crowdsourcing travel, social network based e-commerce, and study group selection, etc. In this paper, we first conceptually extend Bounded Simulation to Multi-Constrained Simulation (MCS) , and propose a novel NP-Complete Multi-Constrained Graph Pattern Matching (MC-GPM) problem. Then, to address the efficiency issue in large-scale MC-GPM, we propose a new concept called Strong Social Component (SSC), consisting of participants with strong social connections. We also propose an approach to identifying SSCs, and propose a novel index method and a graph compression method for SSC. Moreover, we devise a multithreading heuristic algorithm, called M-HAMC, to bidirectionally search the MC-GPM results in parallel without decompressing graphs. An extensive empirical study over five real-world large-scale social graphs has demonstrated the effectiveness and efficiency of our approach.
Read full abstract