Abstract

Recently, community search over graphs has gained significant interest. In applications such as analysis of protein-protein interaction (PPI) networks, citation graphs, and collaboration networks, nodes tend to have attributes. Unfortunately, most previous community search algorithms ignore attributes and result in communities with poor cohesion w.r.t. their node attributes. In this paper, we study the problem of attribute-driven community search, that is, given an undirected graph G where nodes are associated with attributes, and an input query Q consisting of nodes V q and attributes W q , find the communities containing V q , in which most community members are densely inter-connected and have similar attributes. We formulate this problem as finding attributed truss communities (ATC), i.e., finding connected and close k-truss subgraphs containing V q , with the largest attribute relevance score. We design a framework of desirable properties that good score function should satisfy. We show that the problem is NP-hard. However, we develop an efficient greedy algorithmic framework to iteratively remove nodes with the least popular attributes, and shrink the graph into an ATC. In addition, we also build an elegant index to maintain k -truss structure and attribute information, and propose efficient query processing algorithms. Extensive experiments on large real-world networks with ground-truth communities show that our algorithms significantly outperform the state of the art and demonstrates their efficiency and effectiveness.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call