Abstract

Graph clustering is one of the most important research topics in graph mining and network analysis. Given the abundance of data in many real-world applications, graph nodes and edges can be annotated with multiple sets of attributes derived from heterogeneous data sources. Considering these attributes during graph clustering facilitates the generation of clusters with balanced, cohesive intra-cluster structures and nodes with homogeneous properties. In this paper, we propose a graph clustering approach for mining skyline clusters over large attributed graphs based on the dominance relationship. Each skyline solution is optimized simultaneously for multiple fitness functions, each defined either over the graph topology or over a particular set of attributes derived from multiple data sources. We evaluate our approach experimentally on a large protein-protein interaction network of the human interactome enriched with large sets of heterogeneous cancer-associated attributes. The results demonstrate the efficiency of our approach and show that integrating node attributes from multiple data sources can yield a more robust graph clustering than considering the graph topology alone.
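
To make the dominance relationship concrete, the following minimal sketch filters a set of candidate clusters down to its skyline, i.e. the candidates that no other candidate dominates. It assumes every fitness function is to be maximized. The candidate names, the two score dimensions (a topological score and an attribute-homogeneity score), and the helpers `dominates` and `skyline` are hypothetical illustrations, not the fitness functions or the algorithm of the paper.

```python
from typing import Dict, List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a dominates b (higher is better).

    a dominates b when a is at least as good on every fitness function
    and strictly better on at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(scores: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Return the names of all candidates not dominated by any other."""
    return [
        name
        for name, s in scores.items()
        if not any(
            dominates(t, s) for other, t in scores.items() if other != name
        )
    ]


# Hypothetical candidates: (topological score, attribute homogeneity).
candidates = {
    "c1": (0.9, 0.2),
    "c2": (0.6, 0.7),
    "c3": (0.5, 0.5),  # dominated by c2 on both dimensions
    "c4": (0.1, 0.9),
}

print(skyline(candidates))  # ['c1', 'c2', 'c4']
```

This pairwise filter is quadratic in the number of candidates and is meant only to illustrate the definition; a method intended for large attributed graphs would need a more scalable search.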
