Abstract

Graph clustering is one of the most important research topics in graph mining and network analysis. With the abundance of data in many real-world applications, graph nodes and edges can be annotated with multiple sets of attributes derived from heterogeneous data sources. Taking these attributes into account during graph clustering can help generate clusters that have a balanced, cohesive intra-cluster structure and whose nodes have homogeneous properties. In this paper, we propose a genetic algorithm-based graph clustering approach for mining skyline clusters over large attributed graphs based on the dominance relationship. Each skyline solution is optimized simultaneously with respect to multiple fitness functions, where each function is defined either over the graph topology or over a particular one of the attribute sets derived from multiple data sources. We experimentally evaluate our approach on a large real-world protein-protein interaction network of the human interactome, enriched with large sets of heterogeneous cancer-associated attributes. The results show the efficiency of our approach and show that integrating node attributes from multiple data sources yields a more robust graph clustering than considering the graph topology alone.
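To make the dominance relationship concrete, the following is a minimal sketch of skyline (non-dominated) filtering over candidate clusterings. It is illustrative only and is not the genetic algorithm proposed in the paper. The function names `dominates` and `skyline`, the assumption that every fitness function is maximized, and the example score vectors are all hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a dominates b.

    Assumption: every objective is maximized. a dominates b when it is
    at least as good on every objective and strictly better on one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of the score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical fitness vectors for four candidate clusterings:
# (topology score, attribute-set-1 score, attribute-set-2 score).
candidates = [
    (0.80, 0.40, 0.55),
    (0.60, 0.70, 0.50),
    (0.55, 0.65, 0.45),  # dominated by the second candidate
    (0.80, 0.40, 0.50),  # dominated by the first candidate
]

front = skyline(candidates)
print(front)  # [0, 1]
```

The first two candidates trade topology quality against attribute homogeneity, so neither dominates the other and both remain on the skyline. This pairwise filter is quadratic in the number of candidates; it is meant only to show the definition, not an efficient method for large graphs.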
