Hypergraph clustering by iteratively reweighted modularity maximization

Tarun Kumar, Sankaran Vaidyanathan, Balaraman Ravindran, Srinivasan Parthasarathy, Harini Ananthapadmanabhan

doi:10.1007/s41109-020-00300-3

Open Access

https://doi.org/10.1007/s41109-020-00300-3

Abstract

Learning on graphs is a subject of great interest due to the abundance of relational data from real-world systems. Many of these systems involve higher-order (super-dyadic) interactions rather than mere pairwise (dyadic) relationships; examples include co-authorship, co-citation, and metabolic reaction networks. Such super-dyadic relations are more adequately modeled using hypergraphs than graphs. Learning on hypergraphs has thus been garnering increased attention, with potential applications in network analysis, VLSI design, and computer vision, among others. In particular, hypergraph clustering is gaining attention because of its wide range of applications, such as component placement in VLSI, group discovery in bibliographic systems, and image segmentation in computer vision. For the problem of clustering on graphs, modularity maximization is known to work well in the pairwise setting. Our primary contribution in this article is a generalization of the modularity maximization framework to clustering on hypergraphs. In doing so, we introduce a null model for graphs generated by hypergraph reduction and prove its equivalence to the configuration model for undirected graphs. The proposed graph reduction technique preserves the node degree sequence of the original hypergraph. The modularity function can then be defined on the reduced graph and maximized using any standard modularity maximization method, such as the Louvain method. We additionally propose an iterative technique that refines the obtained clusters. We demonstrate both the efficacy and the efficiency of our methods on several real-world datasets.
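
To make the degree-preservation property concrete, here is a minimal Python sketch. The abstract does not spell out the reduction, so the weighting used here is an assumption chosen for illustration: each hyperedge of weight w and size k adds w / (k - 1) to every pair of its nodes, which is one scheme under which a node's weighted degree in the reduced graph equals its weighted degree in the hypergraph. The function names and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def reduce_hypergraph(hyperedges):
    """Reduce a weighted hypergraph to a weighted graph.

    hyperedges: list of (set_of_nodes, weight) pairs.
    Assumed scheme (not taken from the abstract): a hyperedge of size k
    and weight w contributes w / (k - 1) to each pair of its nodes.
    Returns a dict mapping sorted node pairs (u, v) to edge weights.
    """
    adjacency = defaultdict(float)
    for nodes, weight in hyperedges:
        size = len(nodes)
        if size < 2:
            # The per-pair share is undefined for singleton hyperedges.
            raise ValueError("every hyperedge needs at least two nodes")
        share = weight / (size - 1)
        for u, v in combinations(sorted(nodes), 2):
            adjacency[(u, v)] += share
    return dict(adjacency)


def hypergraph_degrees(hyperedges):
    """Weighted degree of a node: total weight of hyperedges containing it."""
    degree = defaultdict(float)
    for nodes, weight in hyperedges:
        for v in nodes:
            degree[v] += weight
    return dict(degree)


def graph_degrees(adjacency):
    """Weighted degree of a node in the reduced graph."""
    degree = defaultdict(float)
    for (u, v), weight in adjacency.items():
        degree[u] += weight
        degree[v] += weight
    return dict(degree)


# Hypothetical toy hypergraph on nodes 1..4.
toy = [({1, 2, 3}, 1.0), ({3, 4}, 2.0), ({1, 2, 3, 4}, 3.0)]
reduced = reduce_hypergraph(toy)
```

Comparing `hypergraph_degrees(toy)` with `graph_degrees(reduced)` shows the two degree sequences agree up to floating-point rounding. The reduced graph could then be handed to any standard modularity maximizer.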

Highlights

The graph clustering problem involves dividing a graph into multiple sets of nodes, such that the similarity of nodes within a cluster is higher than the similarity of nodes belonging to different clusters (Schaeffer 2007; Sankar et al 2015; Wang et al 2017; Satuluri and Parthasarathy 2009).
“Results and analysis” section: We examine the scalability of the hypergraph modularity maximization algorithm using synthetic data.
Besides the fact that Iteratively Reweighted Modularity Maximization (IRMM) significantly outperforms other methods, we want to emphasize the following two observations: Superior performance of hypergraph-based methods: It is evident that hypergraph-based methods perform consistently better than their clique-based equivalents.

Summary

Introduction

The graph clustering problem involves dividing a graph into multiple sets of nodes, such that the similarity of nodes within a cluster is higher than the similarity of nodes belonging to different clusters (Schaeffer 2007; Sankar et al 2015; Wang et al 2017; Satuluri and Parthasarathy 2009). The representational power of pairwise graph models is insufficient to capture higher-order information and present it for analysis or learning tasks. A hyperedge can capture a multi-way relation; for example, in a co-authorship network, where nodes represent authors, a hyperedge could represent a group of authors who collaborated on a common paper. If this were modeled as a graph, we would be able to see which pairs of authors are collaborating, but would not see whether multiple authors worked on the same paper. This suggests that the hypergraph representation is not only more information-rich but is also conducive to higher-order learning tasks by virtue of its structure.
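
The information loss in the co-authorship example can be illustrated with a short Python sketch. The author labels and the helper `clique_expansion` are hypothetical and used only for illustration: one three-author paper and three separate two-author papers are different hypergraphs, yet they collapse to the same pairwise graph.

```python
from itertools import combinations


def clique_expansion(hyperedges):
    """Return the set of unordered node pairs that share at least one hyperedge."""
    pairs = set()
    for edge in hyperedges:
        for u, v in combinations(sorted(edge), 2):
            pairs.add((u, v))
    return pairs


# Hypothetical data: a single paper written jointly by authors A, B and C ...
one_joint_paper = [{"A", "B", "C"}]
# ... versus three separate papers, each written by two of the authors.
three_pair_papers = [{"A", "B"}, {"B", "C"}, {"A", "C"}]

# Both hypergraphs produce the same pairwise collaboration graph.
same_graph = clique_expansion(one_joint_paper) == clique_expansion(three_pair_papers)
print(same_graph)  # True
```

The pairwise view records who has collaborated with whom, but it cannot distinguish the two scenarios, whereas the hyperedge lists can.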

Methods

Results

Conclusion