Abstract

Local graph clustering is an important machine learning task that aims to find a well-connected cluster near a set of seed nodes. Recent results have revealed that incorporating higher-order information significantly enhances the results of graph clustering techniques. The majority of existing research in this area focuses on techniques based on spectral graph theory. However, an alternative perspective on local graph clustering arises from applying max-flow and min-cut techniques to the clustering objectives, which offer distinctly different guarantees. For instance, a method called capacity releasing diffusion (CRD) was recently proposed and shown to preserve local structure around the seeds better than spectral methods. It was also the first local clustering technique that is not subject to the quadratic Cheeger inequality, which it achieves by assuming that a good cluster exists near the seed nodes. In this paper, we propose a local hypergraph clustering technique called hypergraph CRD (HG-CRD) by extending the CRD process to cluster based on higher-order patterns, encoded as hyperedges of a hypergraph. Moreover, we theoretically show that HG-CRD gives results about a quantity called motif conductance itself, rather than about a biased version used in previous experiments. Experimental results on synthetic datasets and real-world graphs show that HG-CRD enhances the clustering quality.

Highlights

  • Graph and network mining techniques traditionally experience a variety of issues as they scale to larger data [1]

  • We present hypergraph CRD (HG-CRD), a hypergraph-based implementation of the capacity releasing diffusion hybrid algorithm that combines spectral-like diffusion with flow-like guarantees

  • We show in Section 4.6 that HG-CRD is the first local higher-order graph clustering method that is not subject to the quadratic Cheeger inequality, under the assumption that a good cluster exists near the seed nodes


Summary

Introduction

Graph and network mining techniques traditionally experience a variety of issues as they scale to larger data [1]. One important class of methods with a different set of trade-offs is local clustering algorithms [2]. These methods apply a graph mining procedure (clustering, in this case) around a seed set of nodes, where we are only interested in the output near the seeds. In this way, local clustering algorithms avoid the memory and time bottlenecks that other algorithms experience. If there is a good cluster nearby, defined in terms of motif conductance, and that cluster is well connected internally, the HG-CRD algorithm will find it. The experiments show that HG-CRD offers new opportunities for reliable hypergraph-based community detection in a variety of scenarios.
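To make the notion of a "good cluster in terms of motif conductance" concrete, here is a minimal sketch of how that quantity could be computed for a candidate node set. It assumes the commonly used definition, in which each hyperedge stands for one motif instance, the cut counts hyperedges with nodes on both sides, and the volume of a side counts hyperedge memberships of its nodes. The function name `motif_conductance` and the list-of-tuples hypergraph representation are illustrative choices, not taken from the paper, and the sketch evaluates a given set rather than running HG-CRD.

```python
from typing import Hashable, Iterable, Sequence


def motif_conductance(
    hyperedges: Iterable[Sequence[Hashable]],
    cluster: Iterable[Hashable],
) -> float:
    """Motif conductance of `cluster` (assumed definition, see lead-in).

    cut    = number of hyperedges with nodes both inside and outside `cluster`
    vol(S) = number of (node, hyperedge) memberships with the node in S
    result = cut / min(vol(cluster), vol(complement))
    """
    inside_nodes = set(cluster)
    cut = 0
    vol_inside = 0
    vol_outside = 0
    for edge in hyperedges:
        nodes = set(edge)  # ignore repeated nodes within one hyperedge
        n_in = len(nodes & inside_nodes)
        n_out = len(nodes) - n_in
        vol_inside += n_in
        vol_outside += n_out
        if n_in > 0 and n_out > 0:
            cut += 1  # this motif instance is split by the cluster boundary
    denom = min(vol_inside, vol_outside)
    if denom == 0:
        raise ValueError("one side of the partition touches no hyperedge")
    return cut / denom


# Hypothetical toy hypergraph: two groups of four nodes, each covered by
# all four triangles on the group, joined by a single triangle (3, 4, 5).
toy_hyperedges = [
    (0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3),
    (4, 5, 6), (4, 5, 7), (4, 6, 7), (5, 6, 7),
    (3, 4, 5),
]
phi = motif_conductance(toy_hyperedges, {0, 1, 2, 3})
print(phi)
```

On this toy input only the hyperedge (3, 4, 5) is cut, and the smaller side has volume 13, so the set {0, 1, 2, 3} has motif conductance 1/13. A low value like this is what the text means by a good cluster near the seeds.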

Related work
Hypergraph clustering and higher-order graph analysis
Local clustering
Key differences with our contribution
Local cluster quality
A Capacity releasing diffusion for hypergraphs via motif matrices
A true hypergraph CRD
Node and hyperedge variables in HG-CRD
The high-level algorithm
HG-CRD toy example
Motivating example
HG-CRD analysis
Running time and space discussion
HG-CRD extension for non-uniform hypergraphs
Experimental results
Synthetic dataset
Hypergraph-CRD compared to CRD
Related work comparison
Running time experiments
Robustness experiments
Large hyperedge experiments