Abstract

Community detection is a commonly used technique for identifying groups in a network based on similarities in connectivity patterns. To facilitate community detection in large networks, we recast the network as a smaller network of ‘super nodes’, where each super node comprises one or more nodes of the original network. We can then use this super node representation as the input into standard community detection algorithms. To define the seeds, or centers, of our super nodes, we apply the ‘CoreHD’ ranking, a technique applied in network dismantling and decycling problems. We test our approach through the analysis of two common methods for community detection: modularity maximization with the Louvain algorithm and maximum likelihood optimization for fitting a stochastic block model. Our results highlight that applying community detection to the compressed network of super nodes is significantly faster while successfully producing partitions that are more aligned with the local network connectivity and more stable across multiple (stochastic) runs within and between community detection algorithms, yet still overlap well with the results obtained using the full network.

Highlights

  • Networks appear across disciplines as natural data structures for modeling relational definitions between entities, such as regulatory interactions between genes and proteins, and social connections between people

  • Since the super node representation produces a weighted network, where the edge weights are counts computed based on the original network, both of these community detection algorithms are able to accommodate these kinds of edge weights

  • Creating a super node representation of the network with S = 600 leads to the maximum value of Normalized Mutual Information (NMI)(zFull, zSN) among all values of S we considered in our experiments

Read more

Summary

Introduction

Networks appear across disciplines as natural data structures for modeling relational definitions between entities, such as regulatory interactions between genes and proteins, and social connections between people. To speed up segmentation for large images, a popular approach is to avoid computing segmentations at the pixel level and instead reformulate the segmentation problem based on larger-scale image primitives that are likely part of the same partition This can be accomplished by super pixels that aggregate pixels together in a way that faithfully adheres to image boundaries, maintaining or improving segmentation accuracy[12]. Various authors have typically based the quality of their super pixel representation of the original image on two criteria They seek to minimize under segmentation error[13], which quantifies the extent to which the super pixels bleed across original boundaries in the image. For partitions zFull and zSN with p and q communities, respectively, with N the R × C contingency table matrix where Nij gives the count of the number of shared nodes in community i in zFull and community j in zSN, the NMI between the two partitions is NMI(z Full , zSN)

Objectives
Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call