Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters

Jure Leskovec, Michael W Mahoney, Kevin J Lang, Anirban Dasgupta

doi:10.1080/15427951.2009.10129177

Abstract

A large body of work has been devoted to defining and identifying clusters or communities in social and information networks, i.e., in graphs in which the nodes represent underlying social entities and the edges represent some sort of interaction between pairs of nodes. Most such research begins with the premise that a community or a cluster should be thought of as a set of nodes that has more and/or better connections between its members than to the remainder of the network. In this paper, we explore from a novel perspective several questions related to identifying meaningful communities in large social and information networks, and we come to several striking conclusions. Rather than defining a procedure to extract sets of nodes from a graph and then attempting to interpret these sets as "real" communities, we employ approximation algorithms for the graph-partitioning problem to characterize as a function of size the statistical and structural properties of partitions of graphs that could plausibly be interpreted as communities. In particular, we define the _network community profile plot_, which characterizes the "best" possible community—according to the conductance measure—over a wide range of size scales. We study over one hundred large real-world networks, ranging from traditional and online social networks, to technological and information networks and web graphs, and ranging in size from thousands up to tens of millions of nodes. Our results suggest a significantly more refined picture of community structure in large networks than has been appreciated previously. Our observations agree with previous work on small networks, but we show that large networks have a very different structure. 
In particular, we observe tight communities that are barely connected to the rest of the network at very small size scales (up to ≈ 100 nodes); and communities of size scale beyond ≈ 100 nodes gradually "blend into" the expander-like core of the network and thus become less "community-like," with a roughly inverse relationship between community size and optimal community quality. This observation agrees well with the so-called Dunbar number, which gives a limit to the size of a well-functioning community. However, this behavior is not explained, even at a qualitative level, by any of the commonly used network-generation models. Moreover, it is exactly the opposite of what one would expect based on intuition from expander graphs, low-dimensional or manifold-like graphs, and from small social networks that have served as test beds of community-detection algorithms. The relatively gradual increase of the network community profile plot as a function of increasing community size depends in a subtle manner on the way in which local clustering information is propagated from smaller to larger size scales in the network. We have found that a generative graph model, in which new edges are added via an iterative "forest fire" burning process, is able to produce graphs exhibiting a network community profile plot similar to what we observe in our network data sets.
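To make the conductance measure and the network community profile concrete, here is a minimal sketch in Python. It assumes the standard definition of conductance: the number of edges leaving a node set S, divided by the smaller of the total degree of S and the total degree of the remaining nodes. The profile value at size k is then the minimum conductance over all sets of k nodes. The paper relies on approximation algorithms for graph partitioning; the exhaustive search below is a stand-in that only works on tiny graphs, and the function names (`conductance`, `brute_force_ncp`, `build_adj`) and the toy graph are hypothetical, chosen for illustration only.

```python
from itertools import combinations


def build_adj(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def conductance(adj, nodes):
    """Edges leaving `nodes`, divided by the smaller of the two side volumes."""
    s = set(nodes)
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    vol_s = sum(len(adj[u]) for u in s)
    vol_rest = sum(len(nbrs) for nbrs in adj.values()) - vol_s
    denom = min(vol_s, vol_rest)
    return cut / denom if denom > 0 else float("inf")


def brute_force_ncp(adj, max_size):
    """Minimum conductance for each set size k = 1..max_size.

    Exhaustive search: an illustrative assumption, NOT the approximation
    algorithms used in the paper, and infeasible beyond very small graphs.
    """
    return {
        k: min(conductance(adj, subset) for subset in combinations(adj, k))
        for k in range(1, max_size + 1)
    }


# Hypothetical toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = build_adj(edges)
ncp = brute_force_ncp(adj, 3)

for k, phi in ncp.items():
    print(k, round(phi, 3))
```

On this toy graph the profile drops as k grows to 3, because a whole triangle is separated from the rest by a single edge. Plotting such values against k on log-log axes gives the kind of curve the abstract calls the network community profile plot; on the large networks studied, the abstract reports that the curve turns upward beyond roughly 100 nodes.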

Journal: Internet Mathematics
Publication Date: Jan 1, 2009
Citations: 1765

Similar Papers

Networks, communities and Kronecker products
Jure Leskovec
06 Nov 2009

Statistical properties of community structure in large social and information networks
Jure Leskovec ... Anirban Dasgupta
21 Apr 2008

Exploring Local Community Structures in Large Networks
Feng Luo ... James Wang
01 Dec 2006

Exploring local community structures in large networks
Feng Luo ... Eric Promislow
Web Intelligence and Agent Systems: An International Journal | VOL. 6
01 Jan 2008
