Abstract

Time-stamped data are increasingly available for many social, economic, and information systems that can be represented as networks growing with time. The World Wide Web, social contact networks, and citation networks of scientific papers and online news articles, for example, are of this kind. Static methods can be inadequate for the analysis of growing networks as they miss essential information on the system’s dynamics. At the same time, time-aware methods require the choice of an observation timescale, yet we lack principled ways to determine it. We focus on the popular community detection problem which aims to partition a network’s nodes into meaningful groups. We use a multi-layer quality function to show, on both synthetic and real datasets, that the observation timescale that leads to optimal communities is tightly related to the system’s intrinsic aging timescale that can be inferred from the time-stamped network data. The use of temporal information leads to drastically different conclusions on the community structure of real information networks, which challenges the current understanding of the large-scale organization of growing networks. Our findings indicate that before attempting to assess structural patterns of evolving networks, it is vital to uncover the timescales of the dynamical processes that generated them.

Highlights

  • Many systems that are of interest for social science, information science, and data mining can be represented as complex networks that are not static but grow with time

  • Extensive research has shown that the inclusion of temporal information into network analysis has a dramatic impact on long-studied problems such as community detection [9,10,11], node ranking [11,12,13], dynamics control [14], and spreading phenomena [15,16,17]

  • To resolve the limitations of static modularity, we propose the temporal modularity quality function, which builds on the recently introduced Dynamic Configuration Model (DCM) for growing networks [41], a model that randomizes time-stamped networks while approximately preserving the time evolution of each node’s degree


Summary

INTRODUCTION

Many systems that are of interest for social science, information science, and data mining can be represented as complex networks that are not static but grow with time. When we wish to apply a multi-layer approach to identify relevant communities in growing networks, we face an impasse: existing works assume layered input data [31,32,33,34] and do not consider how to divide an arbitrary time-stamped network into layers. Addressing this question requires choosing an appropriate observation timescale, i.e., the temporal duration of each layer [5,35,36]. We analytically derive a criterion to estimate when a time-aggregated, static view of a growing network ceases to be sufficient for effective community detection through standard modularity maximization. When this criterion is not met, the detected communities are strongly determined by node age and are at odds with the network’s actual community structure. Beyond the particular problem of community detection, the connection between the observation timescale τO used for structural analysis and the system’s intrinsic timescale τS is relevant to the general problem of analyzing the structure and function of the broad variety of networks that evolve in time.
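
To make the notion of an observation timescale concrete, the following is a minimal sketch of how a time-stamped edge list might be cut into consecutive layers of equal duration τO. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `slice_into_layers`, the `(u, v, t)` edge format, and the use of fixed-width windows starting at the earliest timestamp are all hypothetical choices made here. The sketch also does not address how τO should be chosen.

```python
from collections import defaultdict


def slice_into_layers(edges, tau_o):
    """Group time-stamped edges into consecutive layers of duration tau_o.

    edges: list of (u, v, t) tuples, where t is a numeric timestamp
           (hypothetical input format, assumed for this sketch).
    tau_o: observation timescale, i.e. the temporal duration of each layer.
    Returns a list of layers; each layer is a list of (u, v) pairs.
    """
    if tau_o <= 0:
        raise ValueError("tau_o must be positive")
    if not edges:
        return []

    # Assumption: windows are aligned to the earliest timestamp in the data.
    t0 = min(t for _, _, t in edges)

    buckets = defaultdict(list)
    for u, v, t in edges:
        layer_index = int((t - t0) // tau_o)
        buckets[layer_index].append((u, v))

    # Keep empty layers so that a layer's index still encodes elapsed time.
    n_layers = max(buckets) + 1
    return [buckets.get(k, []) for k in range(n_layers)]


# Toy example: four edges observed over roughly ten time units.
edges = [("a", "b", 0.0), ("b", "c", 1.5), ("c", "d", 4.2), ("a", "d", 9.9)]
layers = slice_into_layers(edges, tau_o=5.0)
```

With a very large τO all edges fall into a single layer, which corresponds to the time-aggregated, static view; shrinking τO produces more and sparser layers.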

IMPACT OF NETWORK GROWTH ON MODULARITY
Modularity
Breakdown of static modularity in growing networks
COMMUNITY DETECTION IN GROWING NETWORKS
Multi-layer modularity
The optimal timescale of temporal modularity
Link-based and similarity-based timescale detection: A comparative analysis
SIGNIFICANCE ANALYSIS
IMPLICATIONS FOR REAL NETWORKS
DISCUSSION
ADDITIONAL RESULTS ON MODEL DATA
