The Effects of Centrality Ordering in Label Propagation for Community Detection

Brian Dickinson, Wei Hu

doi:10.4236/sn.2015.44012

Abstract

Randomness in community detection algorithms has often been avoided because of stability issues. Indeed, replacing random ordering with centrality rankings has improved the performance of some techniques, such as Label Propagation Algorithms (LPA). This study evaluates the effects of such orderings on the Speaker-listener Label Propagation Algorithm (SLPA), a modification of LPA that has already been stabilized through alternate means. It demonstrates that where stability has been achieved without eliminating randomness, removing random ordering results in overfitting and bias. Tests of seven centrality measures in conjunction with SLPA across five social network graphs indicate that, while certain measures outperform random orderings on certain graphs, random orderings have the highest overall accuracy. This is particularly true when strict orderings are used in each run. These results indicate that the more evenly distributed solution space produced by completely random ordering is more valuable than the more targeted search produced by centrality orderings.

Highlights

Many real-world systems and networks can be represented by graphs of edges and nodes.
SLPA was implemented in Java and received centrality values from the built-in centrality functions of the igraph package for R.
To determine how quickly SLPA converged on an accurate community partition for each graph, SLPA was run with the iterations parameter varied from 5 to 100.
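
The setup described in these highlights can be illustrated with a minimal sketch. It is not the authors' implementation: the study used Java with igraph centrality values from R, whereas this sketch uses Python and plain degree centrality as a stand-in. The function names `slpa` and `degree_order`, the post-processing threshold of 0.3, and the toy graph are all assumptions made for illustration. Passing `order=None` reshuffles the listeners on every iteration (random ordering); passing a ranking applies the same strict ordering on every iteration.

```python
import random
from collections import Counter


def slpa(adj, iterations=20, threshold=0.3, order=None, seed=0):
    """Illustrative SLPA sketch (hypothetical helper, not the paper's code).

    adj        -- dict mapping each node to a list of its neighbours
    iterations -- number of speaker-listener rounds
    threshold  -- assumed cut-off: labels rarer than this share of a node's
                  memory are dropped in post-processing
    order      -- fixed listener order (e.g. a centrality ranking);
                  None reshuffles the listeners on every iteration
    """
    rng = random.Random(seed)
    # Every node starts with a memory holding only its own label.
    memory = {v: Counter({v: 1}) for v in adj}
    nodes = list(adj)
    for _ in range(iterations):
        if order is None:
            listeners = nodes[:]
            rng.shuffle(listeners)
        else:
            listeners = list(order)
        for listener in listeners:
            if not adj[listener]:
                continue  # isolated nodes never hear a label
            heard = Counter()
            for speaker in adj[listener]:
                # Each speaker sends one label, drawn in proportion to
                # how often that label occurs in its memory.
                labels = list(memory[speaker].keys())
                weights = list(memory[speaker].values())
                heard[rng.choices(labels, weights=weights)[0]] += 1
            # The listener stores the most popular label it heard,
            # breaking ties at random.
            top = max(heard.values())
            tied = sorted(label for label, n in heard.items() if n == top)
            memory[listener][rng.choice(tied)] += 1
    # Post-processing: keep frequent labels; a node may keep several,
    # which is how overlapping communities arise.
    communities = {}
    for v, mem in memory.items():
        total = sum(mem.values())
        kept = [label for label, n in mem.items() if n / total >= threshold]
        if not kept:
            kept = [mem.most_common(1)[0][0]]
        for label in kept:
            communities.setdefault(label, set()).add(v)
    return communities


def degree_order(adj):
    """Rank nodes by descending degree (stand-in centrality measure)."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))


# Toy graph (assumed): two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}

for T in (5, 20, 100):
    random_run = slpa(adj, iterations=T, seed=1)
    ranked_run = slpa(adj, iterations=T, order=degree_order(adj), seed=1)
    print(T, sorted(map(sorted, random_run.values())),
          sorted(map(sorted, ranked_run.values())))
```

Sweeping `iterations` over values between 5 and 100, as in the loop above, mirrors the convergence experiment in spirit only; scoring each partition against a ground truth, as the study does, is left out of the sketch.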

Summary

Introduction

Many real-world systems and networks can be represented by graphs of edges and nodes. These systems span areas of study as diverse as social networks, HTML structure, and highway systems. One machine learning task often performed on these graphs is community detection, in which algorithms attempt to find communities: groups of nodes with a significant difference in density between intragroup and intergroup edges. These communities often provide useful information about the elements represented by the nodes of a graph. Communities in an HTML graph might represent pages on the same domain or the same

Objectives

Results

Conclusion