Abstract

Label Propagation, while more commonly known as a machine learning algorithm for classification, is also an effective method for detecting communities in networks. We propose a new Direction Optimizing Label Propagation Algorithm (DOLPA) that relies on frontiers and alternates between label push and label pull operations to improve on the performance of the standard Label Propagation Algorithm (LPA). Specifically, DOLPA has parameters for tuning the processing order of vertices in a graph, which in turn reduces the number of edges visited and improves the quality of the solution obtained. We apply DOLPA to the community detection problem, present the design and implementation of the algorithm, and discuss its shared-memory parallelization using OpenMP. We evaluate our algorithm empirically on synthetic graphs as well as real-world networks. Compared with the state-of-the-art Parallel Label Propagation (PLP) algorithm, DOLPA achieves at least two times the F-Score while reducing the runtime by 50% for synthetic graphs with overlapping communities. We also compare DOLPA against a state-of-the-art parallel implementation of the Louvain method on the same graphs and show that DOLPA achieves about three times the F-Score at 10% of the runtime.

Highlights

  • The label propagation algorithm (LPA) is a machine learning algorithm for data classification where label information is propagated from labeled to unlabeled entities within a network [42]

  • We evaluate the performance of our OpenMP implementation of the Direction Optimizing Label Propagation Algorithm (DOLPA) and the quality of the solutions it produces on both synthetic and real-world graphs, and show that, compared with Parallel Label Propagation (PLP), DOLPA achieves at least two times the F-Score while reducing the runtime by 50%

  • We explore the combination of the seeding parameter τ and the switch threshold ω so that the switch from push to pull in DOLPA happens at iteration ω


Summary

Introduction

The label propagation algorithm (LPA) is a machine learning algorithm for data classification where label information is propagated from labeled to unlabeled entities within a network [42]. Raghavan et al. [28] showed that LPA could be an effective method for identifying communities in networks. As network sizes continue to increase dramatically, fast algorithms are needed to enable large-scale real-time graph processing. LPA has two main advantages over many other community detection algorithms: its run time is nearly linear in the size of the network, and it requires no a priori information about the community structures in a network. Both properties make it practical for graphs with billions of edges.
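
To make the baseline concrete, the following is a minimal Python sketch of standard asynchronous label propagation for community detection, in the spirit of the scheme described by Raghavan et al. [28]. It is not DOLPA and not the authors' OpenMP implementation; the function name `label_propagation`, the adjacency-dictionary input, and the `max_iters` and `seed` parameters are assumptions made for illustration only.

```python
import random
from collections import Counter


def label_propagation(adj, max_iters=100, seed=0):
    """Sketch of asynchronous LPA (hypothetical helper, not the paper's code).

    adj maps each vertex to a list of its neighbours in an undirected graph.
    Returns a dict mapping each vertex to its community label.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}   # every vertex starts in its own community
    order = list(adj)
    for _ in range(max_iters):
        rng.shuffle(order)         # visit vertices in a random order each pass
        changed = False
        for v in order:
            if not adj[v]:
                continue           # isolated vertices keep their own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            # Keep the current label if it is already among the most frequent
            # ones; otherwise break ties uniformly at random.
            if labels[v] not in candidates:
                labels[v] = rng.choice(candidates)
                changed = True
        if not changed:
            break                  # every vertex holds a majority label
    return labels


# Usage: two disconnected triangles should end up as two communities.
two_triangles = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}
labels = label_propagation(two_triangles)
```

Each pass touches every edge a constant number of times, which is where the near-linear run time per iteration comes from, and no community count or other prior information is supplied.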

