Abstract

With the explosive growth in the scale of complex networks, existing community detection algorithms can no longer analyze community structure quickly enough. This paper proposes a new algorithm for detecting communities in complex networks on the Hadoop platform, called Community Detection on Hadoop (CDOH). Building on the idea of modularity increment, the algorithm merges communities in parallel to detect community structure in complex networks quickly and accurately. Extensive experiments on three real complex-network datasets show that CDOH significantly improves on the efficiency of current memory-based community detection algorithms without reducing the accuracy of community detection.

Highlights

  • In the era of Web 2.0, objects are connected to each other by technologies such as the Internet and the Internet of Things, forming a variety of complex networks such as interpersonal interaction, paper citation, transportation, and protein interaction networks

  • We propose a new complex network community detection algorithm based on the Hadoop framework, called Community Detection on Hadoop (CDOH)

  • To overcome the shortcomings of traditional and parallel community detection algorithms, CDOH is designed to detect complex network community structure quickly and accurately


Summary

Introduction

In the era of Web 2.0, objects are connected to each other by technologies such as the Internet and the Internet of Things, forming a variety of complex networks such as interpersonal interaction, paper citation, transportation, and protein interaction networks. The large-scale complex network data sets generated by such platforms are characterized by node diversity, complex structure, and multi-complexity fusion, which challenge the accuracy of traditional complex network community detection algorithms. Based on the idea of maximum modularity, and exploiting the distributed characteristics of the Hadoop platform, a new modularity matrix update method is proposed and a corresponding community merging strategy is constructed to detect complex network community structures quickly and accurately. We analyze the proposed CDOH algorithm theoretically and show that its computational cost can reach O(n) when enough parallel nodes are used. Experimental results on three real datasets demonstrate that CDOH significantly outperforms the traditional complex network community detection algorithm in both the efficiency and the accuracy of community detection.
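
To make the idea of merging by maximum modularity increment concrete, the following is a minimal serial sketch in Python. It is not the CDOH implementation: it assumes the standard Newman-style increment ΔQ = 2(e_ij − a_i a_j) for an undirected simple graph, and the function name `greedy_modularity_merge` and its data layout are hypothetical choices made for illustration. The modularity matrix update and the Hadoop-based parallel merging strategy described in the paper are not reproduced here.

```python
from collections import defaultdict


def greedy_modularity_merge(edges):
    """Greedy agglomeration by maximum modularity increment (serial sketch).

    Assumes an undirected simple graph (no self-loops, no duplicate edges)
    whose node labels are mutually comparable, e.g. all ints.
    Returns (list of node sets, final modularity Q).
    """
    m = len(edges)
    nodes = sorted({v for edge in edges for v in edge})
    members = {v: {v} for v in nodes}  # community id -> its nodes

    a = defaultdict(float)  # a[c]: fraction of edge ends attached to c
    e = defaultdict(float)  # e[(c, d)], c < d: one-directional link fraction
    for u, v in edges:
        a[u] += 1 / (2 * m)
        a[v] += 1 / (2 * m)
        e[(min(u, v), max(u, v))] += 1 / (2 * m)

    # With every node in its own community, Q = -sum(a_i^2).
    q = -sum(x * x for x in a.values())

    while True:
        # Find the connected pair with the largest positive increment.
        # Each pair's increment is independent of the others, so this scan
        # is the part that could in principle be distributed.
        best_pair, best_dq = None, 1e-12
        for (c, d), e_cd in e.items():
            dq = 2 * (e_cd - a[c] * a[d])
            if dq > best_dq:
                best_pair, best_dq = (c, d), dq
        if best_pair is None:
            break  # no merge increases modularity any further

        c, d = best_pair
        members[c] |= members.pop(d)  # merge community d into c
        a[c] += a.pop(d)
        q += best_dq

        # Relabel d as c in the link table and drop now-internal links.
        merged = defaultdict(float)
        for (x, y), w in e.items():
            x2 = c if x == d else x
            y2 = c if y == d else y
            if x2 != y2:
                merged[(min(x2, y2), max(x2, y2))] += w
        e = merged

    return list(members.values()), q
```

As a usage example, calling the function on two triangles joined by a single bridge edge, `[(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]`, separates the two triangles into their own communities. The scan over candidate pairs is rerun from scratch after every merge to keep the sketch short; a practical implementation would update only the affected increments.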

Related Works
Definitions
The CDOH Algorithm
Parameter Initialization
Find the Maximum Modularity Increment
Merging and Updating Communities
Generating Community Discovery Results
Datasets and Evaluation Algorithms
Analysis of Community Detection Accuracy
Analysis of Community Detection Efficiency
Conclusions
Future Works