Abstract
Discovering community structure helps us better understand the capabilities and functions of a network. However, many existing methods fail to assign nodes to communities accurately. In this paper, we propose a heuristic community detection method based on node similarity, which is computed by assigning different edge weight influence factors to the different neighbor types of a node. Concretely, for an arbitrarily chosen pair of nodes, we first find the common neighbors of the pair and their corresponding neighbors. Then, different edge weight influence factors are assigned according to the impact that each type of neighbor has on node similarity. Finally, the similarity of the node pair is calculated as the proportion of the edge weight influence factors related to the pair. In addition, we introduce a hash-table-based data storage and retrieval strategy with a lower conflict rate, which hashes the edge information into a ternary bucket structure whose buckets can be merged when they share a starting node. This reduces the time complexity of a data query to a constant level and enables parallel computation of node similarity. Once the similarity of a node pair is obtained, we merge nodes into communities by heuristic hierarchical clustering, and the final community structure is obtained once all node similarities have been calculated. Comparison tests against other methods on benchmark networks with ground-truth communities show that the proposed method performs better in both identification accuracy and time efficiency.
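The merging step can be illustrated with a small sketch. The abstract does not state the exact merge rule, so the code below assumes a simple one: node pairs are visited in order of decreasing similarity and joined when the similarity reaches a threshold. The function name `merge_by_similarity`, the `threshold` parameter, and the union-find bookkeeping are all hypothetical choices made for illustration, not the authors' procedure.

```python
def merge_by_similarity(nodes, similarities, threshold=0.5):
    """Greedily merge node pairs whose similarity meets the threshold.

    `similarities` maps (u, v) tuples to a similarity score.
    The threshold rule is an assumption made for this sketch.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the community representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Visit the most similar pairs first.
    for (u, v), score in sorted(similarities.items(), key=lambda kv: -kv[1]):
        if score >= threshold:
            root_u, root_v = find(u), find(v)
            if root_u != root_v:
                parent[root_v] = root_u

    # Group nodes by their representative.
    communities = {}
    for n in nodes:
        communities.setdefault(find(n), set()).add(n)
    return sorted(sorted(members) for members in communities.values())


nodes = [1, 2, 3, 4, 5]
similarities = {(1, 2): 0.9, (2, 3): 0.8, (3, 4): 0.1, (4, 5): 0.7}
communities = merge_by_similarity(nodes, similarities)
```

With these toy scores, nodes 1, 2, and 3 end up in one community and nodes 4 and 5 in another, because the weak link between 3 and 4 falls below the assumed threshold.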
Highlights
Nature is a complex system of mutual interaction and polymorphism
We analyze a likely reason for the improvement: the similarity calculation fully considers the common neighbors of each node pair and their corresponding neighbors, so it captures the local topological information of each node pair better than methods that do not fully consider these characteristics
A method for evaluating node similarity is introduced that assigns different edge weight influence factors according to the impact of different neighbor types on node similarity
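As a rough illustration of how neighbor types might enter a similarity score, the sketch below splits the edges around a node pair into three kinds (the direct edge, edges to common neighbors, and edges to neighbors that only one node has) and takes the weighted share of the first two. The three factor values, the neighbor categories, and the function name `node_similarity` are assumptions for this example; the page does not give the paper's actual formula.

```python
def node_similarity(adj, u, v, f_direct=1.0, f_common=1.0, f_other=0.5):
    """Similarity of u and v as a weighted proportion of their edges.

    `adj` maps each node to the set of its neighbors.
    The influence factors f_* are hypothetical placeholder values.
    """
    neighbors_u = adj[u] - {v}
    neighbors_v = adj[v] - {u}
    common = neighbors_u & neighbors_v
    exclusive = (neighbors_u | neighbors_v) - common
    direct = 1 if v in adj[u] else 0

    # Each common neighbor contributes two edges: one to u and one to v.
    shared = f_direct * direct + f_common * 2 * len(common)
    total = shared + f_other * len(exclusive)
    return shared / total if total else 0.0


adj = {1: {2, 3, 4}, 2: {1, 3, 5}, 3: {1, 2}, 4: {1}, 5: {2}}
sim_12 = node_similarity(adj, 1, 2)
sim_45 = node_similarity(adj, 4, 5)
```

Here nodes 1 and 2 are adjacent and share neighbor 3, so their score is high, while nodes 4 and 5 share nothing and score zero. Changing the factors changes how strongly each neighbor type pulls the score.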
Summary
Nature is a complex system of mutual interaction and polymorphism. What such systems have in common is a relatively complicated internal structure that can be mapped onto a nonlinear data structure such as a graph (or network). The proposed rule can be extended to weighted networks with overlapping community structure. Based on this measurement of node similarity, the parallel heuristic community detection method proposed in this paper can be applied to one-dimensional network models in most cases. 2) In the process of evaluating node similarity, we propose a hash-table-based data storage and retrieval strategy with a lower conflict rate. It enables parallel computation of node similarity and greatly improves computational efficiency. The rest of the paper is organized as follows: Section II introduces the work related to our study; Section III describes the research strategies behind the proposed method, including the node similarity criteria, the hash-table-based parallel computation of node similarity, and the main framework of the algorithm; Section IV presents the experimental results of the proposed method, including the metrics, materials, and algorithm evaluation; Section V discusses the experimental results in detail in terms of detection accuracy and computational efficiency; and Section VI concludes our work.
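The storage idea can be sketched as follows, assuming that the "ternary" structure refers to (start node, end node, weight) triples grouped into one bucket per start node. A Python dictionary stands in for the hash table, and the per-pair score used here (summed weight of edges to common neighbors) is only a placeholder for the paper's similarity measure. The names `build_buckets`, `common_weight`, and `score_pairs` are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_buckets(edges):
    """Hash (start, end, weight) triples into one bucket per start node."""
    buckets = defaultdict(dict)
    for u, v, w in edges:
        buckets[u][v] = w  # triples with the same start node share a bucket
        buckets[v][u] = w  # store both directions for an undirected graph
    return buckets


def common_weight(buckets, pair):
    """Sum the weights of edges from both nodes to their common neighbors."""
    u, v = pair
    bucket_u, bucket_v = buckets.get(u, {}), buckets.get(v, {})
    # Scan the smaller bucket; each membership test is an average O(1) lookup.
    small, large = (bucket_u, bucket_v) if len(bucket_u) <= len(bucket_v) else (bucket_v, bucket_u)
    return sum(small[n] + large[n] for n in small if n in large)


def score_pairs(buckets, pairs, workers=4):
    """Score node pairs independently, so the work can be split across workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(lambda p: common_weight(buckets, p), pairs)
        return dict(zip(pairs, scores))


edges = [(1, 2, 1.0), (1, 3, 2.0), (2, 3, 3.0), (3, 4, 1.0)]
buckets = build_buckets(edges)
scores = score_pairs(buckets, [(1, 2), (1, 4), (2, 4)])
```

Because each pair only reads from the buckets, the pairs can be scored in any order. Threads are used here only to show the structure; in CPython, CPU-bound work of this kind would need processes or native code to run truly in parallel.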