Abstract

In this paper, we propose a novel overlapping community detection algorithm based on an ensemble approach with a distributed neighbourhood threshold method (EnDNTM). EnDNTM starts from pre-partitioned disjoint communities generated by the ensemble mechanism and then analyzes the neighbourhood distribution of boundary nodes in those disjoint communities to detect overlapping communities. It is a form of seed-based global method: boundary nodes are treated as seeds and become the starting points for detecting overlapping communities. For each boundary node, a threshold value sets the minimum influence that the node's neighbours must exert for the node to be assigned to a community. The effectiveness of EnDNTM was evaluated on five synthetic benchmark datasets and fifteen real-world datasets, and its performance was compared with that of seven overlapping community detection algorithms. The F1-score, overlapping normalized mutual information (ONMI) and extended modularity (Q_ov) metrics were used to measure the quality of the detected communities. EnDNTM outperforms the compared algorithms on 4 out of 5 synthetic benchmark datasets and 11 out of 15 real-world datasets, and gives comparable results on the remaining datasets. Experiments on these synthetic and real-world datasets show that, for a majority of datasets, the proposed ensemble-based distributed neighbourhood threshold method is able to select the best disjoint clusters from those produced by a collection of disjoint methods and use them to detect overlapping communities.
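The boundary-node step described above can be illustrated with a minimal sketch. This is not the EnDNTM implementation: the function name `overlapping_from_disjoint`, the adjacency-dictionary input format, and in particular the per-node threshold rule (one divided by the number of distinct communities among a node's neighbours) are assumptions made for illustration only, since the abstract does not specify how the threshold is computed.

```python
from collections import Counter


def overlapping_from_disjoint(adj, membership, threshold=None):
    """Expand a disjoint partition into overlapping communities.

    adj        : dict mapping each node to the set of its neighbours
    membership : dict mapping each node to its disjoint community label
    threshold  : optional fixed threshold; if None, a hypothetical
                 per-node rule is used (an assumption, not from the paper)
    """
    # Start from the disjoint communities.
    communities = {}
    for node, comm in membership.items():
        communities.setdefault(comm, set()).add(node)

    for node, neighbours in adj.items():
        if not neighbours:
            continue
        # Neighbourhood distribution: neighbours counted per community.
        counts = Counter(membership[n] for n in neighbours)
        # Only boundary nodes (some neighbour in another community) are seeds.
        if all(comm == membership[node] for comm in counts):
            continue
        # Hypothetical per-node threshold: 1 / number of neighbouring communities.
        t = threshold if threshold is not None else 1.0 / len(counts)
        for comm, k in counts.items():
            # The node joins every community whose share of its neighbours
            # reaches the threshold.
            if k / len(neighbours) >= t:
                communities[comm].add(node)
    return communities


# Two triangles bridged by node 3, which the disjoint partition puts in "A".
adj = {
    0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3},
    3: {1, 2, 4, 5},
    4: {3, 5, 6}, 5: {3, 4, 6}, 6: {4, 5},
}
membership = {0: "A", 1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}

result = overlapping_from_disjoint(adj, membership)
print(sorted(result["A"]))  # [0, 1, 2, 3]
print(sorted(result["B"]))  # [3, 4, 5, 6]
```

In this toy graph, node 3 has half of its neighbours in each community, so under the assumed threshold it is assigned to both, while nodes 4 and 5 have only one third of their neighbours in "A" and stay in "B" alone.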
