Abstract

In this paper, we study the impact of data complexity and data quality on the overlapping community detection problem. We show that community detection algorithms are highly unstable when the data are incomplete or erroneous, and that this result holds across all the performance metrics evaluated. We verify it with three quality metrics (F1, NMI, and Omega), which apply when the ground-truth community structure is known, on four popular and representative detection algorithms: Order Statistics Local Optimization Method (OSLOM), Greedy Clique Expansion (GCE), Speaker-listener Label Propagation Algorithm (SLPA), and Cluster Affiliation Model for Big Networks (BIG-CLAM). We evaluate the algorithms on a set of real instances that arise from detecting the courses that belong to the different careers (degrees) of an engineering university, and on large benchmark sets of synthetic instances frequently used in the literature.
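
To make the ground-truth comparison concrete, the following is a minimal sketch of one common way to score a detected overlapping cover against a known one: the average of best-match F1 scores taken in both directions. The abstract does not state which F1 variant is used, so this formulation, the function names `f1` and `average_f1`, and the toy node sets are all assumptions made for illustration.

```python
from typing import List, Set


def f1(reference: Set[int], candidate: Set[int]) -> float:
    """F1 score of `candidate` against `reference` (both are node sets)."""
    overlap = len(reference & candidate)
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def average_f1(truth: List[Set[int]], detected: List[Set[int]]) -> float:
    """Symmetric average of best-match F1 between two covers.

    Assumed formulation (not taken from the paper): each community is
    matched to its best-scoring counterpart in the other cover, and the
    two directional averages are averaged.
    """
    if not truth or not detected:
        return 0.0
    truth_to_detected = sum(max(f1(t, d) for d in detected) for t in truth) / len(truth)
    detected_to_truth = sum(max(f1(d, t) for t in truth) for d in detected) / len(detected)
    return 0.5 * (truth_to_detected + detected_to_truth)


# Hypothetical toy covers: node 4 belongs to both ground-truth communities.
truth = [{1, 2, 3, 4}, {4, 5, 6}]
perfect = [{1, 2, 3, 4}, {4, 5, 6}]
# A slightly wrong detection: node 4 is missing from the first community
# and a spurious node 7 is added to the second.
noisy = [{1, 2, 3}, {4, 5, 6, 7}]

perfect_score = average_f1(truth, perfect)
noisy_score = average_f1(truth, noisy)
```

In this toy case, moving a single node out of one community and adding one spurious node to the other is already enough to lower the score from 1.0 to 6/7, which hints at how such a metric reacts to small errors in the detected structure.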
