Cyber threat intelligence (CTI) researchers strive to uncover collaborations and emerging techniques within hacker networks. This study proposes an empirical approach to detecting communities within hacker forums for CTI purposes. Eighteen algorithms are systematically evaluated, including state-of-the-art and benchmark methods for identifying overlapping and disjoint groups. Using discussions from five prominent English-language hacker forums, a comparative analysis examines how the algorithms’ theoretical foundations influence community detection. Since ground truth is unattainable for such networks, the study uses a multi-metric strategy that incorporates modularity, coverage, performance, and a newly introduced quality measure, Triplet Hub Potential, which quantifies the presence of influential hubs. The findings reveal that while modularity optimization algorithms such as Leiden and Louvain deliver consistent results, neighbor-based expanding techniques tend to provide superior performance. In particular, the Expansion algorithm stands out by uncovering granular hierarchical community structures. The ability to examine these fine-grained structures is valuable for CTI researchers. Ultimately, we suggest an approach to investigating hacker forums using community detection methods and encourage the future development of algorithms tailored to expose nuances within hacker networks.