Abstract
Communities are an important structure on the Web, and discovering them is a challenging task. A common and effective approach is to exhaustively extract dense subgraphs. The pioneering works [1], [2] use a CBG (Complete Bipartite Graph) as the signature of a community core and discover many implicit communities. However, the CBG is too strict and excludes many plausible community structures, so later work adopts a DBG (Dense Bipartite Graph) as the signature instead. For instance, Reddy et al. [3] propose a degree-based (a, s)-density, while Gibson et al. [4] and Dourisboure et al. [5] use a ratio-based γ-dense function to quantify the density of a DBG. In this paper, we analyze these two density measures and point out that, at low density, the resulting bipartite graph may have an unreasonable structure because of cut nodes. For this reason, we introduce the DBGB (Dense Bipartite Graph Block). We then employ a two-step expansion to construct the bipartite graph, which reduces the number of unnecessary nodes and edges. To obtain an optimal bipartite structure, we propose the max DBGB and design an algorithm to extract it. The method is tested on four datasets collected by a Web crawler, from which dense cores are extracted. We inspect 200 randomly sampled cores, and 89 percent of them are meaningful. We also apply Dourisboure's method to one of the datasets at different scales, and the cores it extracts contain many cut nodes. These results indicate that our method is effective.
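To make the cut-node concern concrete, the following is a minimal sketch rather than the algorithm of this paper. It assumes that the ratio-based density of a bipartite graph is the number of edges divided by |fans| × |centers|, which is one common reading and may differ from the exact definitions in [4], [5]. The function names and the toy graph are hypothetical. The example builds two complete bipartite cores that share a single center, so the overall density looks acceptable even though removing that one node splits the graph.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from (fan, center) pairs.

    Assumes fan and center identifiers never collide.
    """
    adj = defaultdict(set)
    for fan, center in edges:
        adj[fan].add(center)
        adj[center].add(fan)
    return adj


def ratio_density(edges):
    """Assumed ratio-based density: |E| / (|fans| * |centers|)."""
    fans = {fan for fan, _ in edges}
    centers = {center for _, center in edges}
    return len(set(edges)) / (len(fans) * len(centers))


def is_connected(adj, removed=None):
    """Depth-first check that all nodes except `removed` are mutually reachable."""
    nodes = [n for n in adj if n != removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        node = stack.pop()
        for neighbor in adj[node]:
            if neighbor != removed and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(nodes)


def cut_nodes(edges):
    """Nodes whose removal disconnects the graph (brute force, small graphs only).

    Assumes the input graph is connected to begin with.
    """
    adj = build_adjacency(edges)
    return {n for n in adj if not is_connected(adj, removed=n)}


# Toy graph (hypothetical): fans f1, f2 point to c1, c2, c3;
# fans f3, f4 point to c3, c4, c5. The two cores share only c3.
edges = [(f, c) for f in ("f1", "f2") for c in ("c1", "c2", "c3")]
edges += [(f, c) for f in ("f3", "f4") for c in ("c3", "c4", "c5")]

print(ratio_density(edges))  # 12 edges / (4 fans * 5 centers) = 0.6
print(cut_nodes(edges))      # {'c3'}
```

Under this assumed measure the toy graph passes a density threshold of 0.5, yet it is really two separate cores joined at c3. This is the kind of structure that a block-based signature such as the DBGB is meant to rule out.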