Abstract

With the growth of the Internet, online social networks and websites generate large amounts of data. At the same time, several distributed systems, typified by Hadoop, have been proposed to handle massive data. These systems provide an efficient and convenient way to implement many kinds of algorithms. Community detection, a long-established research area, now faces the challenge of Big Data. Building on GraphLab, a powerful distributed graph processing system, we redesign and implement several classical community detection algorithms and run them on very large real-life datasets. Using the node similarity parameter Adj Page Sim, we propose a new community detection algorithm based on label propagation, named NSLPA. Experiments and benchmarks reveal that several otherwise powerful algorithms perform poorly in distributed environments. NSLPA, however, is both faster and more accurate than other community detection algorithms. It can process a graph with 60 million nodes and 2 billion edges in less than 1000 seconds with relatively high accuracy.
