Abstract

With the rapid development of social media, mining social community structure has become a popular research field in recent years. However, traditional community mining methods cannot effectively handle data from large-scale networks. In this paper, we first introduce a community mining model based on information compression and use it to transform the community mining problem into an optimal information coding problem. We then propose InfoMR, a parallel computing method built on the MapReduce framework, to mine social community structure. In InfoMR, map tasks split the network data into many subsets, each reduce task clusters the communities in its subset by loop iteration, and the results of the reduce phase are finally merged into the output. Theoretical analysis and experiments verify the validity of this work. The accuracy experiments show that the accuracy of InfoMR is much higher than that of the Fast GN and PDST algorithms. The performance experiments on two real datasets and two simulated datasets show that InfoMR can mine social communities in a relatively short time on big-data social networks.
