Abstract
Content analysis of social networks has become more challenging over the years because of the rapidly increasing amount of data. Social networks are omnipresent in everyday life, which makes the structure of the data they generate more complex. A key task in social network analysis is to reduce the network's size and produce an approximate representation that preserves the original network's properties. This task, known as graph reduction, is attracting increasing attention in the scientific community. A review of the literature reveals diverse methods for addressing it. Some are based on graph coarsening and were developed for community detection. Others belong to graph sampling and are designed to reduce the graph's size while preserving its structure, which is our purpose. In this paper, we put forth a distributed model called DGS ("Distributed Graph Sampling") that generates a sample in a distributed way, so that it can cope with large-scale social networks. The model is based on the MapReduce framework, which allows several data segments to be accessed simultaneously for the computations performed during sampling. Its core step is to sample the graph using a new centrality measure based on degree centrality. We evaluate the performance and scalability of DGS on real-world social networks, and we compare it with four well-known sampling strategies in order to demonstrate how well it preserves the original network's structure.
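To make the idea concrete, the following is a minimal single-process sketch of degree-based sampling organized as a map phase and a reduce phase. It is an illustration under stated assumptions, not the DGS model itself: the abstract does not define the new centrality measure, so plain degree is used as a stand-in, and the function names (`map_edges`, `reduce_degrees`, `sample_by_degree`), the `fraction` parameter, and the top-k selection rule are all hypothetical. Ranking by raw degree is equivalent to ranking by normalized degree centrality, since the normalization divides every degree by the same constant.

```python
from collections import defaultdict
from itertools import chain


def map_edges(segment):
    """Map phase: emit (node, 1) for each endpoint of each edge in a segment."""
    for u, v in segment:
        yield u, 1
        yield v, 1


def reduce_degrees(pairs):
    """Reduce phase: sum the emitted counts per node to obtain its degree."""
    degree = defaultdict(int)
    for node, count in pairs:
        degree[node] += count
    return dict(degree)


def sample_by_degree(edges, fraction, n_segments=2):
    """Keep the top `fraction` of nodes by degree and the edges among them.

    Assumption: plain degree stands in for the paper's centrality measure,
    and the segments are processed sequentially here rather than on a cluster.
    """
    # Split the edge list into segments, as a distributed runtime would.
    segments = [edges[i::n_segments] for i in range(n_segments)]
    pairs = chain.from_iterable(map_edges(s) for s in segments)
    degree = reduce_degrees(pairs)

    # Rank nodes by degree (descending); break ties by node id for determinism.
    k = max(1, round(fraction * len(degree)))
    ranked = sorted(degree, key=lambda n: (-degree[n], n))
    kept = set(ranked[:k])

    # The sample is the subgraph induced by the kept nodes.
    sample_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, sample_edges
```

For example, on the toy edge list `[(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]` with `fraction=0.6`, the sketch keeps nodes 1, 2 and 3 and the three edges among them. In a real MapReduce deployment the map and reduce functions would run on separate workers over separate data segments; only the logic of each phase is shown here.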