Abstract

Identifying the influential spreaders (the important nodes) in a real-world network is a problem of both theoretical interest and practical importance, with applications such as accelerating information diffusion, controlling the spread of a disease, and improving the resilience of networks to external attacks. In this paper, we propose a graph exploration sampling method that accurately identifies the influential spreaders in a complex network without any prior knowledge of the original graph beyond the collected samples (subgraphs). The method explores the graph following a deterministic selection rule and outputs a graph sample: the set of edges that have been crossed. It is based on a version of the Rank Degree graph sampling algorithm. We conduct extensive experiments on eight real-world networks by simulating the susceptible-infected-recovered (SIR) and susceptible-infected-susceptible (SIS) epidemic models, which serve as ground-truth identifiers of the nodes' spreading efficiency. Experimentally, we show that by exploring only 20% of the network and using the degree centrality as well as the k-core measure, we are able to identify the influential spreaders with at least the same accuracy as in the full-information case, that is, the case where we have access to the original graph and compute the centrality measures on it. Finally, and more importantly, we present strong evidence that the degree centrality computed on the collected samples (the degree of each node in the sample) is almost as accurate as the k-core values obtained from the original graph.
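To make the sampling idea concrete, the following is a minimal sketch of a Rank Degree style exploration with a deterministic "move to the highest-degree neighbour" rule. It is not the authors' exact algorithm: the function names `rank_degree_sample` and `sample_degree`, the parameters `num_seeds` and `target_fraction`, the tie-breaking by node label, the restart rule at dead ends, and the choice to stop once a fraction of the nodes has been visited are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def rank_degree_sample(adj, num_seeds, target_fraction, rng=None):
    """Explore an undirected graph and return the set of crossed edges.

    adj maps each node to the set of its neighbours. Node labels are
    assumed to be mutually comparable (e.g. all ints).
    """
    rng = rng or random.Random(0)
    # Work on a copy: crossed edges are removed from the working graph.
    work = {u: set(vs) for u, vs in adj.items()}
    target = max(1, int(target_fraction * len(adj)))
    seeds = rng.sample(sorted(adj), num_seeds)  # random initial seeds
    sample_edges, sample_nodes = set(), set()

    while len(sample_nodes) < target:
        new_seeds = set()
        for u in seeds:
            if not work[u]:
                continue  # this seed has no unexplored edges left
            # Deterministic rule (assumed): go to the neighbour with the
            # highest remaining degree, ties broken by node label.
            v = max(work[u], key=lambda w: (len(work[w]), w))
            sample_edges.add((min(u, v), max(u, v)))
            sample_nodes.update((u, v))
            work[u].discard(v)
            work[v].discard(u)
            new_seeds.add(v)
        if not new_seeds:
            # Dead end (assumed handling): restart from nodes with edges.
            candidates = sorted(n for n in work if work[n])
            if not candidates:
                break  # every edge has been crossed
            new_seeds = set(rng.sample(candidates,
                                       min(num_seeds, len(candidates))))
        seeds = sorted(new_seeds)
    return sample_edges


def sample_degree(sample_edges):
    """Degree of each node inside the sample (not in the original graph)."""
    deg = defaultdict(int)
    for u, v in sample_edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)
```

Ranking nodes by `sample_degree(rank_degree_sample(adj, 2, 0.2))` would then give a candidate list of spreaders that uses only the edges seen during exploration, which is the kind of sample-only ranking the abstract compares against centralities computed on the full graph.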

Highlights

  • Understanding spreading processes in real-world complex networks is a central subject in network analysis because of its many applications, such as controlling the spread of a disease, viral marketing, and assessing network vulnerability to external attacks

  • A first preliminary study of the application of graph sampling to the influential spreaders identification problem was conducted in Salamanos et al. (2016), where we studied the effectiveness of Rank Degree as an identifier of influential spreaders

  • We present strong evidence that the degree centrality computed on the collected samples (the degree of each node in the sample) is almost as accurate as the k-core measure computed on the original graph


Summary

Introduction

Understanding spreading processes in real-world complex networks is a central subject in network analysis because of its many applications, such as controlling the spread of a disease, viral marketing, and assessing network vulnerability to external attacks. A key role in these processes is played by nodes with high spreading efficiency, often called influential spreaders: the nodes most likely to spread information or a virus to a large part of the network. Highly connected nodes are not always the best spreaders, whereas less connected nodes that are nevertheless well connected to the core of the network may strongly affect the spreading process.
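The contrast between degree and position in the core can be illustrated with a small k-core computation. The sketch below is a plain peeling procedure written for illustration; the helper names `build_adj` and `core_numbers` and the toy graph (a hub with six leaves attached to a 4-clique) are assumptions, not taken from the paper.

```python
def build_adj(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def core_numbers(adj):
    """k-core index of every node, by repeatedly peeling the node
    with the smallest remaining degree."""
    degree = {u: len(vs) for u, vs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        u = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[u])  # the core index never decreases
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
    return core


# Toy graph: a 4-clique {a, b, c, d}, and a hub h with six leaves
# that touches the clique only through node a.
clique = [("a", "b"), ("a", "c"), ("a", "d"),
          ("b", "c"), ("b", "d"), ("c", "d")]
hub = [("h", "l%d" % i) for i in range(1, 7)] + [("h", "a")]
adj = build_adj(clique + hub)
core = core_numbers(adj)

print(len(adj["h"]), core["h"])  # 7 1  (highest degree, outermost shell)
print(len(adj["b"]), core["b"])  # 3 3  (lower degree, innermost core)
```

Here the hub has the largest degree but sits in the 1-shell, while the clique nodes have fewer links but belong to the 3-core, which is the situation the paragraph above describes.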

