Abstract

Influence maximization is the problem of finding the set of nodes of a network that maximizes the size of the outbreak of a spreading process occurring on the network. Solutions to this problem are important for strategic decisions in marketing and political campaigns. The typical setting consists of identifying small sets of initial spreaders in very large networks. This setting makes the optimization problem computationally infeasible for standard greedy optimization algorithms that account simultaneously for network topology and spreading dynamics, leaving room only for heuristic methods based on the drastic approximation of relying on the geometry of the network alone. The literature on the subject abounds with purely topological methods for the identification of influential spreaders in networks. However, it is unclear how far these methods are from being optimal. Here, we perform a systematic test of the performance of a multitude of heuristic methods for the identification of influential spreaders. We quantify the performance of the various methods on a corpus of 100 real-world networks. The corpus consists of networks small enough for greedy optimization to be applied, so the results of this algorithm serve as the baseline against which the other methods are compared on the same corpus. We find that relatively simple network metrics, such as adaptive degree or closeness centrality, achieve performance very close to the baseline value, thus providing good support for the use of these metrics in large-scale problem settings. We also show that a further 2–5% improvement towards the baseline performance is achievable by hybrid algorithms that combine two or more topological metrics. This final result is validated on a small collection of large graphs where greedy optimization is not applicable.
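To make the idea of a purely topological heuristic concrete, the following is a minimal sketch of one common reading of "adaptive degree": pick the node with the largest current degree, remove it from the network, recompute degrees, and repeat. The function name `adaptive_degree_seeds` and the adjacency-dictionary input are assumptions made for illustration; the abstract does not specify the authors' implementation.

```python
def adaptive_degree_seeds(adj, k):
    """Pick k seeds by repeatedly taking the highest-degree node and removing it.

    `adj` is a hypothetical adjacency dict {node: iterable of neighbors}
    describing an undirected graph.
    """
    # Work on a copy so the caller's graph is left untouched.
    remaining = {u: set(vs) for u, vs in adj.items()}
    seeds = []
    for _ in range(min(k, len(remaining))):
        # Node with the largest degree in the reduced graph
        # (ties go to the first node in iteration order).
        best = max(remaining, key=lambda u: len(remaining[u]))
        seeds.append(best)
        # Remove the chosen node so neighbors' degrees are recomputed.
        for v in remaining.pop(best):
            if v in remaining:
                remaining[v].discard(best)
    return seeds
```

On a graph where node 0 is a hub linked to nodes 1, 2 and 3, and node 3 continues into a chain 3-4-5, the sketch selects the hub first and then node 4. A static degree ranking would pick node 3 instead, even though its degree drops to 1 once the hub is removed.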

Highlights

  • Influence maximization is the problem of finding the set of nodes of a network that maximizes the size of the outbreak of a spreading process occurring on the network

  • Armed with the metrics defined in the section above, we test the various methods for the identification of influential spreaders for Independent Cascade Model (ICM) dynamics over the entire corpus of real networks at our disposal

  • We focus our attention on ICM dynamics around the critical threshold


Summary

Introduction

Influence maximization is the problem of finding the set of nodes of a network that maximizes the size of the outbreak of a spreading process occurring on the network. After the seminal work by Kempe et al., other similar greedy techniques for approximating solutions to the influence maximization problem have been proposed [11,12,13,14]. Because all these algorithms require knowledge of the model underlying the spreading process, often obtained through numerical simulations, they are applicable only to small and medium-sized networks. We show that one way to achieve better performance is to rely on hybrid methods that combine two or more centrality metrics. We validate this final result on a small set of large-scale networks.
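The cost of the greedy approach can be seen in a minimal sketch, given here under stated assumptions rather than as the authors' implementation. It estimates the average outbreak size of the Independent Cascade Model by Monte Carlo simulation and adds, one at a time, the node that most increases that estimate. The function names, the single spreading probability `p` shared by all edges, and the default of 200 runs are illustrative choices.

```python
import random

def simulate_icm(adj, seeds, p, rng):
    """Run one Independent Cascade realization and return the outbreak size."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        new_frontier = []
        for u in frontier:
            for v in adj[u]:
                # A newly active node gets one chance to activate each
                # still-inactive neighbor, succeeding with probability p.
                if v not in active and rng.random() < p:
                    active.add(v)
                    new_frontier.append(v)
        frontier = new_frontier
    return len(active)

def average_outbreak(adj, seeds, p, runs, rng):
    """Monte Carlo estimate of the mean outbreak size for a seed set."""
    return sum(simulate_icm(adj, seeds, p, rng) for _ in range(runs)) / runs

def greedy_seeds(adj, k, p, runs=200, seed=0):
    """Greedily build a seed set of size k by maximizing the estimated outbreak."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        best_node, best_size = None, -1.0
        for node in adj:
            if node in chosen:
                continue
            size = average_outbreak(adj, chosen + [node], p, runs, rng)
            if size > best_size:
                best_node, best_size = node, size
        chosen.append(best_node)
    return chosen
```

Each greedy step simulates the spreading process `runs` times for every candidate node, so the total work grows roughly as k times the number of nodes times the cost of the simulations. This is why the approach is limited to small and medium-sized networks.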

