Abstract

Models for the processes by which ideas and influence propagate through a social network have been studied in a number of domains, including the diffusion of medical and technological innovations, the sudden and widespread adoption of various strategies in game-theoretic settings, and the effects of “word of mouth” in the promotion of new products. Motivated by the design of viral marketing strategies, Domingos and Richardson posed a fundamental algorithmic problem for such social network processes: if we can try to convince a subset of individuals to adopt a new product or innovation, and the goal is to trigger a large cascade of further adoptions, which set of individuals should we target? We consider this problem in several of the most widely studied models in social network analysis. The optimization problem of selecting the most influential nodes is NP-hard here. The two conference papers upon which this article is based (KDD 2003 and ICALP 2005) provide the first provable approximation guarantees for efficient algorithms. Using an analysis framework based on submodular functions, we show that a natural greedy strategy obtains a solution whose value is provably at least about 63% (a factor of 1 − 1/e) of the optimum for several classes of models; our framework suggests a general approach for reasoning about the performance guarantees of algorithms for these types of influence problems in social networks. We also provide computational experiments on large collaboration networks, showing that in addition to their provable guarantees, our approximation algorithms significantly outperform node-selection heuristics based on the well-studied notions of degree centrality and distance centrality from the field of social networks. The present article is an expanded version of two conference papers which appeared in KDD 2003 and ICALP 2005, respectively.
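
The 63% figure corresponds to the classical bound for greedy maximization of a monotone submodular set function. As a sketch, let f stand for the expected size of the cascade as a function of the targeted set; the symbols f, S, T, v, S_k, and S^* are notation assumed here for illustration and do not appear in the text above.

```latex
% Submodularity (diminishing returns) of a set function f on node sets:
f(S \cup \{v\}) - f(S) \;\ge\; f(T \cup \{v\}) - f(T)
\qquad \text{for all } S \subseteq T \text{ and all nodes } v.

% Greedy guarantee for monotone, non-negative, submodular f,
% with S_k the greedy set of size k and S^* an optimal set of size k:
f(S_k) \;\ge\; \Bigl(1 - \bigl(1 - \tfrac{1}{k}\bigr)^{k}\Bigr) f(S^{*})
\;\ge\; \Bigl(1 - \tfrac{1}{e}\Bigr) f(S^{*}) \;\approx\; 0.632\, f(S^{*}).
```

The second inequality holds because (1 − 1/k)^k ≤ 1/e for every k ≥ 1. When f can only be estimated, for example by simulation, the factor is reduced by a small additive term.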

Highlights

  • A social network—the graph of relationships and interactions within a group of individuals—plays a fundamental role as a medium for the spread of information, ideas, and influence among its members

  • We provide computational experiments on large collaboration networks, showing that in addition to their provable guarantees, our approximation algorithms significantly outperform node-selection heuristics based on the well-studied notions of degree centrality and distance centrality from the field of social networks

  • We define the concrete classes of models for the diffusion of innovations in Section 2 below, departing somewhat from the Domingos-Richardson framework: where their models are essentially descriptive, specifying a joint distribution over all nodes’ behavior in a global sense, we focus on more operational models from mathematical sociology [41, 74] and interacting particle systems [33, 32, 28, 56] that explicitly represent the step-by-step dynamics of adoption

Summary

Introduction

A social network, the graph of relationships and interactions within a group of individuals, plays a fundamental role as a medium for the spread of information, ideas, and influence among its members. If we want to understand the extent to which such ideas are adopted, it can be important to understand how the dynamics of adoption are likely to unfold within the underlying social network: the extent to which people are likely to be affected by decisions of their friends and colleagues, or the extent to which “word-of-mouth” effects will take hold. Such network diffusion processes have a long history of study in the social sciences. We can formally express the Domingos-Richardson style of optimization problem (choosing a good initial set of nodes to target) as follows: the algorithm chooses an initial set A0 = A of active nodes that start the diffusion process. We will show below that, for the models we consider, it is NP-hard to determine the optimum set for influence maximization.
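
The selection problem described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: it assumes an Independent Cascade process with a single uniform activation probability p on every edge, estimates the expected cascade size by repeated simulation, and picks seeds greedily. The function names, the adjacency-dictionary graph format, and the parameters p, trials, and seed are all hypothetical choices made for this example.

```python
import random


def simulate_independent_cascade(graph, seeds, p, rng):
    """Run one cascade trial and return the set of nodes active at the end.

    graph maps each node to an iterable of out-neighbors (assumed format).
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # A newly active node gets one chance to activate each
                # currently inactive neighbor, succeeding with probability p.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(graph, seeds, p, trials, rng):
    """Monte Carlo estimate of the expected number of active nodes."""
    total = 0
    for _ in range(trials):
        total += len(simulate_independent_cascade(graph, seeds, p, rng))
    return total / trials


def greedy_seed_selection(graph, k, p=0.1, trials=200, seed=0):
    """Greedily add the node with the largest estimated spread, k times."""
    rng = random.Random(seed)
    nodes = set(graph)
    for neighbors in graph.values():
        nodes.update(neighbors)
    chosen = []
    for _ in range(min(k, len(nodes))):
        best_node, best_value = None, -1.0
        # Sorting makes tie-breaking deterministic (nodes must be comparable).
        for v in sorted(nodes - set(chosen)):
            value = estimate_spread(graph, chosen + [v], p, trials, rng)
            if value > best_value:
                best_node, best_value = v, value
        chosen.append(best_node)
    return chosen


if __name__ == "__main__":
    toy_graph = {"a": ["b", "c", "d"], "b": [], "e": ["f"]}
    print(greedy_seed_selection(toy_graph, k=2, p=0.5))
```

Each greedy step re-estimates the spread for every remaining candidate, so the cost grows with the number of nodes times the number of trials; the sketch favors clarity over speed.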

Our results
Subsequent work
Models for the diffusion of an innovation
The Threshold model
A General Threshold model
The Submodular Threshold model
The Cascade model
A General Cascade model
The Decreasing Cascade model
Node weights
Equivalence of models and hardness
Decreasing Cascade model and Submodular Threshold model
Inapproximability results and discussion
Approximation algorithm and analysis
The triggering set technique
Independent Cascade
Linear Threshold
Other models
Relationship to the Triggering Model
Proof of submodularity
Non-progressive processes
General marketing strategies
Experiments
The network data
The influence models
The algorithms and implementation
The results
Findings
Conclusions