Stability and Robustness in Influence Maximization

Xinran He,David Kempe

doi:10.1145/3233227

Abstract

In the well-studied Influence Maximization problem, the goal is to identify a set of k nodes in a social network whose joint influence on the network is maximized. A large body of recent work has justified research on Influence Maximization models and algorithms with their potential to create societal or economic value. However, in order to live up to this potential, the algorithms must be robust to large amounts of noise, for they require quantitative estimates of the influence, which individuals exert on each other; ground truth for such quantities is inaccessible, and even decent estimates are very difficult to obtain. We begin to address this concern formally. First, we exhibit simple inputs on which even very small estimation errors may mislead every algorithm into highly suboptimal solutions. Motivated by this observation, we propose the Perturbation Interval model as a framework to characterize the stability of Influence Maximization against noise in the inferred diffusion network. Analyzing the susceptibility of specific instances to estimation errors leads to a clean algorithmic question, which we term the Influence Difference Maximization problem. However, the objective function of Influence Difference Maximization is NP-hard to approximate within a factor of O ( n 1−ϵ) for any ϵ > 0. Given the infeasibility of diagnosing instability algorithmically, we focus on finding influential users robustly across multiple diffusion settings. We define a Robust Influence Maximization framework wherein an algorithm is presented with a set of influence functions. The algorithm’s goal is to identify a set of k nodes who are simultaneously influential for all influence functions, compared to the (function-specific) optimum solutions. We show strong approximation hardness results for this problem unless the algorithm gets to select at least a logarithmic factor more seeds than the optimum solution. However, when enough extra seeds may be selected, we show that techniques of Krause et al. can be used to approximate the optimum robust influence to within a factor of 1−1/ e . We evaluate this bicriteria approximation algorithm against natural heuristics on several real-world datasets. Our experiments indicate that the worst-case hardness does not necessarily translate into bad performance on real-world datasets; all algorithms perform fairly well.

Full Text