Information cascades are pervasive in mobile social networking applications, which makes identifying a small set <inline-formula><tex-math notation="LaTeX">$S$</tex-math></inline-formula> of influential users, widely believed to trigger information outbreaks, a crucial issue in applications such as mobile advertising and viral marketing. Formulated as influence maximization (IM) in 2003, this NP-hard problem has been studied extensively from diverse angles. However, existing works are often unable to provide reliable solutions, because state-of-the-art sampling-based IM schemes lack an exact metric for evaluating users’ contributions to information cascading. In this paper, we evaluate users in IM based on collective influence (CI), a metric defined on the structural features of users in the network graph that reflects how users’ neighborhoods contribute to shaping the collective dynamics of users over the whole network. To identify influencers under a probabilistic diffusion model based on CI, we specify a quantified structural feature of the most influential users from the perspective of diffusion over the whole network, and reveal that the structural influence power (CI value) of each user is a weighted accumulation of the diffusion probabilities from neighbors within a certain number of hops. Using CI, we design a novel algorithm that identifies influencers by iteratively choosing the users with the top CI values. Moreover, we point out that directly computing CI values requires traversing the network, which is originally represented by a high-dimensional matrix, and therefore makes influencer identification highly complex. 
To improve scalability, we further trade precision for efficiency by incorporating network embedding, a dimensionality reduction technique for networks, into the algorithm design, and propose a minor variant in which CI is jointly captured by low-dimensional user representations and user degrees. The superiority of our algorithms is empirically validated on 8 datasets, with an increase in influence size of up to 50 percent and a running time comparable to or even lower than that of existing baselines.
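
The iterative selection described above can be illustrated with a minimal sketch. The abstract does not state the CI formula, so the score used here is only an assumed stand-in: for each user it sums the products of edge probabilities along breadth-first paths to users within a fixed number of hops. The names `ci_like_score`, `select_influencers`, `max_hops`, and `toy_graph` are hypothetical and chosen for illustration only.

```python
from collections import deque


def ci_like_score(graph, source, max_hops, excluded):
    """Sum path probabilities from `source` to users within `max_hops` hops.

    Illustrative stand-in for a CI value, NOT the paper's definition.
    """
    reach = {source: 1.0}  # probability along the BFS-tree path to each user
    queue = deque([(source, 0)])
    score = 0.0
    while queue:
        user, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor, prob in graph.get(user, {}).items():
            if neighbor in reach or neighbor in excluded:
                continue  # skip users already reached or already selected
            reach[neighbor] = reach[user] * prob
            score += reach[neighbor]
            queue.append((neighbor, depth + 1))
    return score


def select_influencers(graph, k, max_hops=2):
    """Repeatedly pick the remaining user with the highest score."""
    selected = []
    for _ in range(k):
        chosen = set(selected)
        candidates = [u for u in graph if u not in chosen]
        if not candidates:
            break
        # Scores are recomputed each round with chosen users excluded.
        best = max(
            candidates,
            key=lambda u: ci_like_score(graph, u, max_hops, chosen),
        )
        selected.append(best)
    return selected


# Toy directed graph: graph[u][v] is the assumed probability that u activates v.
toy_graph = {
    "a": {"b": 0.5, "c": 0.5},
    "b": {"d": 0.5},
    "c": {"d": 0.5},
    "d": {},
    "e": {"a": 0.2},
}
seeds = select_influencers(toy_graph, k=2)
```

Every round in this sketch rescans the neighborhood of every remaining user, which hints at the traversal cost that motivates the embedding-based variant; the sketch does not attempt to reproduce that variant.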