Network completion by leveraging similarity of nodes

Rana Forsati,Iman Barjasteh,Dennis Ross,Hayder Radha,Abdol-Hossein Esfahanian

doi:10.1007/s13278-016-0405-2

Abstract

The analysis of social networks has attracted much attention in recent years. Link prediction is an important aspect of social network analysis and an area of key research within that is the network completion problem, where it is assumed that only a small sample of a network (e.g., a complete or partially observed subgraph of a social graph) is observed and we would like to infer the unobserved part of the network. In a typical network completion problem the standard methods, such as matrix completion, are inapplicable due the nonuniform sampling of observed links. This paper investigates the network completion problem and demonstrates that by effectively leveraging the side information about the nodes (such as the pairwise similarity), it is possible to predict the unobserved part of the network with high accuracy. To this end, we propose an efficient algorithm that decouples the completion from transduction stage to effectively exploit the similarity information. This crucial difference greatly boosts the performance where appropriate similarity information is used. The recovery error of the proposed algorithm is analyzed theoretically based on the richness of the similarity information and the size of the observed subnetwork. To the best of our knowledge, this is the first algorithm that addresses the network completion with similarity of nodes with provable guarantees. Through extensive experiments on four real-world datasets, we demonstrate that (1) leveraging side information in matrix completion by decoupling the completion from transduction significantly improves the link prediction performance, (2) proposed two-stage method can deal with the cold-start problem that arises when a new entity enters the network, and (3) our approach is scalable to large-scale networks.

Full Text