Link prediction in dynamic networks using random dot product graphs

Francesco Sanna Passino, Nicholas A. Heard, Anna S. Bertiger, Joshua C. Neil

doi:10.1007/s10618-021-00784-2

Open Access

Abstract

The problem of predicting links in large networks is an important task in a variety of practical applications, including social sciences, biology and computer security. In this paper, statistical techniques for link prediction based on the popular random dot product graph model are carefully presented, analysed and extended to dynamic settings. Motivated by a practical application in cyber-security, this paper demonstrates that random dot product graphs not only represent a powerful tool for inferring differences between multiple networks, but are also efficient for prediction purposes and for understanding the temporal evolution of the network. The probabilities of links are obtained by fusing information at two stages: spectral methods provide estimates of latent positions for each node, and time series models are used to capture temporal dynamics. In this way, traditional link prediction methods, usually based on decompositions of the entire network adjacency matrix, are extended using temporal information. The methods presented in this article are applied to a number of simulated and real-world graphs, showing promising results.
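The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and all function names are hypothetical. It assumes undirected snapshots sharing a common node set, uses a truncated eigendecomposition of each adjacency matrix as the spectral step, and substitutes a simple exponentially weighted moving average for the time series models. Because latent positions are only identified up to an orthogonal transformation, the averaging is applied to the estimated probability matrices rather than to the embeddings themselves.

```python
import numpy as np


def spectral_link_probabilities(A, d):
    """Estimate link probabilities from one symmetric adjacency matrix.

    Spectral step: keep the d largest eigenvalues of A, form latent
    positions X = U * sqrt(lambda), and return X X^T clipped to [0, 1].
    """
    vals, vecs = np.linalg.eigh(A)           # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:d]         # indices of the d largest eigenvalues
    lam = np.clip(vals[top], 0.0, None)      # guard against negative eigenvalues
    X = vecs[:, top] * np.sqrt(lam)          # estimated latent positions (n x d)
    return np.clip(X @ X.T, 0.0, 1.0)


def forecast_link_probabilities(snapshots, d, alpha=0.5):
    """Combine per-snapshot estimates with an exponentially weighted average.

    Assumption: the EWMA is a stand-in for a time series model;
    alpha weights the most recent snapshot.
    """
    forecast = None
    for A in snapshots:
        P_hat = spectral_link_probabilities(np.asarray(A, dtype=float), d)
        if forecast is None:
            forecast = P_hat
        else:
            forecast = alpha * P_hat + (1.0 - alpha) * forecast
    return forecast


def simulate_rdpg(X, rng):
    """Draw one undirected graph with independent edges, P(A_ij = 1) = x_i . x_j."""
    P = X @ X.T
    n = P.shape[0]
    upper = np.triu(rng.random((n, n)) < P, k=1)   # sample the upper triangle only
    return (upper | upper.T).astype(float)         # symmetrise, no self-loops


# Toy example (assumed setup): two groups of nodes with fixed latent positions.
rng = np.random.default_rng(0)
n = 200
X_true = np.vstack([
    np.tile([0.8, 0.1], (n // 2, 1)),
    np.tile([0.1, 0.8], (n // 2, 1)),
])
snapshots = [simulate_rdpg(X_true, rng) for _ in range(10)]
scores = forecast_link_probabilities(snapshots, d=2)
print("mean within-group score:", scores[: n // 2, : n // 2].mean())
print("mean between-group score:", scores[: n // 2, n // 2 :].mean())
```

In this toy setting the latent positions do not change over time, so the moving average mainly reduces noise. With genuinely evolving networks, the choice of temporal model and of the smoothing weight `alpha` would need to be validated on held-out snapshots.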

Highlights

Link prediction is defined as the task of predicting the presence of an edge between two nodes in a network, based on latent characteristics of the graph (Liben-Nowell and Kleinberg 2007).
The proposed methods were tested on synthetic data and on real-world dynamic networks from different application domains: transportation systems, cyber-security, and co-authorship networks.
Link prediction techniques based on random dot product graphs have been presented, discussed and compared.

Summary

Introduction

Link prediction is defined as the task of predicting the presence of an edge between two nodes in a network, based on latent characteristics of the graph (Liben-Nowell and Kleinberg 2007). The discussion of link prediction here is motivated by applications in cyber-security and computer network monitoring (Jeske et al. 2018). The ability to correctly predict connections in a network, and to associate anomaly scores with them, is valuable for the cyber-defence of enterprises. Adversaries may introduce changes in the structure of an enterprise network in the course of their attack. Predicting links in order to identify significant deviations from expected behaviour could lead to the detection of an otherwise extremely damaging network breach. It is necessary to correctly score new links.

Objectives

Results

Discussion

Conclusion