Abstract
Many link prediction methods have been developed to infer unobserved links or predict latent links from the observed network structure. However, noise and irregular links in real networks usually limit the performance of existing methods. To account for random noise and irregular links, we propose a perturbation-based framework built on Non-negative Matrix Factorization (NMF) to predict missing links. We first automatically determine a suitable number of latent features, which is the inner rank in NMF, using the Colibri method. Then, we perturb the training set of a network with perturbation sets many times to obtain a series of perturbed networks. Finally, the common basis matrix and coefficient matrix of these perturbed networks are obtained via NMF and combined to form the similarity matrix of the network for link prediction. Experimental results on fifteen real networks show that the proposed framework performs competitively with state-of-the-art link prediction methods. Correlations between the performance of different methods and the statistics of the networks show that methods with good precision show similar consistency.
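The perturb-then-factorize pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the perturbation scheme (randomly deleting a fixed fraction of edges), the helper names `perturb` and `shared_nmf`, the fixed `rank=2` (standing in for the rank that Colibri would select), and the plain multiplicative-update solver are all hypothetical choices made for the example.

```python
import numpy as np

def perturb(adj, fraction, rng):
    """Return a copy of a symmetric 0/1 adjacency matrix with a random
    fraction of its edges removed (an assumed perturbation scheme)."""
    rows, cols = np.triu_indices_from(adj, k=1)
    is_edge = adj[rows, cols] > 0
    edge_rows, edge_cols = rows[is_edge], cols[is_edge]
    n_remove = int(round(fraction * edge_rows.size))
    drop = rng.choice(edge_rows.size, size=n_remove, replace=False)
    out = adj.copy()
    out[edge_rows[drop], edge_cols[drop]] = 0.0
    out[edge_cols[drop], edge_rows[drop]] = 0.0
    return out

def shared_nmf(mats, rank, n_iter=200, eps=1e-9, seed=0):
    """Fit one basis W and one coefficient matrix H to all matrices in
    `mats` by minimising sum_k ||A_k - W H||_F^2 with multiplicative
    updates (a generic NMF solver, assumed here for illustration)."""
    rng = np.random.default_rng(seed)
    n = mats[0].shape[0]
    k = len(mats)
    total = np.sum(mats, axis=0)          # sum over all perturbed networks
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        W *= (total @ H.T) / (k * (W @ (H @ H.T)) + eps)
        H *= (W.T @ total) / (k * ((W.T @ W) @ H) + eps)
    return W, H

# Toy training network: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

rng = np.random.default_rng(42)
perturbed = [perturb(adj, 0.1, rng) for _ in range(10)]

W, H = shared_nmf(perturbed, rank=2)      # rank fixed by hand, not by Colibri
scores = W @ H                            # similarity matrix for ranking node pairs
```

Node pairs without an observed edge would then be ranked by their entry in `scores`, with higher values treated as more likely missing links.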
Highlights
Many link prediction methods have been developed to infer unobserved links or predict latent links based on the observed network structure
Many real-world systems can be represented as complex networks, in which entities become nodes and interacting entities are connected by edges
The Common Neighbours (CN) index is defined as the number of common neighbours of two nodes in the network[9]; the Jaccard index is defined as the number of common neighbours of two nodes divided by the size of the union of their neighbour sets[10]; the Katz index is based on the ensemble of all paths between each node pair
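The three similarity indices named above can be computed directly from an adjacency matrix. The sketch below assumes a simple undirected graph stored as a dense NumPy array and uses the standard definitions: CN counts shared neighbours, Jaccard normalises that count by the size of the union of the two neighbour sets, and Katz sums over all paths with a damping factor `beta` per step. The function names, the toy graph, and the value of `beta` are illustrative assumptions rather than anything specified in the text.

```python
import numpy as np

def common_neighbours(adj):
    """Entry (x, y) is the number of neighbours shared by x and y."""
    return adj @ adj

def jaccard(adj):
    """Shared neighbours divided by the size of the union of neighbour sets."""
    cn = adj @ adj
    deg = adj.sum(axis=1)
    union = deg[:, None] + deg[None, :] - cn
    return np.divide(cn, union, out=np.zeros_like(cn), where=union > 0)

def katz(adj, beta):
    """Sum over path lengths l >= 1 of beta**l * adj**l, in closed form.
    Converges only if beta is below 1 / (largest eigenvalue of adj)."""
    identity = np.eye(adj.shape[0])
    return np.linalg.inv(identity - beta * adj) - identity

# Toy network: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

cn_scores = common_neighbours(adj)
jaccard_scores = jaccard(adj)
katz_scores = katz(adj, beta=0.1)   # 0.1 < 1/3, and the largest eigenvalue is at most 3
```

For the unconnected pair (0, 3), node 2 is the only shared neighbour, so CN gives 1 and Jaccard gives 1/4, since the union of the two neighbour sets is {1, 2, 4, 5}.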
Summary
Many link prediction methods have been developed to infer unobserved links or predict latent links from the observed network structure. Link prediction estimates the probability of a link between two nodes based on the network structure[3]. Determining whether a link exists between two nodes is a fundamental problem, and verifying it through laboratory experiments is usually too costly. Inferring the unobserved links from the observed ones with reasonable precision can greatly reduce these experimental costs. There are two main classes of link prediction methods: similarity-based algorithms and probabilistic models[8]. By constructing a generative model of complex networks, link prediction becomes a parameter-learning problem, and the probability of missing links can be predicted by the learned model[13].