Abstract

Link prediction in complex networks aims to estimate the likelihood that two nodes will interact with each other in the future. Because this problem has applications in a large number of real systems, many link prediction methods have been proposed. However, these methods have so far been validated mainly on networks that are assumed to be noise-free. We therefore still lack a clear understanding of how the prediction results are affected when the observed network data are no longer accurate. In this paper, we comprehensively study the robustness of existing link prediction algorithms in real networks where some links are missing, fake, or swapped with other links. We find that missing links are more destructive to prediction accuracy than fake and swapped links. We propose an index to quantify the robustness of link prediction methods. Among the twenty-two link prediction methods studied, we find that some methods, despite their low prediction accuracy, tend to perform reliably in the “noisy” environment.
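The three kinds of noise named in the abstract can be illustrated with a minimal sketch. The function names, the `fraction` parameter, and the reading of "swapped" as "remove a real link and add a non-existent one in its place" are assumptions made for illustration; the paper's exact perturbation procedure is not given in this text. Only uniformly random noise is shown.

```python
import random
from itertools import combinations

# Edges are stored as tuples (u, v) with u < v so that each undirected
# link has exactly one representation.

def remove_links(edges, fraction, rng):
    """Missing links: drop a random fraction of the observed links."""
    edges = set(edges)
    k = round(fraction * len(edges))
    removed = set(rng.sample(sorted(edges), k))
    return edges - removed

def add_fake_links(edges, nodes, fraction, rng):
    """Fake links: add random non-existent links, fraction * |E| of them."""
    edges = set(edges)
    non_edges = [e for e in combinations(sorted(nodes), 2) if e not in edges]
    k = min(round(fraction * len(edges)), len(non_edges))
    return edges | set(rng.sample(non_edges, k))

def swap_links(edges, nodes, fraction, rng):
    """Swapped links (assumed meaning): replace a random fraction of real
    links with the same number of non-existent links, keeping |E| fixed."""
    edges = set(edges)
    non_edges = [e for e in combinations(sorted(nodes), 2) if e not in edges]
    k = min(round(fraction * len(edges)), len(non_edges))
    removed = set(rng.sample(sorted(edges), k))
    added = set(rng.sample(non_edges, k))
    return (edges - removed) | added
```

Passing an explicit `random.Random` instance as `rng` keeps each perturbation reproducible, which matters when accuracy is averaged over many noisy realizations.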

Highlights

  • The increasing availability of data has greatly deepened our understanding of many real systems[1,2,3,4,5,6] and has helped us make predictions[4]

  • The results in the main paper are based on three representative link prediction methods: Common Neighbors (CN)[30], Jaccard[30] and Resource Allocation (RA)[31]

  • Jaccard can reduce the bias of CN toward large-degree nodes, and RA is one of the best-performing link prediction algorithms in terms of accuracy
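The three indices named in the highlights have standard definitions: CN counts the shared neighbors of two nodes, Jaccard divides that count by the size of the union of the two neighborhoods, and RA sums the inverse degree of each shared neighbor. The sketch below implements those textbook forms; the helper names and the adjacency-dict representation are choices made here for illustration, not taken from the paper.

```python
def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbors(adj, x, y):
    """CN: number of neighbors shared by x and y."""
    return len(adj.get(x, set()) & adj.get(y, set()))

def jaccard(adj, x, y):
    """Jaccard: shared neighbors divided by the union of both neighborhoods."""
    nx, ny = adj.get(x, set()), adj.get(y, set())
    union = nx | ny
    return len(nx & ny) / len(union) if union else 0.0

def resource_allocation(adj, x, y):
    """RA: sum of 1/degree over the shared neighbors of x and y."""
    shared = adj.get(x, set()) & adj.get(y, set())
    return sum(1.0 / len(adj[z]) for z in shared)
```

The division by the union size is what lets Jaccard discount pairs whose overlap is large only because both nodes have many neighbors, while RA penalizes shared neighbors that are themselves hubs.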


Summary

Introduction

The increasing availability of data has greatly deepened our understanding of many real systems[1,2,3,4,5,6] and has helped us make predictions[4]. We investigate the robustness of existing link prediction algorithms in real networks where some links are missing, fake, or swapped with other links. Both random noise and biased noise in the observed link data are considered. To quantify and compare the robustness of different link prediction algorithms, we propose an index that computes the area under the curve of prediction accuracy versus the fraction of noisy data. Using this robustness index, we find that some methods, despite their low prediction accuracy, tend to perform reliably in the “noisy” environment. This finding may inspire the design of new link prediction methods that perform well in both accuracy and robustness.
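As a rough illustration of an area-based robustness index, the sketch below integrates accuracy over the noise fraction with the trapezoidal rule. The text states only that the index is the area under the accuracy curve across noise fractions; the function name, the trapezoidal rule, and the optional normalization by the noise-free accuracy are assumptions made here, not the paper's definition.

```python
def robustness_area(fractions, accuracies, normalize=False):
    """Area under the accuracy-vs-noise-fraction curve (trapezoidal rule).

    fractions  : increasing noise fractions, e.g. [0.0, 0.1, 0.2, ...]
    accuracies : prediction accuracy measured at each fraction
    normalize  : hypothetical option that divides each accuracy by the
                 first (noise-free) accuracy, so the index reflects the
                 relative decay rather than the absolute level
    """
    if len(fractions) != len(accuracies) or len(fractions) < 2:
        raise ValueError("need at least two matching (fraction, accuracy) points")
    if normalize:
        base = accuracies[0]
        accuracies = [a / base for a in accuracies]
    area = 0.0
    for i in range(1, len(fractions)):
        width = fractions[i] - fractions[i - 1]
        area += width * (accuracies[i] + accuracies[i - 1]) / 2.0
    return area
```

With the hypothetical normalization switched on, a method whose accuracy is low but nearly flat as noise grows scores higher than a method that starts high and decays quickly, which is the kind of contrast the paragraph above describes.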

