Abstract

In this paper, the Monte Carlo simulation method is used to investigate a generalized random walk model based on node2vec, a popular network embedding algorithm that has been widely applied in domains such as link prediction, node classification, and recommendation systems. The aim is to quantitatively study the impact of the random walk parameters (the number of walks per initial node r, the length of each walk l, the return parameter α, the common neighbor parameter β, and the outgoing parameter γ) on the embedding results. Specifically, the cross entropy is used as an observable to compare the frequency with which nodes are visited by the random walks against the normalized degree sequence of the nodes. The results show that the clustering coefficient significantly affects the cross entropy. For networks with a high clustering coefficient, the value of β should be close to that of γ, whereas for networks with a low clustering coefficient, the value of β should be significantly smaller than that of γ. The value of α should be less than or equal to the smaller of β and γ. Finally, the embedding quality obtained with different random walk parameters is tested on node classification and link prediction tasks in real-world networks, and the results indicate that the cross entropy can provide guidance for obtaining high-quality node embeddings.
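The comparison between walk visit frequencies and the normalized degree sequence can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: the unnormalized transition weight is taken to be α for stepping back to the previous node, β for stepping to a common neighbor of the previous and current nodes, and γ for any other neighbor; the cross entropy is taken as H(f, d) = -Σ f_i log d_i, with f the visit frequency and d the normalized degree; and the function names, the toy graph, and the parameter values are all hypothetical.

```python
import math
import random
from collections import Counter


def biased_walk(adj, start, length, alpha, beta, gamma, rng):
    """Return one second-order walk containing at most `length` nodes.

    Assumed weighting (not confirmed by the source): alpha for returning
    to the previous node, beta for a common neighbor of the previous and
    current nodes, gamma for every other neighbor. All three must be > 0.
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])
        if not nbrs:
            break  # dead end: isolated node
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))  # first step has no previous node
            continue
        prev = walk[-2]
        weights = [
            alpha if x == prev else beta if x in adj[prev] else gamma
            for x in nbrs
        ]
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


def visit_frequency(adj, r, l, alpha, beta, gamma, seed=0):
    """Run r walks of length l from every node; return normalized visit counts."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(r):
        for node in adj:
            counts.update(biased_walk(adj, node, l, alpha, beta, gamma, rng))
    total = sum(counts.values())
    return {v: counts[v] / total for v in adj}


def normalized_degree(adj):
    """Degree of each node divided by the sum of all degrees."""
    total = sum(len(nbrs) for nbrs in adj.values())
    return {v: len(adj[v]) / total for v in adj}


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i); terms with p_i = 0 contribute nothing."""
    return -sum(p[v] * math.log(q[v]) for v in p if p[v] > 0)


# Hypothetical toy graph: two triangles joined by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
deg = normalized_degree(adj)

# Compare two illustrative parameter settings (values chosen arbitrarily).
for alpha, beta, gamma in [(1.0, 1.0, 1.0), (0.25, 1.0, 4.0)]:
    freq = visit_frequency(adj, r=50, l=20, alpha=alpha, beta=beta, gamma=gamma)
    print((alpha, beta, gamma), round(cross_entropy(freq, deg), 4))
```

With α = β = γ the walk reduces to a simple random walk, whose stationary distribution on a connected undirected graph is proportional to node degree, so the cross entropy should approach the entropy of the degree distribution as r and l grow. Scanning α, β, and γ on graphs with different clustering coefficients would be one way to explore the trends the abstract reports.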
