Abstract
Large-scale network mining and analysis is key to revealing the underlying dynamics of networks, not easily observable before. Lately, there has been fast-growing interest in learning low-dimensional continuous representations of networks that can be utilized to perform highly accurate and scalable graph mining tasks. A family of these methods is based on performing random walks on a network to learn its structural features and providing the sequence of random walks as input to a deep learning architecture to learn a network embedding. While these methods perform well, they can only operate on static networks. In the real world, however, networks evolve, as nodes and edges are continuously added or deleted. As a result, any previously obtained network representation becomes outdated, with an adverse effect on the accuracy of the network mining task at stake. The naive approach to address this problem is to re-apply the embedding method of choice every time there is an update to the network. But this approach has serious drawbacks. First, it is inefficient, because the embedding method itself is computationally expensive. Second, the network mining outcomes obtained from successive network representations are not directly comparable to each other, due to the randomness of the new set of random walks generated each time. In this paper, we propose EvoNRL, a random-walk based method for learning representations of evolving networks. The key idea of our approach is to first obtain a set of random walks on the current state of the network. Then, as changes occur in the evolving network’s topology, we dynamically update the random walks in reserve, so they do not introduce any bias. That way we are in a position to utilize the updated set of random walks to continuously learn accurate mappings from the evolving network to a low-dimensional network representation. Moreover, we present an analytical method for determining the right time to obtain a new representation of the evolving network that balances accuracy and time performance. A thorough experimental evaluation demonstrates the effectiveness of our method against sensible baselines and under varying conditions.
Highlights
Network science, built on the mathematics of graph theory, leverages network structures to model and analyze pairwise relationships between objects (Newman 2003). With a growing number of networks (social, technological, biological) becoming available and representing an ever-increasing amount of information, the ability to effectively perform large-scale network mining and analysis is key to revealing the underlying dynamics of these networks, not observable before. (Source: Heidari and Papagelis, Applied Network Science (2020) 5:18.)
In the case of evolving networks, changes that occur in the network topology cannot be interpreted from the changes observed in the network embedding.
Removing nodes: As described in the “Evolving network representation learning” section, node deletion can be treated as a special case of edge deletion.
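The reduction of node deletion to edge deletion can be illustrated with a minimal sketch. It assumes an undirected graph stored as a dictionary of neighbor sets; the names `remove_edge` and `remove_node` are hypothetical and are not taken from the paper. The sketch covers only the graph-side bookkeeping, not the update of the stored random walks that each edge deletion would trigger.

```python
def remove_edge(adj, u, v):
    """Delete the undirected edge (u, v) from an adjacency dict of sets."""
    adj[u].discard(v)
    adj[v].discard(u)


def remove_node(adj, node):
    """Delete a node by deleting each of its incident edges, one at a time.

    Assumption: adj maps every node to the set of its neighbors.
    """
    # Copy the neighbor set first, because remove_edge mutates adj[node].
    for neighbor in list(adj[node]):
        remove_edge(adj, node, neighbor)
    # The node is now isolated, so it can be dropped from the graph.
    del adj[node]


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
remove_node(graph, 2)
```

After the call, node 2 no longer appears as a key or in any neighbor set, and node 3 is left isolated.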
Summary
Network science, built on the mathematics of graph theory, leverages network structures to model and analyze pairwise relationships between objects (or people) (Newman 2003). With a growing number of networks (social, technological, biological) becoming available and representing an ever-increasing amount of information, the ability to effectively perform large-scale network mining and analysis is key to revealing the underlying dynamics of these networks, not observable before. A family of network representation learning methods is based on performing random walks on a network. Random-walk based methods, inspired by word2vec’s skip-gram model of producing word embeddings (Mikolov et al 2013b), try to establish an analogy between a network and a document. While a document is an ordered sequence of words, a network can effectively be described by a set of random walks (i.e., ordered sequences of nodes). Typical examples of these algorithms include DeepWalk (Perozzi et al 2014) and node2vec (Grover and Leskovec 2016). In these algorithms, (i) a set of random walks, say walks, is collected by performing r random walks of length l starting at each node in the network (typical values are r = 10, l = 80).
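Step (i) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is an adjacency dictionary, walks are uniform (as in DeepWalk, without node2vec's biased transitions), "length l" is read as l nodes per walk, and the function name `collect_random_walks` is hypothetical.

```python
import random


def collect_random_walks(adj, r=10, l=80, seed=None):
    """Collect r uniform random walks of l nodes starting at each node.

    Assumption: adj maps every node to a list of its neighbors.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(r):
        # Visit the start nodes in a fresh random order on every pass.
        nodes = list(adj)
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < l:
                neighbors = adj[walk[-1]]
                if not neighbors:
                    # Dead end (isolated node): stop this walk early.
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Toy graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
walks = collect_random_walks(adj, r=10, l=80, seed=42)
```

Each walk is an ordered sequence of nodes, so the resulting list plays the role of the "sentences" that a skip-gram model consumes in the document analogy above.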