Discriminative Streaming Network Embedding

Yiyan Qi,Jiefeng Cheng,Xiaojun Chen,Reynold Cheng,Albert Bifet,Pinghui Wang

doi:10.1016/j.knosys.2019.105138

Abstract

Many real-world networks (e.g., friendship network among Facebook users) generate data (e.g., friend requests) in a stream fashion. Recently, several network embedding methods are proposed to learn embeddings on such networks incrementally. However, these methods perform incremental updates in a heuristic manner and thus fail to quantitatively restrict the differences between incremental learning and direct learning on the entire network (i.e., the batch learning). Moreover, they ignore the node labels (e.g., interests) when learning node embeddings, which undermines the performance of network embeddings for applications such as node classification. To solve this problem, in this paper we propose a novel network embedding framework, Discriminative Streaming Network Embedding (DimSim). When an edge insertion/deletion occurs, DimSim fast learns node embeddings incrementally, which is desired for many online applications such as anomaly detection. With the incremental learning method, at any time, the objective function well approximates to that of batch learning on the current snapshot. More importantly, the average amount of updating operations of DimSim for processing each newly coming edge is about O(n1−λ), where λ is constant between 1 and 2 and n is the number of nodes. As a result, DimSim can be increasingly fast as the network grows, which adheres to the intuition that an edge insertion/deletion has a smaller impact on larger networks. In addition, we also utilize node label information to learn discriminative node embeddings and we deliberately design a new model to automatically trade-off the node-specific importance of topology and label information. Extensive experiments on real-world streaming networks show that our method DimSim is up to 50 times faster and 40% more accurate than the state-of-the-arts.

Full Text