Abstract

Understanding the interconnected relationships of large-scale information networks such as social, scholarly, and Internet of Things networks is vital for tasks like recommendation and fraud detection. The vast majority of real-world networks are inherently heterogeneous and dynamic: they contain many different types of nodes and edges and can change drastically over time. This dynamicity and heterogeneity make it extremely challenging to reason about the network structure. Unfortunately, existing approaches are inadequate for modeling real-life dynamic networks, as they either make strong assumptions about a given stochastic process or fail to capture the heterogeneity of the network structure, and they all require extensive computational resources. We introduce Lime, a better approach for modeling dynamic and heterogeneous information networks. Lime is designed to extract high-quality network representations with significantly lower memory usage and computation time than the state of the art. Unlike prior work that uses a distinct vector to encode each network node, we exploit the semantic relationships among network nodes to encode multiple nodes with similar semantics in shared vectors. By using far fewer node vectors, our approach significantly reduces the memory required to encode large-scale networks. To effectively trade information sharing for a reduced memory footprint, we employ a recursive neural network (RsNN) with carefully designed optimization strategies to explore node semantics in a novel cuboid space. We then go further by showing, for the first time, how an effective incremental learning approach can be developed, with the help of the RsNN, our cuboid structure, and a set of novel optimization techniques, to allow the learning framework to quickly and efficiently adapt to a constantly evolving network. We evaluate Lime by applying it to three representative network-based tasks, node classification, node clustering, and anomaly detection, on three large-scale datasets. We compare Lime against eleven prior state-of-the-art approaches for learning network representations. Our extensive experiments demonstrate that Lime not only reduces the memory footprint by over 80 percent and the processing time by over 2x when learning network representations, but also delivers comparable performance on downstream processing tasks. We also show that our incremental learning method can speed up representation learning by up to 20x without compromising the quality of the learned network representation.
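
To make the memory argument concrete, the following is a minimal sketch of the shared-vector idea described above, assuming a hypothetical assignment of nodes to shared semantic vectors. The names (`shared_table`, `node_to_vector`, `embed`), the sizes, and the random assignment are illustrative only, not Lime's actual implementation, which derives the assignment from node semantics via the RsNN and cuboid space.

```python
import numpy as np

# Illustrative sketch: instead of one embedding vector per node, nodes with
# similar semantics share a vector, so only a small table must be stored.

num_nodes = 1_000_000      # nodes in the network (assumed for illustration)
num_shared = 50_000        # shared semantic vectors, far fewer than nodes
dim = 128                  # embedding dimensionality

# Small table of shared vectors plus one integer index per node.
shared_table = np.random.randn(num_shared, dim).astype(np.float32)
# Hypothetical semantic assignment; Lime would compute this, not sample it.
node_to_vector = np.random.randint(0, num_shared, size=num_nodes, dtype=np.int32)

def embed(node_id: int) -> np.ndarray:
    """Look up a node's representation through its shared semantic vector."""
    return shared_table[node_to_vector[node_id]]

# Memory comparison: dense per-node table vs. shared table plus index.
dense_bytes = num_nodes * dim * 4                     # float32 per node
shared_bytes = num_shared * dim * 4 + num_nodes * 4   # table + int32 index
print(f"dense: {dense_bytes / 1e6:.0f} MB, shared: {shared_bytes / 1e6:.0f} MB")
# -> dense: 512 MB, shared: 30 MB (over 90% smaller under these assumptions)
```

Under these assumed sizes, the shared scheme cuts the embedding storage by more than an order of magnitude, which is consistent in spirit with the over-80-percent memory reduction the abstract reports.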


