Abstract
Network embedding, which encodes the structural information of graphs into a low-dimensional vector space, has proven essential for network analysis applications, including node classification and community detection. Although recent methods show promising performance for various applications, graph embedding still faces challenges: either the huge size of a graph hinders the direct application of existing network embedding methods, or the methods suffer compromises in accuracy from locality and noise. In this paper, we propose a novel Network Embedding method, NECL, to generate embeddings more efficiently or effectively. Our goal is to answer the following two questions: 1) Does the network Compression significantly boost Learning? 2) Does network compression improve the quality of the representation? To this end, first, we propose a novel graph compression method based on neighborhood similarity that compresses the input graph to a smaller graph by incorporating the local proximity of its vertices into super-nodes; second, we employ the compressed graph for network embedding instead of the original large graph, both to bring down the embedding cost and to capture the global structure of the original graph; third, we refine the embeddings from the compressed graph to the original graph. NECL is a general meta-strategy that improves the efficiency and effectiveness of many state-of-the-art graph embedding algorithms based on node proximity, including DeepWalk, Node2vec, and LINE. Extensive experiments validate the efficiency and effectiveness of our method, which decreases embedding time and improves classification accuracy as evaluated on single and multi-label classification tasks with large real-world graphs.
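The compress-then-project idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which similarity measure NECL uses, so the Jaccard similarity of closed neighborhoods, the `threshold` parameter, and the function names `jaccard`, `compress`, and `project` are all assumptions made for illustration. The embedding step on the compressed graph (for example DeepWalk) is left out, and `project` only copies each super-node vector to its members as a starting point for refinement.

```python
def jaccard(adj, u, v):
    """Jaccard similarity of the closed neighborhoods of u and v (assumed measure)."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def compress(adj, threshold=0.5):
    """Merge adjacent nodes with similar neighborhoods into super-nodes.

    adj maps each node to the set of its neighbors (undirected graph).
    Returns (super_adj, node_to_super). Merging is transitive because it
    uses union-find, which is a simplification chosen for this sketch.
    """
    parent = {u: u for u in adj}

    def find(u):
        # Find the representative of u, shortening the path as we go.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u in adj:
        for v in adj[u]:
            if u < v and jaccard(adj, u, v) >= threshold:
                parent[find(v)] = find(u)

    node_to_super = {u: find(u) for u in adj}

    # Build the compressed graph: keep only edges between different super-nodes.
    super_adj = {s: set() for s in set(node_to_super.values())}
    for u in adj:
        for v in adj[u]:
            su, sv = node_to_super[u], node_to_super[v]
            if su != sv:
                super_adj[su].add(sv)
                super_adj[sv].add(su)
    return super_adj, node_to_super


def project(super_emb, node_to_super):
    """Give every original node a copy of its super-node vector (initialization only)."""
    return {u: list(super_emb[s]) for u, s in node_to_super.items()}


# Hypothetical example: two triangles joined by a single bridge edge (2, 3).
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
super_adj, node_to_super = compress(adj, threshold=0.7)
```

On this toy graph each triangle collapses into one super-node, so any proximity-based embedding method would run on two nodes instead of six. The vectors it produces can then be passed through `project` and refined on the original graph.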
Highlights
Networks are effectively used to represent relationships and dependence among data
We present an extension of our first method, NECL, that is a general meta-strategy for network embedding
Previous researchers consider graph embedding a form of dimensionality reduction (Chen et al, 2018a), using methods such as PCA (Wold et al, 1987), which captures linear structural information, and LE (Roweis and Saul, 2000), which preserves the global structure of non-linear manifolds
Summary
Networks are effectively used to represent relationships and dependence among data. Node classification, community detection, and link prediction are some of the applications of network analysis in many different areas such as social networks and biological networks. Recent approaches in graph representation learning focus on scalable methods that use matrix factorization (Qiu et al, 2018; Sun et al, 2019) or neural networks (Tang et al, 2015; Cao et al, 2016; Tsitsulin et al, 2018; Ying et al, 2018). Many of these aim to preserve first- and second-order proximity as a local neighborhood, sampling paths with short random walks as in DeepWalk and Node2vec (Hamilton et al, 2017; Cai et al, 2018; Cui et al, 2018; Goyal and Ferrara, 2018). Some studies apply network embedding to node and graph classification (Perozzi et al, 2014; Niepert et al, 2016; Chen et al, 2018b), while others apply it to graph clustering (Cao et al, 2015; Akbas and Zhao, 2019; Akbas and Zhao, 2017).
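To make the path sampling mentioned above concrete, here is a minimal sketch of uniform short random walks in the style of DeepWalk. The function name `random_walks` and the parameters `num_walks`, `walk_length`, and `seed` are illustrative choices, not taken from the paper. Node2vec differs by biasing each step with its return and in-out parameters, which this sketch does not implement.

```python
import random


def random_walks(adj, num_walks=10, walk_length=5, seed=0):
    """Sample uniform random walks of at most walk_length nodes.

    adj maps each node to a list of its neighbors. Every node starts
    num_walks walks; a walk stops early if it reaches a node with no neighbors.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical toy graph: a path 0 - 1 - 2 - 3.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
walks = random_walks(graph, num_walks=3, walk_length=4)
```

Each walk is then treated like a sentence of node identifiers and passed to a skip-gram model, so nodes that co-occur within a short window end up with similar vectors.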