Network Representation Learning Algorithm Based on Complete Subgraph Folding

Dongming Chen, Dongqi Wang, Jiarui Yan, Qianqian Gan, Mingshuo Nie

doi:10.3390/math10040581

Abstract

Network representation learning is a machine learning method that maps network topology and node information into a low-dimensional vector space. It reduces the temporal and spatial complexity of downstream network data mining tasks such as node classification and graph clustering. Existing algorithms commonly ignore the global topological information of the network, leading to information loss. A complete subgraph in a network commonly has a community structure, or is a component module of a community structure. We believe that community structure is the structure revealed by the topology of the network and that it preserves global information. In this paper, we propose SF-NRL, a network representation learning algorithm based on complete subgraph folding. The algorithm preserves the global topological information of the original network by finding complete subgraphs in the original network and folding them into super nodes. We employ a network representation learning algorithm to learn node embeddings on the folded network, and then merge the embeddings of the folded network with those of the original network to obtain the final node embeddings. Experiments on four real-world networks demonstrate the effectiveness of SF-NRL: the proposed algorithm outperforms the baselines in evaluation metrics on community detection and multi-label classification tasks. The proposed algorithm can effectively generalize the global information of the network and provides excellent classification performance.

Highlights

With the development of deep learning techniques and increasing requirements for graph data mining, the study of Network Representation Learning (NRL) has attracted greater attention from scholars.
We find that many real-world networks contain numerous complete subgraphs consisting of several nodes [5], and that some networks are even composed of complete subgraphs joined by a few common connecting nodes.
We assume that the communities of a network can be approximated by finding its complete subgraphs, and we employ this approximate community information as the global topology information of the network to improve the performance of NRL.

Summary

Introduction

With the development of deep learning techniques and increasing requirements for graph data mining, the study of Network Representation Learning (NRL) has attracted greater attention from scholars. Nodes within a complete subgraph have high first-order similarity: they are closely connected and naturally form a community structure or become part of a community in the network. Existing algorithms fail to completely cover the global topology information of the network, which leads to information loss. To solve this problem, we propose a Network Representation Learning Algorithm Based on Complete Subgraph Folding (SF-NRL). We first find the complete subgraphs in the network and fold the original network by treating the complete subgraphs as folding units and applying pre-defined folding rules. We employ this method to find global topological information on the network, and merge the representation of the original network with the representation of the coarsened network. (2) The algorithm enables the design of a graph-coarsening approach based on defined graph folding rules to obtain the global topological structure information of the network. (3) The experimental results on four network datasets show that the proposed algorithm can significantly improve the quality of node embeddings, and the effectiveness of the algorithm is demonstrated by community detection and multi-label classification tasks.
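The fold-then-merge idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: this section does not state the folding rules or the merge operator, so the sketch assumes a simple greedy rule (largest maximal cliques first, each node folded at most once, cliques of fewer than `min_size` free nodes left unfolded) and assumes that merging means concatenating a node's own embedding with the embedding of its super node. The function names `maximal_cliques`, `fold`, and `merge_embeddings` are hypothetical, and the embedding values in the example are placeholders rather than the output of any NRL algorithm.

```python
def maximal_cliques(adj):
    """Enumerate maximal complete subgraphs (basic Bron-Kerbosch recursion)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def fold(adj, min_size=3):
    """Fold cliques into super nodes (ASSUMED greedy rule, not the paper's).

    Returns (super_of, folded): a node -> super node map and the adjacency
    of the coarsened network.
    """
    cliques = sorted(maximal_cliques(adj), key=lambda c: (-len(c), sorted(c)))
    super_of, next_id = {}, 0
    for clique in cliques:
        # Any subset of a clique is still a clique, so the free nodes fold safely.
        free = [v for v in clique if v not in super_of]
        if len(free) >= min_size:
            for v in free:
                super_of[v] = next_id
            next_id += 1
    # Nodes outside every folded clique become singleton super nodes.
    for v in sorted(adj):
        if v not in super_of:
            super_of[v] = next_id
            next_id += 1
    folded = {s: set() for s in range(next_id)}
    for u in adj:
        for v in adj[u]:
            su, sv = super_of[u], super_of[v]
            if su != sv:
                folded[su].add(sv)
                folded[sv].add(su)
    return super_of, folded


def merge_embeddings(node_emb, super_emb, super_of):
    """Merge by concatenation (ASSUMED operator): node vector + super node vector."""
    return {v: list(node_emb[v]) + list(super_emb[super_of[v]]) for v in node_emb}


# Toy network: two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
super_of, folded = fold(adj)

# Placeholder embeddings standing in for the output of any NRL algorithm.
node_emb = {v: [float(v)] for v in adj}
super_emb = {0: [10.0], 1: [20.0]}
merged = merge_embeddings(node_emb, super_emb, super_of)
```

On this toy network each triangle collapses into one super node and the bridge edge becomes the single edge of the coarsened network, so every node's merged vector carries both its local position and the identity of the clique it belongs to.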

Related Work

Community Detection

Multi-Label Classification