Network embedding plays a pivotal role in network analysis owing to its ability to encode each node as a low-dimensional dense feature vector. However, most existing network embedding approaches focus only on preserving the structural information of the network and ignore the text features and category attributes of nodes, both of which are important to network analysis. In this paper, we propose a semi-supervised network embedding (SNE) model that integrates structural information, text features and category attributes into the embedding vectors simultaneously. Specifically, we design a structure preserving module and a text representation module to capture the global structural information and the text features, respectively. In addition, we propose a label indicator matrix and a supervised loss that preserve category information and map nodes of the same class closer together. We use stacked auto-encoders to capture the highly nonlinear characteristics of the network. The embedding vectors are learned by jointly optimizing the reconstruction loss and the designed supervised loss within the proposed semi-supervised model. Extensive experiments on real-world datasets demonstrate that our method outperforms state-of-the-art baselines on a variety of tasks, including visualization, node classification and clustering.
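
As a rough illustration of the joint objective described above, the following minimal sketch combines a reconstruction term with a supervised term that pulls same-class nodes together. It is not the formulation used in the paper: the function name `joint_loss`, the weight `alpha`, the binary form of the indicator matrix `S`, and the Laplacian-style pairwise distance penalty are all assumptions made for illustration, and the auto-encoder that would produce `X_hat` and `Z` is omitted.

```python
import numpy as np

def joint_loss(X, X_hat, Z, labels, alpha=1.0):
    """Hypothetical joint objective (illustrative assumption, not the paper's loss).

    X      : (n, d) input features of the n nodes
    X_hat  : (n, d) auto-encoder reconstruction of X
    Z      : (n, k) embedding vectors
    labels : (n,) integer class ids, with -1 marking unlabeled nodes
    alpha  : assumed weight of the supervised term
    """
    # Reconstruction loss: squared error between input and reconstruction.
    recon = np.sum((X - X_hat) ** 2)

    # Assumed label indicator matrix: S[i, j] = 1 when nodes i and j are
    # both labeled and share a class, 0 otherwise (diagonal excluded).
    labeled = labels >= 0
    same = (labels[:, None] == labels[None, :]) & labeled[:, None] & labeled[None, :]
    S = same.astype(float)
    np.fill_diagonal(S, 0.0)

    # Pairwise squared Euclidean distances between embeddings.
    sq = np.sum(Z ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)

    # Supervised loss: penalize distance between same-class embeddings.
    # The factor 0.5 compensates for counting each pair (i, j) twice.
    sup = 0.5 * np.sum(S * D)

    return recon + alpha * sup, recon, sup
```

Under these assumptions, unlabeled nodes contribute only to the reconstruction term, which is what makes the objective semi-supervised: labels shape the embedding space where they are available, while every node is still constrained by reconstruction.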