Heterogeneous information network (HIN) embedding aims to map heterogeneous nodes into a low-dimensional vector space. Existing embedding models cannot automatically determine the optimal semantic length, nor can they adaptively reveal the full semantic information of different heterogeneous networks. To address this challenge, we propose an HIN embedding model with adaptive semantic mining. First, we project heterogeneous nodes into the same space and aggregate the features of the target types within the first-order neighborhood. Then, the semantics of different node types are combined through an attention mechanism, and latent meta-paths are mined using the attention coefficients. Finally, multiple feature aggregation layers are stacked with residual blocks. The residual weights control the proportion of semantics transferred between layers, so that more distant features are aggregated selectively. In addition, we design the Selected DropLink unit to remove links that transfer negative information, which further improves the model's resistance to over-smoothing. Experiments show that our model obtains more accurate embedding results and automatically mines complex semantic connections between heterogeneous nodes without a prior definition of meta-paths or semantic depth.
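The pipeline described above (type-specific projection, per-type first-order aggregation, attention over node types, and residual stacking) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The toy graph, the random projection matrices, the dot-product attention score, the convex-combination form of the residual weight, and all names (`project`, `layer`, `attn_vec`, `residual_weight`) are assumptions made for illustration, and the Selected DropLink unit is omitted.

```python
import math
import random

def matvec(W, x):
    # Multiply matrix W (list of rows) by vector x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mean(vectors):
    # Element-wise mean of a non-empty list of equal-length vectors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def project(raw, node_type, W):
    # Map each node's raw features into the shared space using its type's matrix.
    return {v: matvec(W[node_type[v]], x) for v, x in raw.items()}

def layer(h, node_type, neighbors, attn_vec, residual_weight):
    # One aggregation layer (illustrative form, not taken from the paper).
    new_h, attention = {}, {}
    for v, hv in h.items():
        types = sorted({node_type[u] for u in neighbors[v]})
        if not types:  # isolated node: keep its current embedding
            new_h[v], attention[v] = list(hv), {}
            continue
        # Mean of one-hop neighbor features, computed separately per node type.
        per_type = [mean([h[u] for u in neighbors[v] if node_type[u] == t])
                    for t in types]
        # Attention over node types (assumed dot-product scoring).
        alpha = softmax([sum(a * z for a, z in zip(attn_vec, vec))
                         for vec in per_type])
        agg = [sum(a * vec[i] for a, vec in zip(alpha, per_type))
               for i in range(len(hv))]
        # Residual weight sets how much of the previous layer is carried over.
        new_h[v] = [residual_weight * x + (1.0 - residual_weight) * y
                    for x, y in zip(hv, agg)]
        attention[v] = dict(zip(types, alpha))
    return new_h, attention

# Hypothetical toy HIN: authors (3-d), papers (2-d), venue (1-d raw features).
node_type = {"a1": "author", "a2": "author", "p1": "paper", "p2": "paper", "v1": "venue"}
neighbors = {"a1": ["p1"], "a2": ["p1", "p2"], "p1": ["a1", "a2", "v1"],
             "p2": ["a2", "v1"], "v1": ["p1", "p2"]}
raw = {"a1": [1.0, 0.0, 0.5], "a2": [0.0, 1.0, 0.5], "p1": [0.3, 0.7],
       "p2": [0.9, 0.1], "v1": [1.0]}

rng = random.Random(0)
dim = 2
in_dim = {"author": 3, "paper": 2, "venue": 1}
W = {t: [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(dim)]
     for t, d in in_dim.items()}
attn_vec = [rng.uniform(-1, 1) for _ in range(dim)]

h0 = project(raw, node_type, W)
h1, _ = layer(h0, node_type, neighbors, attn_vec, residual_weight=0.5)
emb, attn = layer(h1, node_type, neighbors, attn_vec, residual_weight=0.5)
```

In this sketch, `attn["p1"]` holds the weights that paper `p1` assigns to its author and venue neighbors in the second layer. Reading such coefficients layer by layer is one way to interpret how latent meta-paths might be recovered from attention weights.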