Abstract

Similarity search on heterogeneous information networks has attracted wide attention from both industry and academia in recent years, with applications such as friend detection in social networks and collaborator recommendation in coauthor networks. The structural information of a heterogeneous information network can be captured by multiple meta paths, and meta paths are commonly used to design similarity search methods. However, the rich semantics of a heterogeneous information network lie not only in its structure; the content stored in nodes is also an important element. Existing methods have usually neglected the content similarity of nodes. Although some recent machine learning-based methods for similarity search consider both kinds of information, they use structure and content information separately. To address this issue and flexibly balance the influence of structure and content information during search, we propose a double-channel convolutional neural network model for top-k similarity search, which takes path instances as input and generates structure and content embeddings for nodes based on different meta paths. Moreover, we use two attention mechanisms to emphasize the differences among meta paths for each node and to combine the content and structure information of nodes into a comprehensive representation. Experimental results show that our search algorithm effectively supports top-k similarity search in heterogeneous information networks and achieves higher performance than existing approaches.
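
The idea of balancing structure and content information in top-k similarity search can be illustrated with a minimal sketch. This is not the proposed model: it assumes each node already has a structure embedding and a content embedding of equal length, and it replaces the learned attention weights with a single fixed scalar `alpha`. The names `fuse`, `cosine`, `top_k_similar`, and the toy node ids are hypothetical.

```python
import heapq
import math


def fuse(structure_vec, content_vec, alpha):
    """Blend two equal-length vectors; alpha weights the structure channel.

    Assumption: a fixed scalar stands in for a learned attention weight.
    """
    return [alpha * s + (1.0 - alpha) * c
            for s, c in zip(structure_vec, content_vec)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_similar(query, structure, content, k, alpha=0.5):
    """Return the k node ids most similar to `query`, best match first."""
    fused = {n: fuse(structure[n], content[n], alpha) for n in structure}
    scores = [(cosine(fused[query], vec), n)
              for n, vec in fused.items() if n != query]
    return [n for _, n in heapq.nlargest(k, scores)]


# Toy embeddings (made up for illustration only).
structure = {"a1": [1.0, 0.0], "a2": [0.9, 0.1], "a3": [0.0, 1.0]}
content = {"a1": [0.0, 1.0], "a2": [1.0, 0.0], "a3": [0.1, 0.9]}

structure_only = top_k_similar("a1", structure, content, k=1, alpha=1.0)
content_only = top_k_similar("a1", structure, content, k=1, alpha=0.0)
```

In this toy data, the nearest neighbour of `a1` changes depending on whether the structure channel or the content channel dominates, which is the kind of trade-off a learned attention weight would resolve per node.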
