Abstract

Heterogeneous network representation learning has recently attracted considerable attention because of its potential applications. This paper focuses on improving the output of network representation learning by combining it with the topic similarity between nodes in a content-based heterogeneous information network (CHIN). The main challenge is the lack of topic similarity evaluation between text-based nodes, which limits the accuracy of similarity search as well as other network mining tasks. Moreover, the massive size of current real-world networks poses challenges for traditional standalone heterogeneous network analysis models. Unlike previous network representation learning models such as Node2Vec and Metapath2Vec, our proposed W-MethPath2Vec model uses a topic-driven meta-path-based random walk mechanism to generate the heterogeneous neighborhoods of nodes, which serve as the learning features. These node features are then used to train a model that can be applied to various heterogeneous network mining tasks, such as node similarity search, clustering, classification, and link prediction. The W-MethPath2Vec model thus captures structural and topic correlations between nodes in heterogeneous networks simultaneously. In addition, the model is implemented on an Apache Spark-based distributed framework, which allows it to handle large-scale networks. We compared W-MethPath2Vec against previous state-of-the-art approaches on real-world datasets to demonstrate the effectiveness of the proposed model.
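To make the idea of a topic-driven meta-path-based random walk more concrete, the following is a minimal single-machine sketch and not the authors' implementation. It assumes that every node already carries a topic vector (for example, from a topic model), that the walk follows a symmetric meta-path such as author-paper-author, and that the next node is sampled with probability proportional to the cosine similarity between topic vectors. The toy graph, the function names `cosine` and `topic_weighted_walk`, and the small smoothing constant are all hypothetical choices made for illustration; the abstract does not specify the exact weighting scheme.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topic_weighted_walk(adj, node_type, topics, start, meta_path, length, rng):
    """Generate one walk of up to `length` nodes starting at `start`.

    The walk cycles through the node types in `meta_path` (whose first and
    last types must match, e.g. ["A", "P", "A"]). Among neighbors of the
    required type, the next node is drawn with probability proportional to
    its topic similarity with the current node (assumed weighting).
    """
    cycle = meta_path[:-1]  # drop the repeated end type so the pattern can loop
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        next_type = cycle[len(walk) % len(cycle)]
        candidates = [n for n in adj[current] if node_type[n] == next_type]
        if not candidates:
            break  # dead end: no neighbor of the type the meta-path requires
        # Small constant keeps every candidate reachable when similarity is 0.
        weights = [cosine(topics[current], topics[n]) + 1e-6 for n in candidates]
        walk.append(rng.choices(candidates, weights=weights, k=1)[0])
    return walk


# Hypothetical toy network: three authors (A) and two papers (P).
adj = {
    "a1": ["p1", "p2"],
    "a2": ["p1"],
    "a3": ["p2"],
    "p1": ["a1", "a2"],
    "p2": ["a1", "a3"],
}
node_type = {"a1": "A", "a2": "A", "a3": "A", "p1": "P", "p2": "P"}
# Assumed two-dimensional topic distributions for every node.
topics = {
    "a1": [0.9, 0.1],
    "a2": [0.8, 0.2],
    "a3": [0.1, 0.9],
    "p1": [0.85, 0.15],
    "p2": [0.2, 0.8],
}

walk = topic_weighted_walk(
    adj, node_type, topics, "a1", ["A", "P", "A"], 7, random.Random(0)
)
print(walk)
```

Walks produced this way could then be passed as "sentences" to a skip-gram style embedding model, as Metapath2Vec-like approaches do. A distributed version would need to partition the graph and run the walks in parallel, which this sketch does not attempt.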
