Abstract

In this paper, we propose a system infrastructure that organizes big scholarly data as a large knowledge graph, discovers the meta paths between entities, and calculates the relevancy between entities in the graph. The core infrastructure is built on the secure, private Amazon Elastic Compute Cloud (Amazon EC2) platform. It distributes the data evenly across repositories, processes the data in parallel using the open-source Spark framework, manages computing resources optimally using YARN and Hadoop HDFS, and discovers relationships between different types of entities in a distributed manner. On top of this infrastructure, we implement four relationship discovery tasks: citation recommendation, potential collaborator discovery, similar venue measurement, and paper-to-venue recommendation. For these relationship mining tasks, we propose a mixed and weighted meta path (MWMP) method to explore potential relationships between different types of entities. To verify the accuracy of our algorithm and measure its parallelization speedup, we set up clusters on the Amazon EC2 platform.
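
To make the idea of combining several meta paths with weights more concrete, the following is a minimal single-machine sketch in Python. It is not the MWMP method itself, whose exact formulation is not given in this abstract. The toy graph, the relation names (`writes`, `published_in`), the two meta paths, and the weights are all assumptions chosen for illustration. The score of a candidate entity is taken here to be a weighted sum of path-instance counts along each meta path.

```python
from collections import defaultdict

# Toy heterogeneous graph as (source, relation, target) triples.
# Authors a*, papers p*, venues v*. All data here is made up.
EDGES = [
    ("a1", "writes", "p1"),
    ("a2", "writes", "p1"),
    ("a2", "writes", "p2"),
    ("a3", "writes", "p3"),
    ("p1", "published_in", "v1"),
    ("p2", "published_in", "v1"),
    ("p3", "published_in", "v1"),
]


def build_index(edges):
    """Index neighbours by relation; '~rel' denotes the reverse direction."""
    index = defaultdict(lambda: defaultdict(set))
    for src, rel, dst in edges:
        index[rel][src].add(dst)
        index["~" + rel][dst].add(src)
    return index


def count_paths(index, start, meta_path):
    """Count path instances from `start` to each end node along `meta_path`."""
    counts = {start: 1}
    for rel in meta_path:
        nxt = defaultdict(int)
        for node, n in counts.items():
            for neighbour in index[rel].get(node, ()):
                nxt[neighbour] += n
        counts = dict(nxt)
    return counts


def weighted_relevancy(index, start, weighted_paths):
    """Weighted sum of path counts over several meta paths (assumed scoring)."""
    scores = defaultdict(float)
    for meta_path, weight in weighted_paths:
        for node, n in count_paths(index, start, meta_path).items():
            if node != start:  # do not recommend the start entity to itself
                scores[node] += weight * n
    return dict(scores)


# Hypothetical meta paths for potential collaborator discovery.
APA = ["writes", "~writes"]  # co-authored a paper
APVPA = ["writes", "published_in", "~published_in", "~writes"]  # same venue

index = build_index(EDGES)
scores = weighted_relevancy(index, "a1", [(APA, 0.5), (APVPA, 0.25)])
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this sketch the weights are fixed by hand. In a distributed setting such as the Spark-based infrastructure described above, each hop of `count_paths` would correspond to a join over partitioned edge data, although how the authors implement this is not stated here.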
