Abstract

As the number of patents grows, patent citation recommendation becomes increasingly important. The core of a patent is its textual content, which describes the technical ideas and the scope of protection, so most existing patent citation recommendation methods focus on text. Recently, some scholars have introduced bibliographic (structural) data to construct a heterogeneous information network (HIN) on top of the textual content to further improve recommendation quality. Most of these methods build the HIN directly from candidate patents selected in advance according to text similarity, which simplifies the problem. However, they often ignore the latent semantic relationships between patents formed by the textual content, which can be used to mine more semantic and structural relationships in the HIN. In this paper, to capture deep semantic and structural information, we introduce Semantic based Heterogeneous Information Network Embedding (SHINE), which obtains the latent semantic relationships between patents (referred to as semantic links in this paper) from textual content similarity and topic similarity, and links them with the network structure to build a novel HIN. First, to obtain semantic links, we consider both the textual content similarity and the topic similarity between patents, and combine the two by linear fusion. Second, we build a HIN from the semantic links and bibliographic information to integrate semantic and structural information, and use network embedding to map both kinds of information into a common vector space. Finally, we recommend patent citations by a linear combination of multi-modal similarities. Experimental results on two U.S. patent datasets show the effectiveness of SHINE: compared with state-of-the-art patent citation recommendation methods, SHINE achieves significantly better results in terms of Average Precision (AP), Area Under Curve (AUC), and Recall.
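
The linear fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fusion weight `alpha`, and the use of a fixed `threshold` to turn fused similarities into semantic links are all assumptions made for illustration, since the abstract does not state how links are derived from the fused score.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # avoid division by zero for all-zero rows
    unit = vectors / norms
    return unit @ unit.T


def fuse_similarities(text_sim, topic_sim, alpha=0.5):
    """Linear fusion: alpha * text_sim + (1 - alpha) * topic_sim.

    `alpha` is a hypothetical weight; the abstract does not give its value.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    text_sim = np.asarray(text_sim, dtype=float)
    topic_sim = np.asarray(topic_sim, dtype=float)
    return alpha * text_sim + (1.0 - alpha) * topic_sim


def semantic_links(fused_sim, threshold=0.8):
    """Return index pairs (i, j), i < j, whose fused similarity meets the threshold.

    Thresholding is an assumed rule for illustration only.
    """
    fused_sim = np.asarray(fused_sim, dtype=float)
    n = fused_sim.shape[0]
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if fused_sim[i, j] >= threshold
    ]
```

In this sketch, the text vectors might be TF-IDF rows and the topic vectors might be per-patent topic distributions; the resulting pairs would then be added as extra edges alongside the bibliographic relations when the HIN is built.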
