A deep learning based method benefiting from characteristics of patents for semantic relation classification

Liang Chen, Jing Zhang, Shuo Xu, Haiyun Xu, Guancan Yang, Lijun Zhu

doi:10.1016/j.joi.2022.101312

Abstract

• The unique characteristics of patents are highlighted via comparison.
• A customized deep-learning model is proposed to leverage such characteristics.
• The connection between entity pairs is measured on the basis of association rules.
• The proposed model outperforms prior models on semantic relation classification.

Deep learning has become an important technique for semantic relation classification in patent texts. Previous studies simply transferred models developed for generic texts to patent texts, keeping the model structure unchanged. Because patent texts differ significantly from generic ones, the performance of these models drops dramatically on patent texts. To highlight the distinct characteristics of patent texts, seven annotated corpora from different fields are compared on several indicators of linguistic characteristics. A deep learning based method is then proposed to benefit from these characteristics. Our method exploits information from other similar entity pairs as well as information from the sentences mentioning a focal entity pair. The latter follows conventional practice, while the former is motivated by our observation that the stronger the connection between two entity pairs, the more likely they are to belong to the same relation type. To measure the connection between two entity pairs quantitatively, a similarity indicator based on association rules is proposed. Extensive experiments on the TFH-2020 and ChemProt corpora demonstrate that our method for semantic relation classification benefits from the characteristics of patent texts.

Full Text