Patent Entity and Relation Extraction (PERE) aims to extract entities and entity-relation triples from unstructured patent texts. As a fundamental task in patent text mining, it provides crucial technical support for patent retrieval and technology opportunity discovery. Previous work struggles to capture the implicit semantic information hidden in overlapping triples, which occur in large numbers in patent texts. This paper proposes PERE-CA, a Patent Entity and Relation Extraction model based on Context query and Axial attention. For entity recognition, each text segment is treated as a candidate entity span, and entity types are obtained by span classification. The semantic context related to each entity pair is then computed by a context query method and integrated into the entity pair representation. For relation extraction, axial attention is applied to capture the implicit semantic information among overlapping entity pairs, after which the model outputs all valid entity-relation triples. Experimental results on the patent dataset TFH-2020 and the public dataset SciERC show that context query and axial attention effectively improve extraction performance.
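The span-based first stage described above can be illustrated with a minimal sketch. It is not the PERE-CA implementation: the function names `enumerate_spans` and `candidate_pairs`, the `max_width` limit, and the example sentence are all assumptions made for illustration, and the span classifier, context query, and axial attention steps are omitted.

```python
from itertools import permutations
from typing import List, Tuple

# A span is (start, end) over token indices, with end exclusive.
Span = Tuple[int, int]


def enumerate_spans(tokens: List[str], max_width: int = 3) -> List[Span]:
    """Return every contiguous token span of at most max_width tokens.

    Hypothetical helper: in a span-based model, each of these spans would
    be scored by a classifier that assigns an entity type or "no entity".
    """
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_width, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def candidate_pairs(entity_spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return all ordered (head, tail) pairs of distinct entity spans.

    Hypothetical helper: each pair is a candidate for relation
    classification. Pairs that share a span are the overlapping cases.
    """
    return list(permutations(entity_spans, 2))


# Illustrative input, not taken from TFH-2020 or SciERC.
tokens = ["the", "rotor", "drives", "the", "pump"]
spans = enumerate_spans(tokens, max_width=2)

# Assume the classifier kept "rotor" and "pump" as entities.
pairs = candidate_pairs([(1, 2), (4, 5)])
```

Enumerating ordered pairs reflects that relations are generally directional, so (head, tail) and (tail, head) are distinct candidates.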