Abstract

Although fine-tuning pre-trained models has achieved tremendous success in named entity recognition and relation extraction, realistic scenarios contain many triples with nested entities and overlapping relations. Existing works focus on the overlapping triple problem, in which multiple relational triples in the same sentence share the same entity. In this work, we introduce a joint entity-relation extraction framework based on hybrid feature representation. Our framework consists of five primary parts: hybrid feature representation construction, a bidirectional LSTM encoder, a head entity recognition module, an entity type classification module, and a relation tail entity recognition module. First, we fuse character-level and word-level vector representations via a max-pooling operation to enrich the text feature information. Second, the hybrid feature representation is fed into a bidirectional LSTM to capture the correlation between characters and entities. Third, the head entity recognition module employs two identical binary classifiers to detect the start and end positions of entities separately. Fourth, the entity type classification module applies softmax and filters out candidates classified as non-entity types. Finally, we treat relation tail entity recognition as a machine reading comprehension task to eliminate the problem of entity overlap. Specifically, we use the combination of a head entity and a relation as a query for retrieving possible tail entities from the text. This framework handles the polysemy problem, considerably enhances knowledge extraction efficiency, and accurately extracts overlapping triples in domain texts with complicated relationships.
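Three of the steps described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. It assumes that the max-pooling fusion is an element-wise maximum over character-level and word-level vectors of equal dimension, that start and end positions are paired by taking the nearest end at or after each start, and that a query is a simple string template. The function names, the 0.5 threshold, and the `[SEP]` template are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Vector = List[float]


def fuse_max(char_vecs: List[Vector], word_vecs: List[Vector]) -> List[Vector]:
    """Fuse two aligned sequences of equal-dimension vectors.

    Assumption: the fusion is an element-wise maximum per position.
    """
    return [
        [max(c, w) for c, w in zip(cv, wv)]
        for cv, wv in zip(char_vecs, word_vecs)
    ]


def decode_spans(
    start_probs: List[float],
    end_probs: List[float],
    threshold: float = 0.5,  # hypothetical cut-off for both classifiers
) -> List[Tuple[int, int]]:
    """Pair each predicted start with the nearest predicted end at or after it.

    Assumption: this nearest-end heuristic stands in for whatever pairing
    rule the framework actually uses.
    """
    spans = []
    for i, p_start in enumerate(start_probs):
        if p_start < threshold:
            continue
        for j in range(i, len(end_probs)):
            if end_probs[j] >= threshold:
                spans.append((i, j))
                break
    return spans


def build_query(head_entity: str, relation: str) -> str:
    """Combine a head entity and a relation into a reading-comprehension query.

    Assumption: the template is hypothetical; the abstract does not give one.
    """
    return f"{head_entity} [SEP] {relation}"


if __name__ == "__main__":
    fused = fuse_max([[0.1, 0.9]], [[0.5, 0.2]])
    spans = decode_spans([0.9, 0.1, 0.8, 0.1], [0.1, 0.7, 0.1, 0.9])
    query = build_query("Paris", "capital_of")
    print(fused, spans, query)
```

Because the start and end classifiers are decoded independently, several spans can be recovered from one sentence, and issuing one query per head entity and relation pair lets the same tail entity be returned for different queries, which is how a reading-comprehension formulation can accommodate overlapping triples.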
