Abstract

Document-level relation extraction involves multiple entities and multiple mentions per entity, so existing sentence-level relation extraction models cannot meet its requirements. Existing graph-based document-level models usually construct nodes and edges manually, which often introduces human-made noise, while Transformer-based models cannot fully resolve difficulties such as coreference resolution through pre-training tasks or other designs. In this paper, we propose a new relation extraction model based on Relational Distance and Document-level Contrastive Pre-training (RDDCP), which achieves coreference resolution through simple and effective mention replacement. We also introduce the concept of relational distance to enable document-level contrastive pre-training, identifying the mention pairs most likely to express a relation among the many mention pairs present in a document-level dataset for contrastive learning. For the relational information in distant mentions that relational distance would otherwise ignore, we quantify the distances as weights and incorporate the weighted information into the entity embedding representations, so that each entity has a different embedding representation in different entity pairs. We conducted experiments on three popular datasets, and RDDCP outperforms GAIN, SSAN, and ATLOP as well as other baseline models in both performance and time complexity.
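To make the distance-weighting idea concrete, below is a minimal hypothetical sketch (not the authors' code) of how quantified mention distances could be turned into weights and folded into a pair-specific entity embedding; the function name `pair_entity_embedding`, the exponential decay, and the temperature `tau` are all assumptions for illustration.

```python
# Hypothetical sketch: distance-weighted pooling of mention embeddings into a
# pair-specific entity representation, assuming each mention carries a token
# offset and a d-dimensional embedding. Not the paper's actual formulation.
import numpy as np

def pair_entity_embedding(head_mentions, tail_mentions, tau=10.0):
    """Aggregate head-entity mentions, weighting each by its proximity to the
    nearest tail-entity mention (closer mentions receive larger weights).

    head_mentions / tail_mentions: lists of (token_offset, embedding) pairs.
    tau: assumed temperature controlling how fast weight decays with distance.
    """
    tail_offsets = np.array([pos for pos, _ in tail_mentions])
    embs, weights = [], []
    for pos, emb in head_mentions:
        dist = np.min(np.abs(tail_offsets - pos))  # proxy for relational distance
        weights.append(np.exp(-dist / tau))        # quantify distance as a weight
        embs.append(emb)
    weights = np.array(weights)
    weights /= weights.sum()                       # normalize to a distribution
    # Weighted sum over mentions yields an embedding specific to this entity pair.
    return np.einsum('m,md->d', weights, np.stack(embs))

# Toy usage: two head mentions at offsets 5 and 120, one tail mention at offset 8;
# the nearby mention at offset 5 dominates the resulting entity vector.
rng = np.random.default_rng(0)
head = [(5, rng.normal(size=8)), (120, rng.normal(size=8))]
tail = [(8, rng.normal(size=8))]
print(pair_entity_embedding(head, tail))
```

Because the weights depend on where the other entity's mentions sit, the same entity naturally receives a different embedding in each entity pair, matching the behavior described in the abstract.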
