Abstract

Dependency trees encode rich structural information that can effectively guide models to understand text semantics, and they are widely used for relation extraction. However, existing dependency-based models suffer from noise in dependency trees, which distorts the modeling of contextual information; introducing dependency types further exacerbates the propagation and accumulation of such errors. In addition, rule-based pruning strategies used to eliminate noise in dependency trees may discard information that is crucial for capturing relational patterns, whereas unconstrained learning-based pruning strategies may introduce more noisy edges. Both issues can cause models to produce semantic confusion, which is detrimental to relation extraction. In this study, we propose a Contextual Dependency-Aware Graph Convolutional Network (C-DAGCN) that operates on complete dependency trees and perceives the importance of word dependencies from multiple angles, without requiring noisy edges to be pruned. The approach comprises three parts. First, a sequence model captures semantic information for the input sentence and its words; based on the model's characteristics, we take the appropriate outputs as word-level and sentence-level representations. Second, a structure model enhances the description of the sentence structure and the guidance it provides. Because position knowledge implies associations between words, we propose Current Word Global Relative Position Knowledge (CW-GRPK) to globally enhance text structure modeling and reduce the model's overreliance on a single, noisy source of structural guidance. A dependency-aware module weighs heuristic syntactic knowledge against non-heuristic CW-GRPK to capture fine-grained word-level interactions that effectively distinguish words related to relational semantics. A dependency-guided module then encodes a knowledge-enhanced word graph, based on the output of the dependency-aware layer, without losing global linguistic patterns. Third, we construct a high-level linguistic representation from the outputs of the sequence and structure models in C-DAGCN and map it to the relational decision space for relation prediction. Experimental results demonstrate that our method outperforms nine strong baseline models on two benchmark datasets. Compared with the best baseline, the Attentive Graph Convolutional Network (A-GCN) model (Tian et al., 2021), C-DAGCN achieves 90.15 macro-average F1 on SemEval2010-Task8 and 79.52 micro-average F1 on ACE2005, improvements of 0.28 and 0.47 F1 points, respectively. Our code is available at https://github.com/dymyyc/RE_C-DAGCN; running the released code reproduces the results reported in this paper.
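To make the dependency-aware weighting concrete, the sketch below shows one way such a layer could fuse dependency-type knowledge with CW-GRPK-style relative-position knowledge to re-weight edges of the complete (unpruned) dependency tree before graph convolution. This is an illustrative reconstruction, not the released implementation: the class name, the embedding-based fusion, and parameters such as `num_dep_types` and `max_rel_pos` are assumptions made for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DependencyAwareGCNLayer(nn.Module):
    """One GCN layer over a complete dependency tree in which each edge is
    re-weighted by attention over dependency-type and relative-position
    knowledge (a hypothetical sketch of the idea, not the paper's code)."""

    def __init__(self, hidden_dim, num_dep_types, max_rel_pos):
        super().__init__()
        # heuristic syntactic knowledge: one embedding per dependency type
        self.dep_embed = nn.Embedding(num_dep_types, hidden_dim)
        # CW-GRPK-style knowledge: relative positions in [-max_rel_pos, max_rel_pos],
        # shifted into [0, 2 * max_rel_pos] for embedding lookup
        self.pos_embed = nn.Embedding(2 * max_rel_pos + 1, hidden_dim)
        # scores an edge from both word representations plus the fused knowledge
        self.score = nn.Linear(3 * hidden_dim, 1)
        self.gcn = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, h, adj, dep_type_ids, rel_pos_ids):
        # h:            (B, N, D) word representations from the sequence encoder
        # adj:          (B, N, N) 0/1 mask of the complete dependency tree (unpruned)
        # dep_type_ids: (B, N, N) dependency-type index for each word pair
        # rel_pos_ids:  (B, N, N) shifted relative-position index for each word pair
        B, N, D = h.shape
        hi = h.unsqueeze(2).expand(B, N, N, D)   # representation of word i
        hj = h.unsqueeze(1).expand(B, N, N, D)   # representation of word j
        knowledge = self.dep_embed(dep_type_ids) + self.pos_embed(rel_pos_ids)
        # fine-grained edge scores from both words and the fused knowledge
        e = self.score(torch.cat([hi, hj, knowledge], dim=-1)).squeeze(-1)
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)         # per-word attention over its neighbours
        alpha = torch.nan_to_num(alpha)          # words with no neighbours get zero weight
        return F.relu(self.gcn(torch.einsum("bij,bjd->bid", alpha, h)))
```

Masking with the complete tree's adjacency, rather than a pruned subgraph, mirrors the stated design goal: keep every dependency edge and let the learned attention weights down-weight the noisy ones instead of deleting them outright.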
