Abstract
Cross-domain contract element extraction (CEE) aims to transfer knowledge from a source domain to facilitate the extraction of legally relevant elements (e.g., contract dates or payments) from contracts in a target domain. To achieve this goal, recent studies encode the domain-invariant relations between elements and legal clause types, and enhance performance through bidirectional supervision between the CEE task and the clause classification task. However, two challenges remain unresolved: (i) data sparsity due to expensive annotation costs and a large number of element types, and (ii) label discrepancies among element types across domains, both of which severely impede effective knowledge transfer from the source to the target domain. Recent developments in prompt learning have shown promising performance in low-resource settings. Drawing inspiration from these advances, we propose a novel framework, graph-enhanced prompt learning (GEPL), for the cross-domain CEE task to address these challenges. GEPL includes two kinds of prompts: (i) instance-oriented prompts, and (ii) label-oriented prompts. Given the input instances, instance-oriented prompts are automatically generated by retrieving relevant examples from the training data, providing auxiliary supervision to enhance the transfer process in low-resource scenarios. To mitigate label discrepancies across different domains, we identify relations among element types using mutual-information criteria and transform these into label-oriented prompt templates. On this basis, a multi-task training strategy is designed to simultaneously optimize the representations of the original input sentence and prompts, enabling GEPL to better understand the tasks and capture label relations in both source and target domains. Empirical results on cross-domain CEE datasets indicate that GEPL significantly outperforms state-of-the-art baselines. Moreover, extensive experiments reveal that GEPL achieves state-of-the-art performance on cross-domain named entity recognition datasets and demonstrates a high level of generalizability. Our code is released at https://github.com/WZH-NLP/GEPL.
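To make the instance-oriented prompts more concrete, the sketch below illustrates one plausible way to retrieve similar training instances and prepend them to an input sentence. It is not the authors' implementation: the similarity measure (bag-of-words cosine), the prompt format, and all names (build_instance_prompt, cosine_sim, the toy training pairs) are illustrative assumptions.

```python
# Illustrative sketch (not the GEPL code): building an instance-oriented prompt
# by retrieving the most similar training examples for a given input sentence.
from collections import Counter
import math


def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two sentences (simplified stand-in
    for whatever retrieval encoder the framework actually uses)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    num = sum(va[t] * vb[t] for t in set(va) & set(vb))
    den = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return num / den if den else 0.0


def build_instance_prompt(query: str, train_examples: list[tuple[str, str]], k: int = 2) -> str:
    """Rank training (sentence, elements) pairs by similarity to the query and
    prepend the top-k as demonstrations, forming an instance-oriented prompt."""
    ranked = sorted(train_examples, key=lambda ex: cosine_sim(query, ex[0]), reverse=True)
    demos = "\n".join(f"Sentence: {s}\nElements: {e}" for s, e in ranked[:k])
    return f"{demos}\nSentence: {query}\nElements:"


# Hypothetical annotated training pairs for illustration only.
train = [
    ("Payment of $5,000 is due on March 1, 2023.", "payment: $5,000; date: March 1, 2023"),
    ("This agreement terminates on December 31, 2024.", "date: December 31, 2024"),
]
print(build_instance_prompt("The licensee shall pay $2,000 by June 30, 2025.", train))
```

In the actual framework, the retrieved demonstrations would serve as auxiliary supervision alongside the original input representation rather than as a plain text prefix; the sketch only conveys the retrieval-then-prompt idea.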