Abstract
Data augmentation methods are often used to address data scarcity in natural language processing (NLP). However, token-label misalignment, in which tokens in augmented sentences are matched with incorrect entity labels, prevents these methods from achieving strong performance on token-level tasks such as named entity recognition (NER). In this paper, we propose embedded prompt tuning (EPT) as a novel data augmentation approach for low-resource NER. To address token-label misalignment, we implicitly embed NER labels as prompts into the hidden layers of a pre-trained language model, so that masked entity tokens can be predicted by the fine-tuned EPT model. EPT can therefore generate high-quality and highly diverse data containing varied entities, which improves NER performance. Since cross-domain NER datasets are available, we also explore NER domain adaptation with EPT. Experimental results show that EPT achieves substantial improvements over baseline methods on low-resource NER tasks.
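The core mechanism, conditioning masked-token prediction on label prompts injected into a hidden layer, can be illustrated with a minimal sketch. The toy encoder below is an assumption for illustration only (the module sizes, the injection depth inject_layer, and all names are hypothetical, not the paper's implementation): learned label embeddings are added to the hidden states at one layer, so the model predicts masked entity tokens conditioned on the desired NER labels.

import torch
import torch.nn as nn

class EPTSketch(nn.Module):
    """Illustrative sketch: add label-prompt embeddings to the hidden
    states of one encoder layer, then predict masked entity tokens."""

    def __init__(self, vocab_size=1000, n_labels=5, d_model=64,
                 n_layers=4, inject_layer=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        # Learned soft prompts, one embedding per NER label (hypothetical size).
        self.label_emb = nn.Embedding(n_labels, d_model)
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            for _ in range(n_layers)])
        self.inject_layer = inject_layer
        # Masked-language-model head over the vocabulary.
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids, label_ids):
        # token_ids: (B, T) with mask ids at entity positions to fill in
        # label_ids: (B, T) one NER label id per token (e.g. O, PER, LOC)
        h = self.tok_emb(token_ids)
        for i, layer in enumerate(self.layers):
            if i == self.inject_layer:
                # Embed the label sequence into this hidden layer so that
                # generation is conditioned on the target entity labels.
                h = h + self.label_emb(label_ids)
            h = layer(h)
        return self.lm_head(h)  # (B, T, vocab) logits per position

model = EPTSketch()
tokens = torch.randint(0, 1000, (2, 8))  # toy batch of 8-token sentences
labels = torch.randint(0, 5, (2, 8))     # toy per-token label ids
logits = model(tokens, labels)
print(logits.shape)                      # torch.Size([2, 8, 1000])

In such a setup, sampling from the logits at masked entity positions would yield new entity tokens consistent with the conditioning labels, which is how label-conditioned generation avoids token-label misalignment in the augmented sentences.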